Blogs, search engines and WordPress

One problem with the blog format is that the same content can show up at several URLs. This layout is convenient for humans: in the case of my blog, the same post content can appear on the main page, at a category URL and at an archive URL.

Unfortunately, what is convenient for humans is not so good for search engines. Two aspects of the standard blog format cause problems. The first is the dynamic nature of some blog URLs. Consider the main page of an active blog: most show only about ten posts, and older posts are pushed off as newer ones are created. This often results in a URL no longer containing the content the search engine thinks it does. Personally, I find this incredibly annoying since I often have to search the site with a local search engine after Google has directed me to the main page of a blog. The second problem is that some URLs, like a category URL, contain many posts which are not directly related to a particular search, so after the search engine gets you there you still have to hunt through the page with the browser’s find function.

Both of these problems have been annoying me for some time now, so today I did a little digging. Fortunately, there is a solution: the robots meta tag. This tag specifies, on a page-by-page basis, whether the content of the current page should be indexed by the search engine and whether the links on the page should be followed.

The solution, then, is simple: URLs that contain multiple posts should be marked “noindex,follow” while individual posts are marked “index,follow”. This should result in the content of each post appearing in the search engine’s index only once. I also found a post called A critical SEO Tip for WordPress which describes a way to accomplish this in WordPress. The slightly modified version of this solution that I have added to my WordPress theme’s header.php is below. Unless there are downsides to this approach that I am not aware of, I think every theme author should add something like this to their theme.

<?php
// Individual posts, pages and author pages contain unique content,
// so let search engines index them normally.
if (is_single() || is_page() || is_author()) {
    echo "<meta name=\"robots\" content=\"index,follow\"/>\n";
} else {
    // Multi-post views (main page, category and date archives) duplicate
    // post content: follow the links but keep the page out of the index.
    echo "<meta name=\"robots\" content=\"noindex,follow\"/>\n";
}
?>
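
For theme authors who would rather not edit header.php directly, the same check can be hooked into WordPress’s wp_head action from the theme’s functions.php. Here is a minimal sketch of that variant; the function name ds_robots_meta is my own placeholder:

<?php
// Emit the robots meta tag from the wp_head action so the rule
// survives changes to the theme's header template.
// ds_robots_meta is a hypothetical name; use any unique prefix.
function ds_robots_meta() {
    if (is_single() || is_page() || is_author()) {
        echo "<meta name=\"robots\" content=\"index,follow\"/>\n";
    } else {
        echo "<meta name=\"robots\" content=\"noindex,follow\"/>\n";
    }
}
add_action('wp_head', 'ds_robots_meta');
?>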

Isn’t it semantic?

Isn’t it semantic?: An interview with Sir Tim Berners-Lee. There are some interesting comments on the semantic web in this interview.

In physics, to take the behaviour of gases as an example, you visualize them as billiard balls, model the rules they follow and then transpose that to a larger scale to account for the effects of temperature and pressure – so physicists analyze systems. Web scientists, however, can create the systems.

So we could say we want the Web to reflect a vision of the world where everything is done democratically, where we have an informed electorate and accountable officials. To do that we get computers to talk with each other in such a way as to promote that ideal.

OpenSPARC

Sun sure is doing some really interesting things these days. First they released a chip with eight cores, each capable of running four threads simultaneously. Now they have released the design of that chip under the GPLv2.

Speculations on the future of science

Speculations on the future of science by Kevin Kelly

Science will continue to surprise us with what it discovers and creates; then it will astound us by devising new methods to surprise us. At the core of science’s self-modification is technology. New tools enable new structures of knowledge and new ways of discovery. The achievement of science is to know new things; the evolution of science is to know them in new ways. What evolves is less the body of what we know and more the nature of our knowing.

Linux Journal’s new editor

So my favourite magazine, Linux Journal, has a new editor: Nicholas Petreley.

I have been a Linux Journal subscriber for 8+ years and I proudly have every issue on my bookshelf. I even paid for a subscription for my favourite computer store to help them gain knowledge about Linux and FOSS.

It used to be that the final page of Linux Journal had good information: news from the community, legal advice, etc. Now that Petreley has joined, the last page of my favourite magazine carries uninformed rants that at best belong in a Slashdot comment on a KDE vs. GNOME story.

I can only imagine what people new to the community will think when they pick up their first issue of Linux Journal and see that the writing style typified by Slashdot comments also makes it into the community’s print publication.

I will reserve judgement on the article content for a couple more issues, since the articles published so far were quite likely in the pipeline before Petreley got involved. However, I seriously doubt that Petreley’s biases will stay out of the rest of the magazine.

On the plus side, the new larger, more graphical layout is quite visually appealing. To whatever extent Petreley was involved in the graphic design changes, I compliment him and the rest of the Linux Journal team. Too bad the new layout does not make up for the loss in editorial quality.

Interview with Van Jacobson

TCP/IP pioneer’s past is prologue from EETimes.

EET: And though packets declared victory over circuits, there seems to be renewed interest in giving IP as many circuit-like characteristics as possible.

Jacobson: I hope that the circuit obsession is transitional. Anytime you try to apply scheduling to a problem to give latency strict bounds, the advantages are not worth the cost of implementation. Strict guarantees gain you at best a 100-microsecond gain in networks, where the intrinsic jitter in the thermal conditions of the planet is 300 microseconds.

EET: So all the late-1990s studies of QoS involved people speaking different languages, coming from different perspectives.

Jacobson: QoS has been an area of immense frustration for me. We’re suffering death by 10,000 theses. It seems to be a requirement of thesis committees that a proposal must be sufficiently complicated for a paper to be accepted. Look at Infocom, look at IEEE papers; it seems as though there are 100,000 complex solutions to simple priority-based QoS problems.

The result is vastly increased noise in the signal-to-noise ratio. The working assumption is that QoS must be hard, or there wouldn’t be 50,000 papers on the subject. The telephony journals assume this as a starting point, while the IP folks feel that progress in QoS comes from going out and doing something.