rel=prev and rel=next/how to sabotage a standard

Some specs are easy. You read them, you understand them and you implement them in less than 5 minutes. And then someone comes along and fucks it all up. I've been reading up on the rel attribute, in particular the use of rel=prev and rel=next. No doubt one of the easiest parts of any spec out there, until I heard what Google planned on doing with that information. Those of you who've already implemented these attributes had better think twice before keeping them on.

web and net

As a front-end developer I have spent most of my time defining the structure and semantics of content on a single page. The past few years I've tried to create consistency in my HTML components between different pages and even websites, but even then I was still focused on describing content that resided within single documents. There is more to the web than just displaying information though.

The main strength of the internet lies in linking documents together, to create a real web of information. So far we haven't had many means to describe how pages are related to each other; the rel attribute was conceived as a first step to change that. The rel attribute accepts a space-separated list of keywords that gives extra information on a specific link, explaining the relation between the two documents. While some of its functions are questionable at best (nofollow for example, which clearly doesn't describe a relationship but instead describes an action), it opens up a whole world of interesting possibilities.

prev/next

A sequence of documents is one where each document can have a previous sibling and a next sibling. A document with no previous sibling is the start of its sequence, a document with no next sibling is the end of its sequence.

whatwg

I probably don't even have to explain what rel="next" and rel="prev" are really for, as "prev" and "next" are common keywords in whatever pagination variant you can think of. rel="prev" indicates a link to a document that belongs to the same sequence and precedes the current document, rel="next" indicates a link to a document that follows the current document. It's as simple as that and that's all there is to it really.

If you follow the WHATWG spec, the prev/next values can be placed on all types of pagination, ranging from multipage articles and paginated result lists (on both the prev/next keywords and the appropriate numerical links) to the next/prev links you'll find at the bottom of this article, used for jumping directly between blog posts.
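To make that concrete, here is a minimal sketch of how the markup for one page in a sequence could be generated. The helper name sequence_links and the example URLs are made up for illustration; the only thing taken from the spec is that rel="prev" points at the preceding document and rel="next" at the following one.

```python
from html import escape


def sequence_links(urls, index):
    """Return the <link> tags for the page at position `index` in `urls`.

    `urls` is an ordered list of page URLs forming one sequence
    (hypothetical data, not taken from any real site).
    """
    tags = []
    if index > 0:
        # The first page of a sequence has no previous sibling.
        href = escape(urls[index - 1], quote=True)
        tags.append('<link rel="prev" href="%s">' % href)
    if index < len(urls) - 1:
        # The last page of a sequence has no next sibling.
        href = escape(urls[index + 1], quote=True)
        tags.append('<link rel="next" href="%s">' % href)
    return tags


# Hypothetical sequence of three blog posts.
posts = ["/blog/first-post", "/blog/second-post", "/blog/third-post"]

for tag in sequence_links(posts, 1):
    print(tag)
```

The same rel values can go on the visible pagination anchors instead of (or next to) link elements in the head; the sketch only covers the link variant to keep things short.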

Implementing this is also as easy as can be, but before tampering with my blog I looked around one last time, a little wary of the simplicity of this particular spec.

enter google

Send users to the most relevant page/URL - typically the first page of the series.

For years now we've been adding semantics and structure to our documents so automated systems would know what to do with our information. So far not many systems out there use this semantic data, so there has been little feedback on how this would actually work in the real world (I know there are various attempts by Google to match microformats and such, but I consider those implementations to be rather marginal compared to what you would normally describe as common use on the web). We've been so occupied with finding the best way to do this that we somehow forgot that those automated systems might willingly misinterpret this semantic data, or at least interpret it differently than we originally intended.

The quote above is how Google hopes to interpret the prev/next values for its search engine, hinting that it will try to redirect people to the first page of the sequence if it thinks this is appropriate. Looking back at the pagination examples I've given though, this is definitely not what I would consider preferred behavior. Not as a site owner, but also not as a user of the Google search engine. Worst case this would mean that if Google found a match in one of my latest articles, it would throw the person back to the first article I've ever written. Or if it would find a hit in a result list, it would send you back to page 1 of the results. How this is useful is beyond me.
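To show what that worst case looks like, here is a rough sketch. Google hasn't published how it picks the page it sends people to, so this is purely my own assumption of the most naive reading: keep following rel="prev" until you hit a page that has none. The function name start_of_sequence and the URLs are hypothetical.

```python
def start_of_sequence(page, prev_of):
    """Follow rel="prev" links from `page` until a page without one is reached.

    `prev_of` maps a page URL to the URL its rel="prev" link points at.
    This is an illustration of the worst case, not Google's actual logic.
    """
    seen = {page}
    while page in prev_of:
        page = prev_of[page]
        if page in seen:
            # Guard against a badly linked sequence that loops back on itself.
            break
        seen.add(page)
    return page


# Hypothetical blog where every post links back to the one before it.
prev_of = {
    "/blog/third-post": "/blog/second-post",
    "/blog/second-post": "/blog/first-post",
}

# A search hit in the newest post ends up at the oldest one.
print(start_of_sequence("/blog/third-post", prev_of))
```

With a blog of a few hundred posts chained together like this, the landing page has nothing to do with what the person was searching for.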

What Google tries to do is catch those instances where one single article is spread across multiple pages, but even then it's a questionable assumption that people would prefer to start at the beginning of the article, rather than get to the bit that matched their search and go from there. It's nice that Google tries to be helpful, but they should take care not to hurt, hamper or hollow out the initial goals of a spec.

conclusion

It's a little scary to think that one company (~85% of the world population searches the web with Google) can make such a casual assumption and render a simple, clear-cut spec like this virtually unusable. Maybe I'm jumping to conclusions here, as Google didn't actually reveal its algorithm for deciding on the automated content jump, but as it stands now I'm not going to implement the prev/next values, as I believe this will actually hurt the search hits people receive for my site.

I'm sure the option to jump directly to the first page would be handy (if it isn't already available on the page itself), but when big companies start deciding what content to serve me (hello there Facebook) rather than just offering what is out there, things get a little iffy. I do hope Google retracts its decision to act on the rel="prev/next" spec the way it described above, because it's a nice spec with a lot of potential.