over-semantic/a bridge too far

With the coming of html5 a whole new road of semantic possibilities was revealed to us, eager front-end developers. New elements and redefined elements gave us more tools to express ourselves in the web's native language. The fine people of the w3c did a pretty good job, but with their heads so high up in the html clouds some details were bound to be a little off. Following up on one of my older articles (xhtml bad boys), let's face the dangers of the over-semantic web.

going to the doctor

Earlier this week the brave team of html5 doctors asked themselves (and their reader base) how to mark up the author of a comment. They presented their readers with five choices and asked for feedback. A great platform for discussion and a perfect opportunity to get acquainted with the more obscure parts of the html5 spec.

There were three basic options (add two variants to make five in total): one using the cite element, another using the address element and a final option using neither. The comment section of said article is a true treasury of front-end wizardry, allowing a brief yet enlightening glimpse into the minds of fellow artisans. And while I'm not in a position to claim absolute truth, nor in a position to openly criticize individual opinions, I will argue that these comments illustrate how easily rules are bent and how often common sense is discarded in the name of supposed semantics.

the options

I suggest reading the article + comments first before coming back to finish this write-up. I won't be repeating everything said on the html5 doctor's site and I'll be assuming you're up to date on what was being discussed over there. So let's have a quick glance at the available options then.

the cite-element

Ignoring for a moment the discussion of whether a cite-element can be used to mark up a person's name (weird rule, weird discussion?), a comment is simply not a citation. It's original content left by the author on a website. No matter what other meanings or use cases the w3c might have come up with for the cite element, it would go against common sense to use it for anything other than referencing citations.

This discussion reminds me a little of the time when everything suddenly became a list item. Even articles and blog posts were defined as a list of paragraphs. If we start wrapping comment authors in cite-elements, should we be wrapping our own author credits in cite-elements too? A road I don't want to investigate.
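To make the objection concrete, here is a rough sketch of what a cite-based comment could look like. This is my own illustration, not the markup from the html5 doctor article: the article wrapper, the class name, the commenter and the url are all made up.

```html
<!-- Illustrative sketch only: wrapper, class name, person and url are assumptions. -->
<article class="comment">
  <!-- The commenter's name is wrapped in cite, as if the comment were a quoted source. -->
  <p><cite><a href="http://example.com/">Jane Doe</a></cite> wrote:</p>
  <!-- Yet this is original content written on this very page, not a citation. -->
  <p>Nice write-up, though I disagree with the part about the footer.</p>
</article>
```

Read that way, the cite tells a parser that "Jane Doe" is a reference to something quoted, while nothing is actually being quoted.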

the address-element

I've complained about the address tag before, but it seems the element was slightly redefined for html5. Those of you praying we could finally use it for actual addresses, don't get your hopes up. The element can now be used within a section to indicate the contact info of the author of that particular section. This sounds like a good use case for our comment author (whose name usually links to their website), but there's an interesting catch.

If you're using the address-element you have to make sure its content really is contact information. It would be valid to put the email address of the commenter in the link, but we all know publishing email addresses goes against best practice. Instead we're left with the url of a website. There is no guarantee the commenter can be contacted through this website; there isn't even a guarantee the commenter has any connection with it. So in the end, using the address-element here is not a good option either, as it could and often would be based on incorrect assumptions.
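A minimal sketch of the two address variants discussed above, assuming each comment sits in its own article element (as far as I understand the html5 draft, address applies to its nearest article or body ancestor). The class names, the commenter, the url and the email address are hypothetical.

```html
<!-- Illustrative sketch only: all names, urls and addresses are made up. -->

<!-- Variant 1: real contact info, but it exposes the commenter's email address. -->
<article class="comment">
  <address><a href="mailto:jane@example.com">Jane Doe</a></address>
  <p>Nice write-up, though I disagree with the part about the footer.</p>
</article>

<!-- Variant 2: what comment forms usually give us, a url typed in by the commenter. -->
<article class="comment">
  <address><a href="http://example.com/">Jane Doe</a></address>
  <p>Nice write-up, though I disagree with the part about the footer.</p>
</article>
```

The second variant is the one we'd end up with in practice, and it only counts as contact information if we assume the url really leads to the commenter.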

author information in the footer?

Somewhat surprisingly one of the suggestions featured the author information inside a footer-element. While the w3c guidelines seem to indicate this is a good (the best?) option, common sense will tell you once again this is utter bollocks. Author information should be available before the main content is given so readers can use this as additional context when interpreting the text that follows.

A medium-positive comment can be considered extremely positive if the commenter is known to be inhumanly critical, or it can be considered quite negative when coming from a raving type. Providing the author information in a footer element is definitely not the way to go. Of course you could place the footer element structurally first in your section, but that would be equal to turning the world upside down. Another road best left abandoned.
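For reference, a sketch of both footer placements mentioned above. Again this is my own illustration under assumed names (article wrapper, class name, commenter, url), not the exact option from the html5 doctor article.

```html
<!-- Illustrative sketch only: wrapper, class name, person and url are assumptions. -->

<!-- Footer last: the reader only learns who is talking after the comment itself. -->
<article class="comment">
  <p>Nice write-up, though I disagree with the part about the footer.</p>
  <footer>Posted by <a href="http://example.com/">Jane Doe</a></footer>
</article>

<!-- Footer first: the context comes first, but the "footer" now sits on top. -->
<article class="comment">
  <footer>Posted by <a href="http://example.com/">Jane Doe</a></footer>
  <p>Nice write-up, though I disagree with the part about the footer.</p>
</article>
```

Neither placement reads naturally: the first hides the context, the second turns the element's name upside down.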

conclusion

In the comment section there is one opinion stating that the option without extra semantic markup is semantically light. This is true of course, but lacking better options it still beats adding incorrect or questionable semantics. Adding those is by no means an improvement over the simple solution and should even be considered harmful to the quality of our work.
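A sketch of what such a semantically light version might look like, assuming a plain paragraph with a class as the only hook. The class names, the commenter and the url are hypothetical and not taken from the html5 doctor article.

```html
<!-- Illustrative sketch only: class names, person and url are assumptions. -->
<article class="comment">
  <!-- The author comes first and carries no element-level claim beyond being a paragraph. -->
  <p class="comment-author"><a href="http://example.com/">Jane Doe</a></p>
  <p>Nice write-up, though I disagree with the part about the footer.</p>
</article>
```

It says less than the cite, address or footer versions, but nothing it says is wrong.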

Having new elements to toy around with doesn't mean we have to bend ourselves into all kinds of positions just to make sure we use them. When the situation calls for it and you can properly use the address or cite element, go right ahead (though I still object to the current use of address), but if there's no fitting element, just leave it at that. Maybe html6 will fix it for you later on. If not, in five years' time we'll be dealing with cow paths that make little or no sense at all.

Disclaimer-wise: this is not meant as a personal critique of any of the commenters, nor of the people who wrote the html5 spec. I fully realize how easy it is to get caught up in the semantic spiral. But a word of warning is definitely in order here.