html5 section and article

If you follow this blog you know that the structural significance of html is one of my pet subjects. Some time ago I wrote about the addition of the header and footer elements in html5; now it's time to really get down to business. With my ie6 graph continuing its downward spiral, I figured it was time to rework the html code for this blog from the ground up, leaving out all the usual ie6 restrictions and integrating as much html5 as possible. This revealed some interesting structural challenges.

no more h2-h6

One of the coolest changes in html5 is the way headings are handled. I have complained about the crappy heading handling of (x)html before, and html5 brings forth a new era of headings. From now on we can (and probably should) use only one heading element (h1, which you can still use more than once on a page) and leave the rest of the structuring to the html outline itself. Setting the SEO implications aside for a second, this is without a doubt the best way forward from an html/css perspective.

This way of working makes it a lot easier to syndicate content (no more worries about heading hierarchies when a particular component is dropped into a different context) but puts a lot more strain on the html structure itself. You can't simply rely on nesting depth alone to compute the level of a heading, so you need a different mechanism to construct the hierarchical outline of your document. That's where the new section and article elements come in. There are a few additional elements with sectioning powers, but their scope is smaller and tied more to semantic meaning than to structural power, so for now I'll leave them be.
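As a rough illustration of the h1-only approach, here is a minimal sketch. The page title, headings and nesting are made up for the example, and it assumes a user agent or tool that derives heading levels from the sectioning elements, which not every browser or screen reader can be counted on to do.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>h1-only heading sketch</title>
</head>
<body>
  <!-- top-level heading of the page -->
  <h1>My blog</h1>

  <!-- each sectioning element opens a new level in the outline -->
  <article>
    <h1>html5 section and article</h1>
    <p>Intro text.</p>

    <section>
      <!-- still an h1, but two sectioning levels deep -->
      <h1>article element</h1>
      <p>More text.</p>
    </section>
  </article>
</body>
</html>
```

Because every heading is an h1, the article could be lifted into a different page without renumbering anything; its place in the hierarchy comes from where it is nested.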

article element

The article element represents a component of a page that consists of a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable, e.g. in syndication.

The article element was introduced to wrap content which still makes sense on its own when syndicated. Think of a blog post, a latest news list or an event calendar. These components can be taken out of their original context and can freely exist within a different context without losing any of their meaning. Sounds clear enough, but "syndication" remains a somewhat vague indicator. Some people say that a single comment on a blog post could be wrapped in an article element (because some blogs offer rss feeds with separated blog comments). Whether you agree with this line of thinking is entirely up to you.

I believe (for now) that content wrapped in an article needs to make sense all by itself. A single comment does not, as it is part of a conversation or relates to the article where it was posted. Your mileage may vary though, and it's still too early for best practices (not enough practice yet, I guess), so it's really up to you to decide the best way to make use of the article element.
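To make that reading concrete, here is a small sketch of a blog post with its comments. The class names and content are hypothetical, and where the comments block lives relative to the article is a judgment call, not something the spec dictates.

```html
<!-- the post makes sense on its own, so it gets an article -->
<article>
  <h1>html5 section and article</h1>
  <p>Post body.</p>
</article>

<!-- the comments belong together under a natural heading,
     but a single comment is not treated as self-contained here -->
<section class="comments">
  <h1>Comments</h1>
  <div class="comment">
    <p>First comment.</p>
  </div>
  <div class="comment">
    <p>Reply to the first comment.</p>
  </div>
</section>
```

If you side with the other camp, each div.comment would simply become an article of its own.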

section element

The section element represents a generic section of a document or application. A section, in this context, is a thematic grouping of content, typically with a heading.

According to the spec a section is meant to wrap a generic part of the content. If you ask me, that's a pretty generic description, making it look a lot like a regular div element. But the spec goes on to state that a section will usually contain a natural heading, which limits the scope of the element considerably. It places its use at component level rather than at molecular level.

So section elements are for wrapping a selection of content that belongs together and can be given a natural heading, but does not qualify as an element that can be syndicated. If the content can be syndicated, you're better off using the article element. Note that article elements can be nested inside section elements and section elements can be nested inside article elements. It all depends on whether a particular block of content is viable for syndication, so no real hierarchy exists between these two elements.
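Both nesting directions can be sketched as follows. The headings and content are invented for the example; which blocks count as viable for syndication is an assumption made for illustration.

```html
<!-- section containing articles: the grouping has a natural heading,
     and each post inside it can stand on its own -->
<section>
  <h1>Recent posts</h1>
  <article>
    <h1>First post</h1>
    <p>Summary of the first post.</p>
  </article>
  <article>
    <h1>Second post</h1>
    <p>Summary of the second post.</p>
  </article>
</section>

<!-- article containing sections: the post is the syndicated unit,
     its parts are thematic groupings that only make sense inside it -->
<article>
  <h1>A longer post</h1>
  <section>
    <h1>Background</h1>
    <p>Some context.</p>
  </section>
  <section>
    <h1>Conclusion</h1>
    <p>Closing thoughts.</p>
  </section>
</article>
```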

document outliner

The theory is relatively easy, but when you actually sit down to apply it there's a lot of pondering and weighing to do. Trying to find best practices and workable rules takes time. Luckily there are a couple of tools that might help you on your way, if only a little. With the html5 spec not finished it's impossible to find anything definitive, but currently this html5 outliner seems to be considered one of the best ones around.

These outliner tools allow you to upload your html, after which an outline of your document is returned. This makes it a lot easier to check whether your document sectioning makes any sense and which areas are up for improvement. Think of it as an automatic table of contents generator for your html document. Without it, you're pretty much left to yourself battling the somewhat cryptic and elaborate rules that currently exist for sectioning html documents.
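To give an idea of what such a table of contents looks like, here is a small made-up document together with the outline I would expect for it under the sectioning rules described above. The exact output format differs per tool, so treat the comment at the bottom as an approximation.

```html
<body>
  <h1>My blog</h1>

  <section>
    <h1>Recent posts</h1>
    <article>
      <h1>First post</h1>
      <p>Text.</p>
    </article>
    <article>
      <h1>Second post</h1>
      <p>Text.</p>
    </article>
  </section>

  <section>
    <h1>About</h1>
    <p>Text.</p>
  </section>
</body>

<!--
  Expected outline (approximate):

  1. My blog
     1. Recent posts
        1. First post
        2. Second post
     2. About
-->
```

If an entry ends up at an unexpected level, or a heading goes missing from the list, that usually points at a wrapper that should have been a section or an article (or the other way around).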

conclusion

With the addition of the section and article elements the w3c is once again stressing the importance of structural relevance in html documents. In short, don't remove html elements ("wrappers") simply because they are not needed for styling or because they don't bring any semantic value to the table. Remember the structural value of html and use it to improve the quality of your online documents.

I'm sure it will take a lot of time and debate to come up with some decent best practices, but at least we're given some useful tools to get started. If all goes well I will implement the new code by the end of this year, though that might be a little bit too optimistic. I'll keep you posted.