in defense of semantic value

In case you hadn't noticed yet, this week two separate articles on the importance of semantics appeared on Smashing Magazine: one titled "our pointless pursuit of semantic value", the other "pursuing semantic value". The contents of these articles speak for themselves and I don't plan on joining the discussion directly. I do have two important observations to share though, which can be nicely bundled into my 2-in-1 rant on semantic value.

rant 1: why microformats and html5 microdata (kinda) suck

To understand the current lack of semantics on the web, it's important to know how and why semantics should matter. Currently the goals of semantic value can be summarized as two main selling points: findability and processability.

Findability probably speaks for itself. When I ask google to look for a specific film review, I want google to return actual reviews. I don't want a page featuring the film's title and a greyed-out review link (because no reviews are available). That's the exact opposite of what I asked google to find. So semantics should make it easier for google (and I'm talking all search engines of course) to determine the actual contents of a document/component and provide better search results.

Processability is a little different. It doesn't stop at finding content: it wants to recognize content and offer a gateway to export it in different formats. That's what the most popular microformat (hCard) is doing right now. It makes sure contact details marked up in html documents can be recognized by external software, which can then automatically import the data into a different piece of software or export it in a different format (such as vCard). In this case, that gives you an automatic way to sync data between two different systems (website and agenda).
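To give an idea of what processability asks for, here's a rough Python sketch that reads a few hCard properties from html and writes them out as vCard-style lines. It's heavily simplified and rests on my own assumptions: the `HCardReader` class, the `to_vcard` helper and the sample markup are made up for illustration, only four properties are handled, every property element is assumed to hold plain text, and the output is not a fully conforming vCard.

```python
from html.parser import HTMLParser

# Small subset of hCard property class names mapped to vCard field names.
PROPERTIES = {"fn": "FN", "org": "ORG", "tel": "TEL", "email": "EMAIL"}


class HCardReader(HTMLParser):
    """Simplified reader: assumes each property element holds plain text only."""

    def __init__(self):
        super().__init__()
        self.current = None  # vCard field currently being read, if any
        self.fields = []     # (field, value) pairs in document order

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in PROPERTIES:
                self.current = PROPERTIES[name]
                break

    def handle_data(self, data):
        # Store the first non-empty text found inside a property element.
        if self.current and data.strip():
            self.fields.append((self.current, data.strip()))
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


def to_vcard(html):
    """Export the recognised properties as vCard-style lines (illustrative only)."""
    reader = HCardReader()
    reader.feed(html)
    lines = ["BEGIN:VCARD", "VERSION:3.0"]
    lines += [f"{field}:{value}" for field, value in reader.fields]
    lines.append("END:VCARD")
    return "\n".join(lines)


# Made-up sample markup using hCard class names.
sample = (
    '<div class="vcard">'
    '<span class="fn">Jane Doe</span> '
    '<span class="org">Example Corp</span> '
    '<span class="tel">555-0100</span>'
    '</div>'
)

print(to_vcard(sample))
```

Even this toy version has to know about every sub component it wants to export, which is where the extra effort comes from.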

One of these tasks is infinitely more difficult than the other. For findability, you need one single marker on the base tag of the component (.review); for processability, you have to define all the separate sub components and make sure they can be processed correctly. Looking back at how web design grew up, there has been one big constant: baby steps are the way forward. Start out simple, maximize profits with minimal effort and once you have established popularity, extend and build on that. And that's exactly what initiatives like microformats and microdata failed to see. Implementing findability support is easy, but by trying to tackle the full picture all at once that step was skipped and forgotten.
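For comparison, here's about all a consumer would need for the findability side. This is a minimal Python sketch that assumes a hypothetical convention where a component carries a single class name such as `review`; the `MarkerFinder` class, the `has_component` helper and the sample pages are my own illustration, not part of any spec.

```python
from html.parser import HTMLParser


class MarkerFinder(HTMLParser):
    """Collects the tags that carry a given class name (a hypothetical marker)."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker
        self.hits = []  # tag names of elements carrying the marker

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.marker in classes:
            self.hits.append(tag)


def has_component(html, marker):
    """Return True if any element in the html carries the marker class."""
    finder = MarkerFinder(marker)
    finder.feed(html)
    return bool(finder.hits)


# Made-up pages: one with an actual review, one with only a greyed-out link.
page_with_review = '<div class="review"><h2>Some film</h2><p>Great.</p></div>'
page_without_review = '<div class="film"><h2>Some film</h2><a class="disabled">reviews</a></div>'

print(has_component(page_with_review, "review"))
print(has_component(page_without_review, "review"))
```

One class check on the base tag is enough to tell the two pages apart, no sub components required.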

Everyone who has ever tried to implement a microformat should be aware of its complexity. And not only for us, the front-end guys: it also requires extra effort from the back-end team, which has to develop the correct code in whatever cms they are working with. This extra step is often too much to incorporate into the project, so we take the easy route. At the same time we see that very little support exists out there for people who do implement microformats and microdata, so the pay-off for going that extra mile remains small.

For findability, all we would need is a fixed vocabulary for popular content types (.product, .review) that can be combined (.product.review). Add synonyms (.post = .article) and you have just about all the power you need to tell search engines what content you are serving. There is still no way to process this information, but that kind of support can be postponed until we have enough base support.
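A tiny Python sketch of how such a vocabulary could be read by a search engine or any other consumer. The `VOCABULARY` and `SYNONYMS` tables and the `content_types` function are hypothetical; no such standard list exists today, which is the whole point.

```python
# Hypothetical fixed vocabulary and synonym table (assumptions for illustration).
VOCABULARY = {"product", "review", "article"}
SYNONYMS = {"post": "article"}


def content_types(class_attr):
    """Return the set of recognised content types found in a class attribute."""
    types = set()
    for name in class_attr.split():
        # Map a synonym to its canonical name, leave other names untouched.
        canonical = SYNONYMS.get(name, name)
        if canonical in VOCABULARY:
            types.add(canonical)
    return types


print(sorted(content_types("product review")))  # combined types
print(sorted(content_types("post featured")))   # synonym plus an unrelated class
```

Class names outside the vocabulary are simply ignored, so existing styling classes can live next to the semantic ones.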

The main question of course is who would manage this vocabulary list, but that's a more practical consideration I'd like to leave for others to decide. I'm sure though that the popularity of semantic thinking would rise exponentially if such (very simple) support existed today.

rant 2: the now-generation of web development

"You've wasted 40 minutes, with no tangible benefit to show for it."

"A fair rule of thumb: when it comes to semantics, if it's confusing enough for you to ask a question about it, chances are the answer won't make a realistic difference."

"provide clear evidence that currently semantics do help us, and in the future will help us, solve real problems."

The three quotes above are taken from the posts and comments of the two articles on Smashing Magazine. They come from people with considerable weight in the web development community.

The sad thing is that they all talk about the "now". Immediate gain, direct results and measurable effort. Things are deemed worthless, or not worth pursuing, if they don't yield immediate results. As our industry grows and matures, it's normal that money (and thus efficiency) is becoming more and more important, but I firmly believe that possible shortcuts should never be preached by those who are (in whatever way) elevated to preach to the masses.

Whether it's worth going the extra mile to understand semantics that don't "work" today is up to the developer. Making sure that developers understand that increased semantic value will aid us five years from now is up to the preachers. This whole "now"-movement reminds me a lot of why we are still providing ie6 support today, as those sites were also conceived and constructed only with "now" in mind (and they worked damn well in the past "now" too).

People seem to forget about the benefits of theoretical research. My math teacher once told us the story of "i" (the imaginary unit). This number was conceived in the 16th century without any apparent practical use. Only some 300 years later did it prove to be incredibly useful research that immediately solved a number of problems people were facing back then (e.g. in electrical engineering). It's the perfect example of how a theoretical effort can prove to be invaluable in the future, even when you can't begin to predict the actual benefits.

Couple this with the popular "paving the cowpath" principle of web design and you'll quickly begin to see how important it is to look to the future rather than just think in terms of quick gains and immediate profits. The fact that not everyone realizes this doesn't worry me, but that big names in our industry are actively challenging these ideas is a whole different story.

The web lacks semantics. It's something that's becoming more and more obvious every day, and people telling us to stop pursuing semantic validity unless there's some immediate gain should be countered immediately and effectively, because they aren't helping us forward in our quest to provide a more meaningful web.

conclusion

Semantics matter. If not today, then hopefully tomorrow. And if not tomorrow, you know who to blame.