selling html pt1/the theory
It's sad, but true. 2010 is rapidly approaching, but as front-end developers we still have a tough time selling the importance of well-written html. Many attempts have been made these last 10 years, but it all amounted to very little. Clean html is the first priority to go whenever problems arise within a project. For those of you still fighting with much conviction and spirit, the next two articles will help you conquer those who oppose you.
What puzzles me most is how differently html is approached when it comes to quality. Compare it to programming languages or human languages and you'll see a frightening gap in quality perception. Of course there are some understandable reasons for this, but after 10 years of html-awareness you would assume that people would start to grasp the need for improvement.
about languages
We all know that html is a descriptive language with its own spelling and grammar rules, much like human languages in fact. We write it to describe elements within a page, giving meaning to components so that automated tools can recognize those components and process their data for whatever purpose.
The html language is a very simple language with few words. To cover for the unknown elements we have the div element, the equivalent of the English word "thing". Further specification is done through classes, giving our "thing" extra semantic meaning. For elements that do have an equivalent in html, we use the assigned tags. This all sounds very logical, but the reality is quite different.
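To make the "assigned tags" point a bit more tangible, here is a minimal sketch of what an automated tool might do with a page. It uses Python's standard `html.parser` module as a stand-in for a real indexer; the `HeadingFinder` class, the `find_headings` helper and both markup samples are made up for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingFinder(HTMLParser):
    """Collects the text of heading elements, as a simple indexer might."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        # Only the assigned heading tags are recognized as headings.
        if tag in HEADING_TAGS:
            self._in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings[-1] += data


def find_headings(markup):
    finder = HeadingFinder()
    finder.feed(markup)
    finder.close()
    return [h.strip() for h in finder.headings]


# Hypothetical markup: both versions could be styled to look the same on screen.
semantic = '<h1>Selling html</h1><div class="intro">Why markup matters.</div>'
generic = '<div class="big bold">Selling html</div><div>Why markup matters.</div>'

print(find_headings(semantic))
print(find_headings(generic))
```

The first sample yields the page title, the second yields nothing, because a "thing" with a presentational class tells the tool nothing about what the element is.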
draconian error handling
The difference? In the case of human languages it's our brain doing the error handling. We are interfacing directly with the language. In the case of html, we interface with the product of the language (the actual web page), not the language generating the page. That's why it's not immediately apparent when the html of a page is full of grammar and spelling mistakes, as the browser effectively hides (almost) all the ugliness from us.
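A rough way to see this hiding at work, sketched with Python's standard `html.parser` module. It is not a browser engine, but it is similarly forgiving about broken markup. The `TextCollector` class, the `visible_text` helper and the two samples are invented for this illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text content, regardless of how well-formed the tags are."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical samples: the sloppy one has misnested and unclosed tags
# and an unquoted attribute; the clean one fixes all of that.
sloppy = "<p><b>Hello <i>world</b></i><p class=intro>Second paragraph"
clean = '<p><b>Hello <i>world</i></b></p><p class="intro">Second paragraph</p>'

print(visible_text(sloppy))
print(visible_text(sloppy) == visible_text(clean))
```

Both samples produce the same text, so whoever only looks at the result has no way of telling which one was written with care.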
Ai coud rait laik dis and with a little bit of effort you would be able to read it perfectly well. Still, people would be quite annoyed if I wrote entire articles like that, no matter how interesting the content. But when talking about html, people don't seem to care, even when it's bordering on complete nonsense. Since there's an automated service trying its best to cope with these errors, it appears to be a free-for-all.
history
To be fair, this way of handling html did help to launch the internet. It eclipsed some of html's finer points, but it helped in getting things online for people to see. In many cases, badly marked up content is better than no content at all. But for professional websites, it is time we stop ignoring the potential of html, as its current state is actively hindering the progress of the internet today. Semantics in combination with automated processing is an area still very much underdeveloped, partly due to bad html structure and grammar.
conclusion
While there are few arguments against well-written html, it's a sad fact that all I've written above will usually get you nowhere. For now, html is still considered a low priority, and it will stay that way as long as CMSs and other automated html-generating tools keep spewing grammatically incorrect code.
When you compare it to human languages you will make people understand, but at the same time the argument is too theoretical to have much weight when a crisis is looming. The next article will delve a little deeper into more practical weapons to battle those who butcher html. Stay tuned.

The problem is that a lot of people can very easily write a webpage with little understanding, and there are more rewards for the content producer in writing poorly formed HTML.
While I will sit down and nut through XHTML and CSS to get the layout working across browsers, my friend will just do a table layout in half the time.
While I will stick to well-formed XHTML, I admit I understand why people would build sloppy code.
Benefits of poorly formed HTML:
1. Takes a lot less effort.
2. You do not need to think.
3. The majority of people won't know you cheated.
4. You can achieve cool effects.
5. Most browsers will not punish you.
6. 9-grid tables are still punched out by a lot of WYSIWYG editors.
7. How are multiple div wrappers any better?
8. Less coding, e.g. a <center> tag vs <h1 class="maintitle"> plus .maintitle {text-align:center;}
So while we in our community understand the importance of coding correctly, it is easy to understand those who don't.
Dale
Sorry, I thought your CMS would remove the tag enclosures and replace them with &lt; and &gt;
Have you seen this:
http://rebuildingtheweb.com/en/why-is-valid-html-important/ ?
The real issue at stake in the recent death of web standards and xhtml is that it is now much harder for individuals and smaller groups to innovate, browsers included.
Look forward to reading the next installment.
Dale: yeah, sorry bout that. I will look into it as quickly as possible :)
As for the things you list, I don't agree that it takes less effort. It takes effort to start learning from scratch, and some designs might be easier to do with tables, but as someone who switched I can testify I find it very hard to do table designs these days. It's not as if you didn't need hacks and other crap to make a table layout work.
As for wrappers, a div wrapper is what it is. A meaningless structural element. It's not wrong or incorrect, just obsolete from a semantic/structural point of view. This is different from writing semantically, grammatically or structurally incorrect html code.
Telga (& Vlad): The idea that it hurts innovation is slightly referenced in this post too, though I like the focus in Vlad's article. It's definitely a good reason, but one that will be extremely hard to sell to clients (and partners). You see, most clients want a website, they don't want to invest in innovating the web. So if the "skinning" of a CMS costs a lot extra, they might consider it a waste of their cash. Which is fair in some way, but still a big problem for us.
I guess the HTML interpreters adhere to Postel's Law, which isn't a bad thing either: http://en.wikipedia.org/wiki/Robustness_principle
I guess most people fall into that trap. I agree with the law by the way. Like I said, it's necessary to get content online constructed by people with little knowledge of front-end principles and design. But for professional sites, it's a whole different story.