If you have built large sites, you have faced a familiar maintenance problem: content authors, either unwittingly or in a fit of creative genius, adding horrific boogers of markup to the site. After a year of such maintenance, the site is no longer the perfect lean and mean standards-driven machine, the all-or-nothing DHTML/CSS perfection capable of provoking the jealousy of the greatest Web minds. No, it’s more like a Frankenstein on a high-carb diet, the Joseph Merrick of the Web, covered in the disgusting muck of Office-specific tags, with the repulsive smell of FONT and O tags emanating from it.
Sure, there’s Tidy. And there are ways to delay the impending “markup junk-up” crisis, but philosophically speaking, the problem stems from the fact that HTML casually mixes the content and the context of a page into one nice tag soup, and in doing so discourages content developers from thinking of Web content and Web context as two separate things.
What if we address the problem head on? What if we “disable” the context features of HTML for content authors in some organized fashion? What do you think about Primitive HTML, a subset of HTML designed to prevent authors from introducing unwanted markup? Call it PHTML, if you will.
Here’s what I am thinking. Disallow tags like FONT, CENTER, SCRIPT, OBJECT, IFRAME, MAP, TITLE, ISINDEX, BASE, all of the HEAD tags, including the HEAD tag itself, and of course, the BODY, FRAMESET, and HTML tags. PHTML is only used to create fragments of Web content, not complete documents, so there is no need for style or meta tag declarations.
Disable STYLE, ALIGN, ID, and all other attributes that may affect style. I would leave the CLASS attribute, so that authors can pick from pre-defined styles in the page stylesheet.
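To make the idea a bit more concrete, here is a rough sketch of what a PHTML checker might look like, written in Python with the standard library's `html.parser`. It is only an illustration under stated assumptions: the names `PHTMLChecker` and `check_phtml` are made up for this example, the tag list is the one given above plus `meta`, `link`, and `style` as my reading of "all of the HEAD tags", and the attribute list covers only the three attributes named explicitly, since "all other attributes that may affect style" is open-ended.

```python
from html.parser import HTMLParser

# Tags proposed for exclusion above. "meta", "link", and "style" are an
# assumed reading of "all of the HEAD tags"; adjust to taste.
DISALLOWED_TAGS = {
    "font", "center", "script", "object", "iframe", "map", "title",
    "isindex", "base", "head", "meta", "link", "style", "body",
    "frameset", "html",
}

# Only the attributes named explicitly; a real list would be longer (assumption).
DISALLOWED_ATTRS = {"style", "align", "id"}


class PHTMLChecker(HTMLParser):
    """Hypothetical checker that records violations of the PHTML rules."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        line, _offset = self.getpos()
        if tag in DISALLOWED_TAGS:
            self.violations.append((line, f"disallowed tag <{tag}>"))
        for name, _value in attrs:
            if name in DISALLOWED_ATTRS:
                self.violations.append(
                    (line, f"disallowed attribute {name!r} on <{tag}>")
                )


def check_phtml(fragment):
    """Return a list of (line, message) violations found in a fragment."""
    checker = PHTMLChecker()
    checker.feed(fragment)
    checker.close()
    return checker.violations


if __name__ == "__main__":
    sample = '<p class="note" align="center">Hello <font color="red">world</font></p>'
    for line, message in check_phtml(sample):
        print(f"line {line}: {message}")
```

Note that this sketch works from a list of forbidden tags, which is the weaker of the two possible designs. A stricter tool would start from a short list of permitted content tags and reject everything else, which would also catch the Office-specific tags without having to enumerate them.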
The only tags allowed in PHTML are those that are used to create content and provide semantic distinction in it, not to style it. In fact, the author of PHTML wouldn’t (and shouldn’t) worry about or even know how the content would end up appearing on the site. That’s why it’s called content. The designers will make it “pretty”.
It is a very idealistic thing to say, but if PHTML were standardized and, hopefully, tools were built to support it (authoring, conversion, etc.), the World Wide Web of markup might just be a slightly better place to work and live in.