FEATURES AND COMMENTARY
San Diego, CA — As Paul Festa reported for CNET, efforts to craft a new language for building Web pages may stall because of rifts within the development community, which is debating a proposed standard.
Last month the World Wide Web Consortium (W3C) proposed Extensible Hypertext Markup Language (XHTML), turning it over to the development community for a final review period, but since then the process has lost some momentum. Developers have not been able to agree on a naming convention that orients Web browsers to their tasks.
XHTML would rewrite Hypertext Markup Language (HTML), the Web’s most commonly used language. The rewrite would make it easier for individual industries to design Web pages that meet their specific needs.
At stake is how browsers will make sense of the Web when it is written in a potentially infinite variety of individually tailored languages.
Web authors use HTML to design basic Web pages. HTML tags designate format and page elements such as paragraph breaks, color, tables, and lists. With XHTML, HTML would be rewritten in Extensible Markup Language (XML), a technology for creating Web languages.
XML defines a tagging structure for the creation of industry- or task-specific languages. One example is MathML, which is used to create Web pages with mathematical content and functions.
Some XML developers are complaining that the XHTML proposed recommendation is playing fast and loose with the use of namespaces. XML namespaces, which the W3C recommended in January, let Web browsers distinguish between tags from different XML-based languages. In other words, if a document had both MathML tags and tags from another language, the namespaces would tell the browser which tags belong to which language and how to interpret them.
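To make the idea concrete, here is a minimal sketch in Python of how a namespace-aware parser keeps tags from two languages apart inside one document. The MathML namespace name is the real one; the `page` language and its `http://example.org/ns/page` namespace name are hypothetical, invented only for this illustration, and the helper `tags_by_namespace` is likewise not part of any standard.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"  # MathML's namespace name
PAGE_NS = "http://example.org/ns/page"            # hypothetical second language

# One document mixing tags from both languages.
DOCUMENT = (
    f'<page xmlns="{PAGE_NS}" xmlns:m="{MATHML_NS}">'
    "<title>Area of a circle</title>"
    "<m:math><m:mi>r</m:mi></m:math>"
    "</page>"
)


def tags_by_namespace(xml_text):
    """Group element names by the namespace each one belongs to."""
    groups = {}
    for element in ET.fromstring(xml_text).iter():
        # ElementTree spells a qualified name as "{namespace}local-name".
        uri, _, local = element.tag[1:].partition("}")
        groups.setdefault(uri, []).append(local)
    return groups


groups = tags_by_namespace(DOCUMENT)
for uri, names in groups.items():
    print(uri, names)
```

The parser never needs to guess which language a tag comes from: the namespace name travels with every element, so a program can route `math` and `mi` to a MathML handler and everything else to the page handler.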
The W3C’s XHTML working group has proposed a trio of namespaces for XHTML. But the working group’s critics think there should be only one.
In current implementations of XML, browsers refer to a file called a Document Type Definition, which defines the grammar or syntax of a particular XML-based language. A W3C effort is under way to define XML schemas, which will give XML developers more power and flexibility in specifying how their tags are interpreted. The schema specification, which has not reached official recommendation status, is intended ultimately to replace the Document Type Definition.
The current controversy arises over the practice of using Web addresses, or URLs, as labels for separate namespaces. In other words, a Web document would use a URL to tell the browser what kind of XML it is reading and how to interpret its tags.
Though URLs traditionally point to a particular Web page, in the case of namespaces they only differentiate one namespace from another, much like a telephone number that doesn’t connect to a phone line, but only provides a unique identifying number for a person or residence.
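The "identifier only" reading can be sketched in a few lines of Python. The namespace names below are hypothetical and deliberately use the reserved `.invalid` domain, which can never resolve to a real server; the parser handles them anyway, because it only compares the strings and never fetches anything.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name on a domain that cannot resolve.
NS = "http://namespace.invalid/recipes"

root = ET.fromstring(
    f'<r:recipe xmlns:r="{NS}"><r:step>Mix</r:step></r:recipe>'
)

# The URL is simply folded into the element's name as an opaque string.
qualified_tag = root.tag

# The same local name under a different namespace name is a different tag.
same_name_other_ns = "{http://namespace.invalid/other}recipe"

print(qualified_tag)
print(qualified_tag == same_name_other_ns)
```

Under this view, two namespace names that differ by a single character identify two unrelated vocabularies, regardless of what, if anything, sits at either address.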
But the W3C working group devoted to the new XHTML standard wants the namespace URL to do double duty. It would not only function as a unique identifier, but also steer the browser to a document that defines how the language corresponding to that namespace works and so tells the browser how to read the page.
Namespace purists oppose that move, in part because it will complicate the interpretation of pages written in XHTML.
“What the working group is doing is building a Tower of Babel by dividing up namespaces for a single language,” said Don Park, president of XML software and consulting firm Docuverse. “Everyone knows what the paragraph tag means, but according to the working group proposal a single program will have to identify three paragraph tags.”
Working group members counter that the three namespaces within XHTML are necessary to delineate three varieties of HTML: strict, transitional, and frame-set.
Strict XHTML, the smallest namespace, excludes all presentational markup tags; those tags are to be included separately in style sheets. Transitional XHTML resembles HTML, which includes presentational markup elements. Frame-set XHTML includes presentational markup as well as frames, the separate smaller panes commonly used within Web pages.
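The critics' complaint about "three paragraph tags" can be illustrated with a short Python sketch. The three namespace names here are hypothetical placeholders on `example.org`, not the ones in the working group's proposal, and `count_paragraphs` is an invented helper. The point is only that a program wanting to find paragraphs must recognize one tag per variety.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names standing in for the three proposed varieties.
XHTML_VARIANTS = {
    "strict": "http://example.org/xhtml/strict",
    "transitional": "http://example.org/xhtml/transitional",
    "frameset": "http://example.org/xhtml/frameset",
}

# With three namespaces, "the paragraph tag" becomes three distinct names.
PARAGRAPH_TAGS = {f"{{{uri}}}p" for uri in XHTML_VARIANTS.values()}


def count_paragraphs(xml_text):
    """Count paragraph elements from any of the three varieties."""
    root = ET.fromstring(xml_text)
    return sum(1 for element in root.iter() if element.tag in PARAGRAPH_TAGS)


def sample_page(namespace):
    """Build a tiny two-paragraph page in the given namespace."""
    return (
        f'<html xmlns="{namespace}"><body>'
        "<p>First</p><p>Second</p>"
        "</body></html>"
    )


for variety, uri in XHTML_VARIANTS.items():
    print(variety, count_paragraphs(sample_page(uri)))
```

With a single namespace for all of XHTML, the set of paragraph tags would shrink to one entry, which is the simplification the critics are asking for.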
The debate over XHTML namespaces has dominated the discussion at the “xml-dev” newsgroup for XML developers.
“Namespaces are about unique identification; they are not about meaning…” wrote Sun Microsystems’ Jon Bosak, chair of the W3C’s XML coordination group. “The main argument for specifying three namespaces for XHTML rests on the assumption that there is a one-to-one association between namespaces and schemas. This is not true.”
W3C representatives characterized the contentious debate over XHTML namespaces as a normal part of the W3C review process.
“This is all part of the process,” said Ian Jacobs, editor of the XML specification. “There is not yet a consensus in the community. If there continues to be a huge disagreement, something may change.”
XHTML’s review period ends September 22. The W3C typically takes two weeks to review comments before deciding whether to officially recommend a proposal.