
On Sat, Sep 4, 2010 at 12:19 PM, David Menendez wrote:
HTML and XHTML are not encodings of anything. They are markup languages defined using SGML and the XML subset of SGML. There are multiple HTML definitions of varying popularity, and the fact that we can pass some XHTML documents to a web browser expecting HTML and get acceptable results is consistent with the fact that we can pass HTML 3.0 (never implemented by any popular browser) with minimal loss.
But what is the point? The W3C originally suggested serving XHTML as text/html back in 2000 (http://www.w3.org/2000/01/xhtml-pressrelease.html.en), because they believed it would be an easy way for people to transition to XHTML while the browsers caught up. Well, a decade later, IE still doesn't support XHTML, so perhaps their recommendation should be viewed in light of the fact that the web does not appear to be moving to XHTML at any great speed.

Creating output that renders the same as both text/html and application/xml is not a trivial task. For starters, the contents of the <script> tag are treated as CDATA in HTML and as PCDATA in XHTML, so a bare "<" or "&" inside a script is fine in one and an error in the other. And it only gets worse from there...

So the choices are:

1. Only focus on getting XHTML 1.0 served as application/xml working correctly, and IE users get nothing.

2. Create XHTML 1.0 that would work correctly if served as application/xml, but serve it as text/html, and ignore the fact that some stuff might not render correctly when treated as text/html.

3. Create XHTML documents which render correctly whether served as application/xml or text/html, but then only serve them as text/html anyway.

4. Forget about how the XHTML documents render as application/xml, and only focus on how they render as text/html.

Now, I think that options 1 and 2 are not even worth considering.

Option 4 seems silly. If you are going to create XHTML that does not really work correctly when actually served as XHTML, then why create XHTML in the first place? This is really just HTML masquerading as XHTML. Why not create valid HTML and serve it as text/html, instead of creating purposely broken HTML?

Option 3 requires extra work (and also means that you will never be able to upgrade to XHTML 1.1 or XHTML 2.0). The idea of serving XHTML 1.0 as text/html was supposed to be a transitional measure. But if you intend to do it forever, that is not very transitional.
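The <script> mismatch mentioned above can be seen with a small experiment. This is only a minimal sketch using Python's standard library parsers as stand-ins for a browser's HTML and XML code paths; the sample script body and the helper names (`parse_as_html`, `parse_as_xml`) are made up for illustration and have nothing to do with Haddock itself.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: an inline script containing a bare "<".
SCRIPT_BODY = "if (a < b) { run(); }"
DOC = "<script>" + SCRIPT_BODY + "</script>"
DOC_CDATA = "<script><![CDATA[" + SCRIPT_BODY + "]]></script>"


class ScriptText(HTMLParser):
    """Collects the character data reported by the HTML parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Data may arrive in several pieces, so accumulate it.
        self.chunks.append(data)


def parse_as_html(doc):
    """Parse with HTML rules: the script body is passed through as raw text."""
    parser = ScriptText()
    parser.feed(doc)
    parser.close()
    return "".join(parser.chunks)


def parse_as_xml(doc):
    """Parse with XML rules: return the element text, or None if not well-formed."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


html_result = parse_as_html(DOC)             # script body comes back unchanged
xml_result = parse_as_xml(DOC)               # bare "<" is not well-formed XML
xml_cdata_result = parse_as_xml(DOC_CDATA)   # CDATA section makes it parse

print(repr(html_result))
print(repr(xml_result))
print(repr(xml_cdata_result))
```

The same bytes are acceptable under HTML rules and rejected under XML rules. Wrapping the body in a CDATA section satisfies the XML parser, but an HTML parser would then hand the "<![CDATA[" marker to the script engine as literal script text, which is the kind of thing that makes option 3 extra work.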
What benefit do you get by creating XHTML 1.0 that also happens to render correctly as HTML? What is the use case?

It seems that the vast majority of the usage is going to be viewing the content in web browsers. For that purpose, text/html seems superior, because it is supported in a much wider variety of browsers. No browser will even try to render the content as XHTML, since it is being served as text/html.

It seems that the only advantage of XHTML served as text/html is that the output is easier to process. But is anyone actually going to do that? Haddock has been around for a long time. Has anyone had a need to do that so far? And even then, is processing it via an XML parser really better than using tagsoup?

Mark suggested that it was easier to achieve multi-browser compatibility using XHTML instead of HTML, but I am quite certain he is mistaken. There are really three different rendering modes found in browsers:

1. standards mode
2. quirks mode
3. XHTML mode

By serving XHTML content as text/html, he is getting browsers to use quirks mode instead of standards mode. That *can* sometimes lead to better browser compatibility. He is never invoking the XHTML rendering mode. If the aim is simply to trigger quirks mode, there is no need to use XHTML to achieve that.

- jeremy