XHTML has been around long enough now that most people have heard of it, and most web developers know what to do to ensure their code produces valid XHTML. You will often see little notes in page footers claiming "Valid XHTML" or similar, and often enough, it even is valid XHTML, in some flavour or another.
Unfortunately, however, the overwhelming majority of valid XHTML web pages might as well not be, because the browser (and other agents) has no idea that they are, and treats them as though they are the bog-standard "tag soup" anyway.
That is because the MIME type specified in the HTTP Content-Type header claims that the page is "text/html" (or similar), and it is this header that the browser pays attention to, not your DOCTYPE declaration.
The culprit — as if you couldn't guess — is Microsoft Internet Explorer. I first wrote this article back in 2008, so the situation has likely (hopefully?) improved, but back then IE8 was just emerging and still couldn't quite handle XHTML properly. If you tried serving it XHTML documents with the correct MIME type (application/xhtml+xml), it would simply offer to download the page for you.
But for the rest of us, serving XHTML for what it really is works just fine on "all Mozilla-based browsers, such as Mozilla, Netscape 5 and higher, Galeon and Firefox, as well as Opera, Amaya, Camino, Chimera, Chrome, DocZilla, iCab, Safari, and all browsers on mobile phones that accept WAP2. In fact, any modern browser." —W3C FAQ
So why bother with XHTML at all if it's going to be yet another browser compatibility issue? I'm not going to try to answer that question here. I'll assume that anybody reaching this article has already identified a need for XHTML-based technologies (such as XForms, SVG, RDFa and MathML) or XML-based tool-chains (probably XSLT) with their Joomla! pages.
In many cases, you might want to just go ahead and use XHTML markup and advise users to download a better browser (or apply some of the various workarounds for IE) if they want to view it. But you probably still want your pages to load in browsers that don't handle the XHTML content type. Fortunately, we can do this using the content negotiation mechanism built into HTTP, via the Accept request header (which PHP exposes as HTTP_ACCEPT). Basically, you check this header, which the browser sends to say which content types it supports and how strongly it prefers each, then set the Content-Type response header accordingly in code.
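The negotiation step can be sketched roughly as follows. This is an illustration in Python rather than Joomla!'s PHP, and the function names (parse_accept, quality, choose_content_type) are hypothetical, not part of any library. It assumes that only an explicit mention of application/xhtml+xml should count as support, since a bare `*/*` wildcard does not tell you that a browser can actually render XHTML.

```python
def parse_accept(header):
    """Parse an Accept header value into a list of (media_type, q) pairs."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q-value as "not acceptable"
        entries.append((media_type, q))
    return entries


def quality(entries, media_type):
    """Return the q-value of an exact media type match, or 0.0 if not listed."""
    for name, q in entries:
        if name == media_type:
            return q
    return 0.0


def choose_content_type(accept_header):
    """Pick application/xhtml+xml only when it is listed explicitly and
    rated at least as highly as text/html; otherwise fall back to text/html."""
    entries = parse_accept(accept_header or "")
    xhtml_q = quality(entries, "application/xhtml+xml")
    html_q = quality(entries, "text/html")
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"
```

Wildcards such as `*/*` are deliberately ignored here, so a browser that only advertises them falls back to text/html. That is a conservative design choice for this sketch, not a requirement of HTTP.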
However, Joomla! won't do anything like that for you as standard, for a number of good reasons. One such reason is that if you tell the browser to expect XHTML it will try to streamline the rendering process and load it straight into a "real" XML DOM. This means that if your XHTML is broken, users are going to just end up with an ugly error message from the XML parser. Authors of front-end Joomla! templates have thankfully gotten better over the years, but extension developers haven't.
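Before switching the content type, it may be worth checking that your pages are at least well-formed XML, since that is what triggers the parser error described above. A minimal sketch using Python's standard library follows; the helper name well_formedness_error is made up for illustration, and the check covers well-formedness only, not validity against any XHTML DTD.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if the markup parses as XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


good = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>ok</p></body></html>'
bad = "<html><body><p>unclosed<br></body></html>"  # <br> is never closed

print(well_formedness_error(good))
print(well_formedness_error(bad))
```

A tag-soup parser would quietly recover from the second document, whereas an XML parser stops at the first mismatch.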
You will need to be prepared for some hacking if you want to do XHTML in Joomla! For starters, the Joomla! Administrator (and certain parts of the front-end that use forms) won't work as XHTML. One of the main reasons for this is that Joomla! uses an old DOM Level 0 means of accessing the element named adminForm (document.adminForm) in pretty much all of its JavaScript, which won't work with the XML DOM you get from a genuine XHTML document.
Sadly, there are lots of other places where XHTML is broken in Joomla! In a few cases it doesn't even output well-formed XML, but thankfully most of those are in the administration back-end. It's relatively easy to get the front-end working fully as XHTML if you're not doing much more than publishing content using the core components, as evidenced here. But be aware that there are multiple flavours of XHTML. The strict variety that I use here does not contain entity definitions for things like &nbsp; or &copy; that you might be used to using, and you'll find those entities still used in most "XHTML valid" templates.
So if you're brave enough to take the plunge, you might find it helpful to use a small Joomla! system plugin that I wrote for setting the correct content MIME type. I just wanted an easy way to switch XHTML on or off, for development purposes, and to automatically exclude the admin site.
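One way to deal with named entities in templates is to rewrite them as numeric character references, which need no entity definitions at all. The sketch below is a hypothetical helper (numeric_entities is not from Joomla! or any library) built on the HTML entity table in Python's standard library. It assumes you run it over template source you control; as a plain regex substitution it would also rewrite matches inside scripts or CDATA sections.

```python
import re
from html.entities import name2codepoint

# The five entities that XML itself predefines are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def numeric_entities(markup):
    """Replace known HTML named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML built-ins and unknown names alone
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)


print(numeric_entities("Copyright &copy; 2008&nbsp;Example &amp; Co."))
```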
Once installed and enabled, all it will do is check for browser support via HTTP_ACCEPT, and change the Content-Type header where possible. It doesn't try to do anything clever that would involve regex or parsing the document (e.g. to change the meta tag for the content type), but it's a place to start without needing to reconfigure your webserver to manipulate the headers.
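The decision described above can be sketched in a few lines. This is not the plugin's actual code (which is PHP); it is a Python illustration with a hypothetical function name. It assumes the admin site lives under /administrator, and it reduces the Accept check to a simple "is the type listed" test that ignores q-values.

```python
def response_content_type(path, accept_header, enabled=True):
    """Decide which Content-Type to send for a request.

    path          -- request path, e.g. "/index.php" (assumed layout)
    accept_header -- raw Accept header value sent by the browser, or None
    enabled       -- the on/off switch used during development
    """
    if not enabled:
        return "text/html"
    # Assumption: the back-end is served from /administrator and is always excluded.
    if path == "/administrator" or path.startswith("/administrator/"):
        return "text/html"
    # Strip parameters such as ";q=0.9" and compare the bare media types.
    offered = [part.split(";")[0].strip().lower()
               for part in (accept_header or "").split(",")]
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    return "text/html"
```

Only the header changes. The document body, including any content-type meta tag, is left exactly as it was generated.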
If you find anything of use or interest here, please consider supporting the Kiva project with a loan. If you don't already have a Kiva account, the first $25 loan is usually free.
Will Daniels is an Independent IT Consultant in Birmingham, UK
I specialise in Knowledge and Data Management using Semantic Web technologies as well as conventional RDBMS systems.
I work primarily with Linux platforms and am also a fair Linux System Administrator.
Presently working at CloudTomo.
https://willdaniels.co.uk/foaf#webid