WaSP Asks the W3C

Serving XHTML with the Right MIME Type

In the previous issue of “WaSP Asks the W3C”, we examined how to correctly specify the character encoding for a document. In this edition, we consult the W3C’s Quality Assurance Group about serving XHTML documents with the correct MIME type.

MIME originated as an extension to email and is reused by HTTP to declare the type of content (the media type) being served. Each resource has a MIME type made of two parts, a main type and a subtype, separated by a slash (“/”). The MIME type tells the user agent, as it receives the document, how to handle it, which lets you associate a particular application or behavior with a particular media type in your browser.
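To make the two-part structure concrete, here is a minimal Python sketch that splits a Content-Type value into its main type and subtype. The helper name split_mime_type and the sample header values are illustrative assumptions, not part of any W3C material; the parsing itself relies on the standard library's email.message module.

```python
from email.message import Message
from typing import Tuple


def split_mime_type(content_type: str) -> Tuple[str, str]:
    """Return (main type, subtype) for a Content-Type header value.

    split_mime_type is a hypothetical helper name used for illustration.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # Parameters such as "; charset=utf-8" are ignored by these accessors.
    return msg.get_content_maintype(), msg.get_content_subtype()


main_type, subtype = split_mime_type("application/xhtml+xml; charset=utf-8")
print(main_type)  # application
print(subtype)    # xhtml+xml
```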

WaSP asks

Which MIME type should XHTML be served with?

The W3C Responds

The short answer is application/xhtml+xml, as described by the XHTML Media-types W3C Note, but the long answer is somewhat more involved and provides a couple of alternatives.

Why not text/html?

The main reason to use a new MIME type for XHTML is that it is an XML language, which means that it is subjected to stricter validation and hence less prone to becoming the tag soup that too many people have called HTML; thus, it is reasonable to indicate the difference to browsers so they will be able to handle the resulting code more efficiently.

Because XHTML is based on XML, it also has important syntax differences from HTML. The most significant is that empty elements such as <br> need not be closed in HTML, whereas they must be closed in XHTML (as in <br/>). These syntax differences are another reason to clearly distinguish HTML from XHTML, and thus to use a different MIME type.
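The difference is easy to observe with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module as a stand-in for the XML parser a browser would use; the function name is_well_formed and the two sample fragments are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Report whether an XML parser accepts the given fragment."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


html_style = "<p>line one<br>line two</p>"    # <br> left open, as HTML allows
xhtml_style = "<p>line one<br/>line two</p>"  # <br/> closed, as XML requires

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

An HTML parser would accept both fragments; an XML parser rejects the first one outright, which is the stricter treatment that the application/xhtml+xml MIME type asks for.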

But some browsers don’t know about application/xhtml+xml.

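Indeed, and that’s one of the biggest current obstacles to the adoption of the new MIME type, especially since Internet Explorer doesn’t recognize it (at least in any version up to 6.x on both Windows and Mac OS). Of course, this is a common problem with the adoption of new technologies, and it usually improves over time. For the time being, though, there are two ways out of this vicious cycle:

  • serve XHTML 1.0 as text/html, following the backward compatibility guidelines;
  • use content negotiation to serve application/xhtml+xml only to the browsers that understand it, and text/html to the others.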

The first technique makes your content understandable by the vast majority of web browsers, but in doing so you lose all the advantages of having a different MIME type: being processed as XML, being distinguishable from tag soup, and getting fast-track rendering in modern browsers.

The second technique caters for existing browsers while keeping the new MIME type for those that are able to understand it. The downside is that it can be quite tricky to implement, depending on how much access you have to your server’s configuration.

Alternatively, you can serve your XHTML (any version) as application/xml, or even as text/xml. If you serve it as text/xml, however, you may run into character set issues, because the rules that apply to text/* MIME types are more complex than those for application/*. Note also that with either of these MIME types, Internet Explorer will display the source code instead of interpreting it as XHTML.
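One example of that extra complexity, sketched here under the assumption that RFC 3023 (the XML media types specification of that period) applies: when text/xml is sent without a charset parameter, the document must be treated as us-ascii regardless of its XML declaration, whereas for application/xml the XML processor works the encoding out from the document itself. The function name transport_charset is hypothetical, and the sketch covers only the XML media types discussed here.

```python
from email.message import Message
from typing import Optional


def transport_charset(content_type: str) -> Optional[str]:
    """Return the charset imposed by the Content-Type header, if any.

    None means the header imposes nothing, so an XML processor falls back
    to the document's own byte order mark or encoding declaration.
    Assumption: RFC 3023 rules; only XML media types are considered.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_param("charset")
    if explicit is not None:
        return str(explicit).lower()
    if msg.get_content_type() == "text/xml":
        # RFC 3023: text/xml with no charset parameter defaults to us-ascii.
        return "us-ascii"
    return None


print(transport_charset("text/xml"))                 # us-ascii
print(transport_charset("application/xml"))          # None
print(transport_charset("text/xml; charset=UTF-8"))  # utf-8
```

In practice this means a UTF-8 document served as plain text/xml can be misread unless the server also sends an explicit charset parameter.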

How do I set up the content negotiation you’re speaking about?

This depends on the Web server you’re using: if you’re not an administrator of the Web server where you want to use this type of content negotiation, point your Web server administrator to this document and ask him or her to set it up as needed for you.

If you’re the administrator of the Web server in question, have a look at the linked techniques for your Web server. Better still, send the techniques you know to the W3C Web Standards Education list (publicly archived) so we can add them to our knowledge base.
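Whatever the server, the decision behind this kind of content negotiation is the same: inspect the Accept header the browser sends and use application/xhtml+xml only when the browser lists it. The Python sketch below illustrates that decision only; the function name choose_mime_type and the sample header values are assumptions, and a real setup would express the same rule in the server's own configuration.

```python
def choose_mime_type(accept_header: str) -> str:
    """Pick the MIME type to serve an XHTML 1.0 document with.

    Returns application/xhtml+xml only if the client lists it explicitly
    with a non-zero quality value; otherwise falls back to text/html.
    choose_mime_type is a hypothetical helper, not a server API.
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not accepted"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


# A browser that advertises XHTML support:
print(choose_mime_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# A browser that only sends a wildcard:
print(choose_mime_type("image/gif, image/jpeg, */*"))
```

The sketch deliberately ignores the */* wildcard, since a browser that sends only a wildcard gives no sign that it understands the new MIME type. The text/html fallback is appropriate only for XHTML 1.0 written to the backward compatibility guidelines, and a response chosen this way should also carry a "Vary: Accept" header so that caches do not hand the wrong variant to another browser.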

To sum it up

Let’s try to summarize what we’ve just discussed:

HTML 2.0, 3.2, 4.0, 4.01
  • Recommended MIME type: text/html
  • Limitations in browsers: none, but this MIME type has very often been abused as an umbrella for tag soup
  • Alternate MIME types: N/A
  • Techniques: N/A

XHTML 1.0
  • Recommended MIME type: application/xhtml+xml
  • Limitations in browsers: not recognized by Internet Explorer 6.x and previous versions
  • Alternate MIME types: text/html, if using the backward compatibility guidelines; application/xml (or text/xml, but with much caution with regard to the charset setting)

XHTML 1.1, XHTML Basic, XHTML profiles
  • Recommended MIME type: application/xhtml+xml
  • Limitations in browsers: not recognized by Internet Explorer 6.x and previous versions
  • Alternate MIME types: application/xml (or text/xml, but with much caution with regard to the charset setting)
  • Techniques: N/A

Discussion

For clarification and discussion on this topic, please address your comments and questions to the W3C Web Standards Education list.

To subscribe to the list, send an email to [email protected] with “Subject: subscribe”. You can read archived posts at http://lists.w3.org/Archives/Public/public-evangelist/.