2002-11-15 22:19 UTC When specifications collide
What's the charset of the following entity, assuming it is labelled as text/xml with no charset parameter?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>What Am I?</title>
<meta http-equiv="Content-Type" content="text/xml;charset=iso-8859-1"/>
</head>
<body>
<p>What charset is this document?</p>
</body>
</html>
Take a guess.
OK, I'll tell you.
If you are an XHTML UA, then the <meta>
element overrides the Content-Type, making the document ISO-8859-1.
However, if you are just a normal XML UA, receiving this over HTTP, then
RFC 3023 says (section 3.1, paragraph 3) that the
charset must be treated as US-ASCII, because the text/xml label carries no
charset parameter. But if you are an XML UA reading
this from your local file system, then the XML
spec says that it must be treated as UTF-8, since the document has neither an
encoding declaration nor a byte order mark.
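To make the three answers concrete, here is a minimal Python sketch of the decision as described above. It is not taken from any real UA: the function names, the "xhtml"/"xml" and "http"/"file" labels, and the regular expressions are all assumptions made up for illustration, and it deliberately ignores byte order marks and UTF-16 autodetection.

```python
import re
from typing import Optional

# The entity from this post, as bytes (note: no XML declaration, no BOM).
SAMPLE = (
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"\n'
    b'"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">\n'
    b'<head>\n'
    b'<title>What Am I?</title>\n'
    b'<meta http-equiv="Content-Type" content="text/xml;charset=iso-8859-1"/>\n'
    b'</head>\n'
    b'<body>\n'
    b'<p>What charset is this document?</p>\n'
    b'</body>\n'
    b'</html>\n'
)


def meta_charset(doc: bytes) -> Optional[str]:
    """Hypothetical helper: charset named in a <meta ... content="..."> element."""
    m = re.search(
        rb'<meta\s[^>]*content="[^"]*charset=([A-Za-z0-9._-]+)', doc, re.I
    )
    return m.group(1).decode("ascii").lower() if m else None


def xml_decl_encoding(doc: bytes) -> Optional[str]:
    """Hypothetical helper: encoding named in an XML declaration at the very start."""
    m = re.match(
        rb'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', doc
    )
    return m.group(1).decode("ascii").lower() if m else None


def charset_for(ua: str, transport: str, doc: bytes,
                header_charset: Optional[str] = None) -> str:
    """Pick a charset for a text/xml entity, following the three cases in the post.

    ua is "xhtml" or "xml"; transport is "http" or "file" (labels are assumptions).
    header_charset is the charset parameter of the Content-Type header, if any.
    """
    if ua == "xhtml":
        # Case 1: an XHTML UA lets the <meta> element decide.
        from_meta = meta_charset(doc)
        if from_meta:
            return from_meta
    if transport == "http":
        # Case 2: RFC 3023 text/xml with no charset parameter means US-ASCII.
        return header_charset.lower() if header_charset else "us-ascii"
    # Case 3: no external label, so the XML rules apply; with no encoding
    # declaration (and, in this simplified sketch, no BOM check) that is UTF-8.
    return xml_decl_encoding(doc) or "utf-8"


if __name__ == "__main__":
    for ua, transport in (("xhtml", "http"), ("xml", "http"), ("xml", "file")):
        print(ua, transport, charset_for(ua, transport, SAMPLE))
```

Same bytes, three callers, three different charsets.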