Valid XHTML - How to Lose the Non-SGML Characters
Posted on 05/31/2007 by Colin
Here at Plexus, we pride ourselves on our use of Web standards, which lead to greater usability and searchability, in addition to better preparation for future browsers and other Web technology. We proudly display links to validation services at w3.org for our XHTML 1.0 Strict and CSS, and they show that we're writing valid code.
On our sites that allow users to enter content, however, we sometimes encounter validation errors such as:
This page is not Valid XHTML 1.0 Strict
Error Line 60 column 118: non SGML character number 128.
[the problem here was a left quote pasted in from Microsoft Word]
We've looked at writing scripts that convert each "non SGML character" to one that will pass XHTML validation, but tracking down every individual invalid character turns out to be very difficult.
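A minimal sketch of what such a conversion script might look like, in Python. This is an illustration, not the script described above. It assumes the stray characters are Windows-1252 bytes that were read as ISO-8859-1, which will not be true of every page, and the function name `replace_c1_controls` is made up for this example.

```python
import re

# Code points 128-159 are the range the validator reports as
# "non SGML character" under ISO-8859-1.
_C1_RANGE = re.compile(r"[\x80-\x9f]")


def _to_reference(match):
    """Turn one stray character into a numeric character reference."""
    # Assumption: the character is really a Windows-1252 byte.
    raw_byte = match.group(0).encode("latin-1")
    try:
        real_char = raw_byte.decode("cp1252")
    except UnicodeDecodeError:
        # A few byte values are unassigned in Windows-1252; drop them.
        return ""
    return "&#%d;" % ord(real_char)


def replace_c1_controls(text):
    """Replace every character in the 128-159 range (hypothetical helper)."""
    return _C1_RANGE.sub(_to_reference, text)
```

Even this short version has to guess which encoding the characters came from and decide what to do with values that have no mapping, which hints at why the per-character approach gets tedious.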
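The better solution (assuming you're like us and have been using ISO-8859-1 encoding) is to just change the character encoding to UTF-8. Keep in mind that the tag only declares the encoding and does not convert anything, so the bytes your server sends need to actually be UTF-8. With that in place, drop a META tag in your document template like this: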
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
And you should be home free: your pages can keep passing XHTML 1.0 Strict validation even when clients paste in curly quotes and other special characters from Microsoft Word.
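One plausible explanation for the "character number 128" error, sketched in Python. It assumes the pasted left quote was stored as UTF-8 bytes while the page was declared as ISO-8859-1; the post does not confirm that this is what happened, so treat it as an illustration only.

```python
# The curly left double quote that Word produces.
LEFT_QUOTE = "\u201c"

# Assumption: the application stored the pasted quote as UTF-8 bytes.
utf8_bytes = LEFT_QUOTE.encode("utf-8")

# Declared as ISO-8859-1, every byte is read as a separate character.
as_latin1 = utf8_bytes.decode("iso-8859-1")
latin1_code_points = [ord(ch) for ch in as_latin1]

# Declared as UTF-8, the same bytes are read back as one valid character.
as_utf8 = utf8_bytes.decode("utf-8")

print(latin1_code_points)
print(as_utf8 == LEFT_QUOTE)
```

Under that assumption the middle byte is read as code point 128, the value the validator complained about, and the same bytes decode cleanly once the page is declared as UTF-8.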