by Michael S. Kaplan, published on 2006/10/28 16:35 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/10/28/890105.aspx
Evan asked in one of the many programming aliases:
Hi:
Anyone knows why there are 3 extra characters added to the XML file saved via XmlDocument?
I viewed the file from a hex editor and found 3 characters (0xEF 0xBB 0xBF) are added to the XML file saved.
I did a simple test to verify that:
XmlDocument doc = new XmlDocument();
doc.Load("test.xml");
doc.Save("test2.xml");
I created test.xml in Notepad and view it with hex editor to make sure the first char is “<” (0x3C). And when I view test2.xml, I found the 3 extra characters. These characters are not viewable and don’t affect Notepad, IE, VS.NET from viewing it at all.
I wonder what are these characters needed for?
Thanks,
Evan
Indeed, these three bytes are the well-known and somewhat controversial UTF-8 incarnation of the Unicode Byte Order Mark: U+FEFF encoded in UTF-8 is exactly 0xEF 0xBB 0xBF. The controversy is, of course, whether it is needed in UTF-8, which has no byte order to mark, and it comes up on a somewhat regular (though thankfully infrequent) basis....
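For anyone who wants to see the bytes for themselves, here is a minimal sketch, not the one true way to do it. The class name BomDemo and its two helper methods are made up for illustration. It assumes the XML is written through an XmlWriter whose XmlWriterSettings.Encoding is a UTF8Encoding, and it uses that encoding's constructor argument to decide whether the preamble is written.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class BomDemo
{
    // The UTF-8 preamble, i.e. U+FEFF encoded as UTF-8 (0xEF 0xBB 0xBF).
    public static byte[] Utf8Bom()
    {
        return new UTF8Encoding(true).GetPreamble();
    }

    // Serializes the document to memory so the leading bytes can be inspected.
    // emitBom == true  -> the encoding supplies a preamble for the writer to emit.
    // emitBom == false -> the encoding has an empty preamble, so none is written.
    public static byte[] SaveToBytes(XmlDocument doc, bool emitBom)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(emitBom);

        using (MemoryStream stream = new MemoryStream())
        {
            // Dispose the writer first so everything is flushed to the stream.
            using (XmlWriter writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            return stream.ToArray();
        }
    }
}
```

To write a file rather than a byte array, the same settings object can be passed to XmlWriter.Create("test2.xml", settings) and the resulting writer handed to doc.Save. Whether leaving the BOM out is a good idea depends on who will be reading the file afterwards, which is more or less the controversy mentioned above.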
This post brought to you by U+FEFF (a.k.a. ZERO WIDTH NO-BREAK SPACE)