Complex class designs begat unintuitive interactions begat unexpected results begat a blog

by Michael S. Kaplan, published on 2008/12/19 03:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/12/19/9240672.aspx


Frank's question is one any reasonable person could have:

So, why is XmlWriter producing a document with encoding = UTF16 when I specify UTF8 ?

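using System;
using System.Text;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        XmlWriterSettings xws = new XmlWriterSettings();
        xws.Encoding = Encoding.UTF8;              // the encoding being asked for
        StringBuilder sb = new StringBuilder();    // output target is an in-memory string
        XmlWriter xw = XmlWriter.Create(sb, xws);
        xw.WriteStartDocument();
        xw.WriteElementString("Element", "Value");
        xw.WriteEndDocument();
        xw.Flush();
        xw.Close();
        Console.WriteLine(sb.ToString());          // show the resulting XML
    }
}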


--
Frank

That code produces XML with the following XML declaration:

<?xml version="1.0" encoding="utf-16"?><Element>Value</Element>

I can see what he was talking about. Weird!

Someone pointed out that looking at the XmlWriterSettings.Encoding property documentation would explain what was going on:

Remarks
The XmlWriter encodes a buffer of characters all at once, rather than character by character. An exception is thrown when the Flush method is called if any encoding errors are encountered.

The Encoding property only applies to the XmlWriter instances that are created either with the specified Stream or with the specified file name. If the XmlWriter instance is created with the specified TextWriter, the Encoding property is overridden by the encoding of the underlying TextWriter. For example, if this property is set to Unicode (UTF-16) for a particular XmlWriter, but the underlying writer is a StreamWriter (which derives from TextWriter) with its encoding set to UTF8, the output will be UTF8 encoded.

If the XmlWriter instance is created with other output parameters, the Encoding property is ignored.
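
For anyone who hits the same wall, here is a minimal sketch of two ways to end up with a utf-8 declaration, based on the remarks above. The first writes to a Stream, where the Encoding setting is honored. The second keeps the string-based approach but wraps the StringBuilder in a TextWriter that reports UTF-8. The names Utf8StringWriter, Demo, ViaStream and ViaTextWriter are made up for this illustration and are not part of the framework.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 rather than the
// default UTF-16, so an XmlWriter built on it declares encoding="utf-8".
sealed class Utf8StringWriter : StringWriter
{
    public Utf8StringWriter(StringBuilder sb) : base(sb) { }
    public override Encoding Encoding { get { return Encoding.UTF8; } }
}

static class Demo
{
    // Option 1: write to a Stream, where XmlWriterSettings.Encoding applies.
    public static string ViaStream()
    {
        XmlWriterSettings xws = new XmlWriterSettings();
        xws.Encoding = new UTF8Encoding(false); // false = no byte order mark
        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter xw = XmlWriter.Create(ms, xws))
            {
                xw.WriteStartDocument();
                xw.WriteElementString("Element", "Value");
                xw.WriteEndDocument();
            } // disposing the writer flushes it into the stream
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    // Option 2: write to a TextWriter whose Encoding property says UTF-8.
    public static string ViaTextWriter()
    {
        StringBuilder sb = new StringBuilder();
        using (XmlWriter xw = XmlWriter.Create(new Utf8StringWriter(sb)))
        {
            xw.WriteStartDocument();
            xw.WriteElementString("Element", "Value");
            xw.WriteEndDocument();
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(ViaStream());
        Console.WriteLine(ViaTextWriter());
    }
}
```

Both calls should produce a declaration that reads encoding="utf-8". One caveat with the second option: the string in memory is still UTF-16, as every .NET string is, so the declaration is only truthful if the text is later saved or sent as UTF-8.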

Personally, I think that is all designed slightly more confusingly than it had to be. But that is just me. :-)

 

This blog brought to you by 𝌏 (U+1d30f, aka TETRAGRAM FOR DEFECTIVENESS OR DISTORTION)

