by Michael S. Kaplan, published on 2006/03/13 03:21 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/03/13/550191.aspx
George asked me the other day via email:
When I make the following call in Visual C++ (Visual Studio 2003), it succeeds:
char * plocale = setlocale( LC_ALL, ".65001" );
But when I try this in 8.0 (Visual Studio 2005), plocale is NULL.
Who broke the UTF-8 locale support? And when will it be fixed?
A very similar question was actually asked on one of the internal Microsoft aliases late last year. It was answered by none other than VC++ guru Martyn Lovell:
VC7 didn’t support UTF8 (or 7) correctly. But it didn’t error, it just silently did the wrong thing.
We tried to find time to do the work in VC8 to support these codepages, but we didn’t have time.
So now, at least, we explicitly error so that you know that we don’t work right in these codepages
Martyn
So George -- saying that it used to work might be an overstatement.
But look on the bright side -- if it stops failing in the next version, it's a good indication that things are now expected to work properly.... :-)
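Either way, the practical lesson is to check what setlocale() hands back rather than assume the locale string was accepted. A minimal sketch in C; try_locale is a hypothetical helper name used only for illustration and is not part of any CRT:

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not a CRT function): returns 1 if the C runtime
   accepts the locale string, 0 if setlocale() returns NULL.
   Note that on success the global locale really is changed. */
int try_locale(const char *name)
{
    return setlocale(LC_ALL, name) != NULL;
}
```

A caller might write something like if (!try_locale(".65001")) { /* fall back to another approach */ }, which at least makes the VC8 failure visible instead of letting later string functions quietly do the wrong thing.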
Ben Bryant on 13 Mar 2006 7:20 AM:
Ariel on 17 Jun 2009 4:01 AM:
Just saw this post.
I just found a piece of code that receives a string in a given encoding and breaks it down into pieces. The nice thing is that it makes sure that the string doesn't break in the middle of a character.
The whole logic relies on _mbstrlen_l().
But now I find out that _mbstrlen_l() doesn't support UTF-8, because _create_locale(LC_CTYPE, ".65001") returns NULL.
Frustrating!
Michael S. Kaplan on 18 Jun 2009 2:29 PM:
This is misleading -- even when creating UTF-8 locales "worked", _mbstrlen_l() didn't work with it -- this function only ever worked with CJK double-byte code pages. Perhaps code that didn't expect it to work in places where it won't might be preferred?
Now that is what I find most frustrating! :-)
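For UTF-8 specifically, the "don't break in the middle of a character" check can be done without any locale, because continuation bytes always have the bit pattern 10xxxxxx. The following is only a sketch under stated assumptions: utf8_safe_prefix_len is a hypothetical helper name, the input is assumed to be valid UTF-8, and "character" here means a code point, not a grapheme cluster.

```c
#include <stddef.h>

/* Hypothetical helper: returns the largest n <= max_bytes such that
   cutting s at byte offset n does not land inside a multi-byte UTF-8
   sequence. Assumes s[0..len) is valid UTF-8. */
size_t utf8_safe_prefix_len(const char *s, size_t len, size_t max_bytes)
{
    size_t n = (max_bytes < len) ? max_bytes : len;

    if (n == len)
        return n;               /* cutting at the very end is always safe */

    /* Back up while the byte at the cut point is a continuation byte
       (10xxxxxx); such a byte belongs to the preceding character. */
    while (n > 0 && (((unsigned char)s[n]) & 0xC0) == 0x80)
        n--;

    return n;
}
```

For the four-byte string "a", "é" (0xC3 0xA9), "b", asking for at most 2 bytes yields 1, since offset 2 would split the "é". This does not replace _mbstrlen_l() for other encodings; it only covers the UTF-8 case.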
Sorin Ionuț Sbârnea on 29 Apr 2010 4:40 AM:
I would like to know if there is a way of using UTF-8 in your source code that would compile and run on Microsoft and non-Microsoft compilers (like gcc). There is an interesting question at http://stackoverflow.com/questions/688760/how-to-create-a-utf-8-string-literal-in-visual-c-2008 but I do not like the current accepted answer: no solution. Is this still true, or did MSVC 2008 or 2010 introduce some changes here?
Michael S. Kaplan on 29 Apr 2010 9:14 AM:
Not really related ssbarnea, except tangentially. Perhaps you wanted to put something in the Suggestion Box? :-)
Sorin Ionuț Sbârnea on 6 May 2010 3:09 PM:
Thanks Michael, but I'm sure that the suggestion box is already full of other stuff.
I would really like to see if you could come up with a simple Unicode "¡qʃɹoʍ oʃʃǝɥ" application that will compile and run on Windows, OS X and Linux.
Michael S. Kaplan on 6 May 2010 5:28 PM:
It is not full at all; there is one item in it at the moment.
I have a lot of other topics, and that list is one of my later to-do lists; I do not track random comments in unrelated blogs. So if you are okay with me potentially never getting to it then I suppose here is fine too....