by Michael S. Kaplan, published on 2005/01/05 21:07 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/01/05/347394.aspx
I was looking at Elyasse's Weblog and was reminded of one of the coolest feature entries in Whidbey.
I think I have been waiting roughly 112 versions of the Microsoft compilers for this. Well, probably not that many but it does feel like that....
New in Whidbey! From the help:
C4819 occurs when an ANSI source file is compiled on a system with a codepage that cannot represent all characters in the file.
To resolve C4819, save the file in Unicode format.
This is incredibly cool.... :-)
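To make the condition in that help text concrete, here is a minimal sketch (not the compiler's actual detection logic, which the help text does not describe) that flags a source buffer as code-page dependent. It assumes the file is at risk when it has no UTF-8 byte order mark and contains at least one byte outside the 7-bit ASCII range, since the usual Windows ANSI code pages agree on the 7-bit range but not on the bytes above it. The function names are hypothetical and belong to no compiler or SDK.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only. */

/* Returns 1 if the buffer starts with the UTF-8 byte order mark EF BB BF. */
int has_utf8_bom(const unsigned char *buf, size_t len)
{
    return len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF;
}

/* Returns the offset of the first byte >= 0x80,
   or len if every byte is 7-bit ASCII. */
size_t first_non_ascii(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (buf[i] >= 0x80) {
            return i;
        }
    }
    return len;
}

/* Returns 1 if the buffer has no BOM and contains bytes whose meaning
   depends on the system code page (assumption: only bytes >= 0x80 do). */
int codepage_dependent(const unsigned char *buf, size_t len)
{
    if (has_utf8_bom(buf, len)) {
        return 0;
    }
    return first_non_ascii(buf, len) != len;
}
```

A file for which codepage_dependent returns 1 is a candidate for C4819 on some system, and first_non_ascii gives the offset of the first byte worth inspecting.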
This post is sponsored by "©" (a.k.a. U+00A9, COPYRIGHT SIGN)
# Anon on 8 Jan 2005 4:21 PM:
# Michael Kaplan on 8 Jan 2005 5:35 PM:
# Anon on 9 Jan 2005 8:03 AM:
# Michael Kaplan on 9 Jan 2005 9:38 AM:
# Vladislav Vaintroub on 6 May 2008 6:05 PM:
Too much i18n does not seem good for the compiler.
Michael, with all due respect, I cannot share your view on this "incredibly cool" feature. I think it is incredibly uncool.
The bad thing about this warning, which can result in an error like the one here
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=341454
is that C strings, the null-terminated arrays of bytes, do not have any encoding information per se, i.e. they are supposed to be treated as opaque arrays of bytes. Now, I have a perfectly valid C file, containing ASCII only, except for UTF-8 bytes in strings (UTF-8 for a good reason: I intend to edit this file in a UTF-8 editor). And such a file will now break with an incomprehensible message on Whidbey on Japanese Windows.
The connect bug is now resolved with Won't Fix, so I cannot even hope that this will be fixed in the next version of the compiler.
Alternatives for me?
1) Documentation and support say: add a BOM to the file. No way, then it will break on older compilers and on non-Microsoft compilers.
2) #pragma setlocale?
Does not work.
3) Convert the strings to their hex byte array form,
something like
char foo[] = {0xba,0xad,0xf0,0x0d,0x00};
That will work, but it will look ugly, and I'll have to forget about editing this file in my wonderful UTF-8-capable editor, the VS2005 IDE.
Or forget about getting this file compiled on Japanese Windows. It is not important *for me* anyway. This compiler works quite well in latin1 territories :)
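As an illustration of alternative 3, the same bytes can also be written as hex escapes inside an ordinary string literal, which keeps the source file 7-bit ASCII. This is only a sketch: the variable names are hypothetical, and the second string ("école" in UTF-8) is an assumed example that does not come from the comment. It shows one pitfall: a hex escape absorbs every hex digit that follows it, so a literal has to be split wherever the next character is itself a hex digit.

```c
#include <string.h>

/* The four bytes from the example above, written as hex escapes.
   The terminating 0x00 is added by the string literal itself. */
const char foo[] = "\xba\xad\xf0\x0d";

/* U+00E9 is C3 A9 in UTF-8. Writing "\xC3\xA9cole" would not work,
   because the 'c' would be read as part of the \xA9 escape, so the
   literal is split and the compiler concatenates the two pieces. */
const char ecole[] = "\xC3\xA9" "cole";

/* Returns the length in bytes (not characters) of the second string. */
size_t ecole_byte_length(void)
{
    return strlen(ecole);
}
```

The source stays readable by any compiler regardless of code page, at the cost the comment already names: the text can no longer be edited as text.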
# Michael S. Kaplan on 6 May 2008 6:37 PM:
Since the BOM does exist, you could also petition the other compilers to start recognizing it, too. I'm sorry, but I agree with Jonathan Caves on this issue -- use the BOM and you are golden.
referenced by
2010/06/05 Not all developers speak English. Dammit.
2008/05/19 Everyone seems averse to the BOM these days; Should we blame TSA? :-)
2005/12/09 More on the C4819 error
2005/11/23 100% roundtrip ASCII? 100% roundtrip ANSI?
2005/10/28 It isn't Unicode, it's Double Secret Unicode!
2005/01/08 How does it detect invalid characters?