The Unicode train is leaving the station, 2010 edition (aka How to do a feature in two versions)

by Michael S. Kaplan, published on 2010/01/11 09:16 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/01/11/9946618.aspx


I got an email from Mike the other day:

Hi Michael,
 
Just a quick FYI, a bit of great news (I guess 15 years is as good a time as any). VC2010 now generates Unicode RC files (when using the project wizard to generate a new app). Wow, I never thought I'd see the day. It was a great day when VC2005 actually supported opening and saving of Unicode RC files, but this is the icing on the cake.
 
Now all those people using obsolete source control systems and diff utilities are really gonna have to update to support these newly generated projects that include Unicode RC files or they're in for a surprise :)

Woo hoo!

I agree this is very good news, and very good icing on this particular cake.

Remember when I talked about the first part of this, in The Unicode train is leaving the station, back in 2005?

I remember how awful the diffing situation was inside Microsoft until someone updated WinDiff to support Unicode; Mike has a good point about those other diffing programs!


Sam on 11 Jan 2010 3:09 PM:

Where is the unicode-enabled windiff available from?

muntyan on 12 Jan 2010 10:25 AM:

What's "unicode" here, is it UTF-16 or UTF-8?

Karellen on 12 Jan 2010 2:59 PM:

"Unicode" in the Microsoft/Windows world means "UTF-16". Actually, it more likely means "UCS-2" (e.g. in NTFS), but it could be UTF-16.

Windows can't do UTF-8 as MB_LEN_MAX is 2, which is a compile-time constant and therefore cannot be changed without fundamentally breaking the Windows ABI, which is just not possible.

It would have been possible to up MB_LEN_MAX in 64-bit Windows when there was necessarily a brand new ABI, but I don't think they did and it's too late now. :-(

Andrew West on 13 Jan 2010 4:36 AM:

My version of WinDiff supports UTF16LE if it has a BOM, but apparently no other flavours of Unicode. It would be nice if it supported UTF8 as I use it a lot, and most of my files are UTF8 encoded.
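
The BOM check mentioned in the comment above can be sketched roughly as follows. This is only an illustration in Python and makes no claim about how WinDiff is actually implemented; the function name `sniff_bom` and the table `_BOMS` are hypothetical, while the BOM constants come from Python's standard `codecs` module.

```python
import codecs

# Order matters: the UTF-32 LE BOM begins with the same two bytes as the
# UTF-16 LE BOM, so the longer signatures are checked first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A tool that only recognizes the UTF-16 LE signature would correspond to keeping just that one entry in the table; a file without any BOM returns `None` here and would need some other heuristic or a user-supplied encoding.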

muntyan on 14 Jan 2010 5:46 AM:

Karellen, Windows can do UTF-8 all right; it doesn't matter what C's wchar_t is. What you pass to Windows API functions is one thing, what you store on disk is another.

Not using UTF-8 for text is just screwing up everybody else (including even Microsoft tools): you need every tool which reads text files to support UTF-16 even to read simple English text! We don't care much about others, do we?
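
The distinction drawn above between the on-disk encoding and the form handed to an API can be illustrated with a small sketch. It is written in Python purely for illustration, and the helper name `to_api_form` is hypothetical; it assumes the file is stored as UTF-8 and that the consumer wants UTF-16 LE code units, the form the wide-character Windows APIs take.

```python
def to_api_form(disk_bytes: bytes) -> bytes:
    """Convert UTF-8 bytes as stored on disk into UTF-16 LE bytes.

    Assumption for this sketch: the file is valid UTF-8 without a BOM.
    """
    return disk_bytes.decode("utf-8").encode("utf-16-le")


# Plain English text stored as UTF-8 is byte-for-byte identical to ASCII,
# so tools that know nothing about Unicode can still read it.
on_disk = "Hello".encode("utf-8")
in_memory = to_api_form(on_disk)

print(on_disk == b"Hello")           # True
print(len(on_disk), len(in_memory))  # 5 10
```

The conversion happens at the boundary, so the choice of storage encoding does not constrain what the API receives.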

Mihai on 15 Jan 2010 10:50 AM:

Nice to see it, but the MFC combo-box bug (storing the combo-box content as hex) still generates question marks. (All the hex values are 3C).

(https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=276957)

Although storing text as hex values is horrible for localization, having the content of the combo-box "right there" is very popular with developers. And for localization it is also nice to be able to test whether all entries fit the width of the control.

Bob Bobson on 14 Apr 2012 7:10 PM:

I too would like to know if/where a Unicode copy of WinDiff is available. The 6.1.7716.0 version certainly doesn’t work, at least not for files and folders with Unicode names.

Michael S. Kaplan on 15 Apr 2012 1:35 PM:

As far as I know, it has never been released externally?



referenced by

2012/03/26 The Unicode train left the station YEARS ago, in fact! (2012 edition)
