by Michael S. Kaplan, published on 2006/07/05 04:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/07/05/656283.aspx
There is an old expression about a person being like a dog with a bone -- meaning that they really want to keep at something in particular, and they don't want to let it go.
We have something similar here in this blog, and it is regular reader Ivan Petrov, who is really looking for a way to extend code pages, and he has asked the question in various ways in the past. They are interesting questions, and the issues tied up in them are also interesting, though the answers are not always what he might like. :-(
The latest question, in the Suggestion Box:
It will be possible some day to add custom Codepages in Windows? Because, as we can see here:
http://www-03.ibm.com/servers/eserver/iseries/software/globalization/codepages.html,
and here:
http://www.i18nguy.com/unicode/codepages.html
there're too many Codepages not supported by Microsoft.
Regards,
Ivan.
As I have mentioned before, code pages are never going to be enough, though. The answer here, the only answer, is Unicode. Code pages other than the "default system code pages" of a locale see little use, which makes most of the code pages on Windows of pretty limited utility anyway; the ability to add more such code pages would be of even more limited use.
Certainly the migration of legacy data into Unicode is an interesting scenario, and one that in a past life as a consultant I have often worked on, providing short term and long term solutions. The need to do such transformations conformant to various national and industrial standards has existed for quite some time, and will likely continue to exist.
Hell, if you think about many of IBM's EBCDIC code pages, they were often provided by IBM to meet customer needs for such mappings. It is a business for everyone, really....
Does that make it important to support custom code pages on Windows at some point?
Well, maybe -- if a compelling case can be made for such data migration -- I have heard rumor that IBM has in the past produced EBCDIC code pages for individual customers, so clearly many companies have felt such pressures before.
But is it more important than opening up other areas of Windows to customization, as has already been happening with fonts, keyboards, and locales?
Well, maybe not -- since mappings such as the ones on the IBM site or on Tex's are hardly out of the reach of developers today....
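To illustrate why such mappings are within reach, here is a minimal sketch of decoding single-byte legacy data with a published mapping table. It assumes the table uses the common "0xNN 0xUUUU # name" text layout; the function names and the three-entry sample table are hypothetical and do not correspond to any real code page.

```python
def parse_mapping(text):
    """Parse lines of the form '0xNN 0xUUUU  # comment' into {byte: character}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without an assigned character
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table


def decode_with_table(data, table, fallback="\ufffd"):
    """Map each byte through the table; unmapped bytes become U+FFFD."""
    return "".join(table.get(b, fallback) for b in data)


# Hypothetical sample table, for illustration only (not a real code page).
SAMPLE_MAPPING = """\
0x41    0x0041  # LATIN CAPITAL LETTER A
0x80    0x04F8  # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
0x81    0x04F9  # CYRILLIC SMALL LETTER YERU WITH DIAERESIS
"""

table = parse_mapping(SAMPLE_MAPPING)
text = decode_with_table(b"A\x80\x81\xff", table)
```

This only covers single-byte code pages. Multi-byte and stateful encodings need more than a lookup table, which is part of why a general "custom code page" feature is a bigger request than it first appears.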
The big use that we all have for code pages is actually as repertoire definitions -- because in this world where it is hard to know what letters are important to a language, such a standard is a good indication of what someone felt was a reasonable subset!
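As a rough sketch of the "repertoire definition" idea, the following uses the codecs that ship with Python to report which characters of a string fall outside a given code page. The helper name is hypothetical, and Python's codec tables are assumed to be close enough to the Windows ones for this purpose.

```python
def outside_repertoire(text, codepage):
    """Return the distinct characters of text that codepage cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


russian = "Привет"
in_1251 = outside_repertoire(russian, "cp1251")     # Cyrillic code page
in_1252 = outside_repertoire(russian, "cp1252")     # Western European code page
yeru = outside_repertoire("\u04f8", "cp1251")       # Ӹ is not in cp1251
```

An empty result means every character is in the code page's repertoire, which says nothing about whether the data should actually be stored in that code page.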
In the meantime, the owner of encodings on Windows is likely to have thoughts on the issue (though the first answer to just about any question about code pages from him is Use Unicode!), so you can ask Shawn if you are curious. :-)
This post brought to you by Ӹ (U+04f8, a.k.a. CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS)
# Nick Lamb on 5 Jul 2006 6:25 AM:
# ML on 5 Jul 2006 6:33 AM:
# Michael S. Kaplan on 5 Jul 2006 10:24 AM:
# Michael S. Kaplan on 5 Jul 2006 10:24 AM:
# Dean Harding on 5 Jul 2006 10:26 PM:
# John Elliott on 6 Jul 2006 7:14 AM:
# KJK::Hyperion on 6 Jul 2006 7:31 AM:
# Michael S. Kaplan on 6 Jul 2006 10:14 AM:
referenced by
2007/05/19 Custom code pages? Redux
2007/01/04 Whither intl.inf in Vista?