Custom encodings in Word?

by Michael S. Kaplan, published on 2006/01/27 03:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/01/27/517732.aspx


Ivan Petrov asked in the Suggestion Box:

Hi Michael.

When I open a text file in Microsoft Office Word, Word attempts to detect the encoding standard used for text in the file. Word can automatically detect most encoding standards. When the file's encoding standard matches the default encoding standard used to save files as plain text in the version of Microsoft Windows I'm running, Word opens the file directly.

If Word cannot detect the encoding standard, or if it detects an encoding standard that doesn't match the default standard used by Windows, I must verify or choose the encoding standard from a list in the File Conversion dialog box. Word then uses the encoding standard I choose to convert the file to Unicode. I can preview the text to check whether it is readable before I open the file.

So my question is:
"If I have tons of TXT files encoded in a custom (not supported by Microsoft!) encoding standard (for example, the Bulgarian MIK encoding standard), is it possible in Word to create and add a custom encoding standard, which would then be displayed in the list of encoding standards in the File Conversion dialog box, so that I can open and view all these custom-encoded TXT files correctly? And if "YES", how?"

Regards,
Ivan.

The answer is probably not one that Ivan is going to like very much. (It is unfortunate, but I have reason to believe that he is the person most often unhappy with the answers he gets from this blog. Sorry, Ivan!)

I believe that Microsoft Word may actually be using MLang's code page detection code to guess at what the code page may be, and I know it also takes some extra steps beyond that. But there is no mechanism to add code pages to the detection list in Word (or to the list of code pages in Windows).

Now of course this is another area where we have gotten feedback. The message has been clear: in some cases, moving to Unicode (or even to Windows) depends on data migration methods, and providing them can be important. But it is really not possible to say much at this point about what will be taken from that list and put into an actual product. This is the sort of thing that I will certainly comment on when I can, though....
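In the meantime, the practical route is to convert such files to Unicode outside of Word. The following is only a rough sketch in Python, not anything that Word, Windows, or MLang provides. It assumes that MIK places the Cyrillic letters А through я in bytes 0x80 through 0xBF, in the same order as U+0410 through U+044F. The names mik_to_unicode and convert_file are made up for illustration, and bytes 0xC0 through 0xFF (box drawing and other symbols) are deliberately left unmapped rather than guessed at.

```python
# Sketch only: convert Bulgarian MIK text to Unicode outside of Word.
# Assumption: MIK bytes 0x80-0xBF hold the letters А..я in the same order
# as U+0410..U+044F. Bytes 0xC0-0xFF are not mapped in this sketch.

REPLACEMENT = "\ufffd"  # marks bytes this sketch does not know how to map


def mik_to_unicode(data: bytes) -> str:
    """Decode MIK bytes, mapping only ASCII and the Cyrillic letter block."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                    # ASCII is unchanged
        elif b <= 0xBF:
            out.append(chr(0x0410 + (b - 0x80)))  # А..я
        else:
            out.append(REPLACEMENT)               # unmapped in this sketch
    return "".join(out)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read a MIK file and write it back out as UTF-8 with a BOM."""
    with open(src_path, "rb") as src:
        text = mik_to_unicode(src.read())
    # newline="" keeps the original line endings exactly as they were.
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(text)
```

Writing the output with a byte order mark is a choice made here so that the result identifies itself as UTF-8. Any replacement characters in the output show where a complete MIK table would still be needed.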

 

This post brought to you by "д" (U+0434, a.k.a. CYRILLIC SMALL LETTER DE)


bulgarian on 9 Jan 2008 8:25 AM:

I don't believe Word has that functionality, though you can use an old program called dos2win. I don't know where you can get it, though, other than from someone at the organization of the blind in Bulgaria.

I also found a very nice Cyrillic decoder on the net: http://2cyr.com/decode/

Cheers.



referenced by

2007/01/04 Whither intl.inf in Vista?

2006/07/05 Custom code pages?
