The song^H^H^H^Hbug remains the same

by Michael S. Kaplan, published on 2010/08/20 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/08/20/10052314.aspx


In the background, The Song Remains the Same is playing. Ironic much?

The question came up the other day:

{REDACTED} uses notepad to look at info from text files that contain redirect output from commands when troubleshooting user systems.  This has worked for them until they added Russian and Greek to their multilang images.

In Windows 7 (and probably all other versions) on Russian

If you pipe the output of a command to a text file such as ipconfig.txt


Настройка протокола IP для Windows

   Имя компьютера  . . . . . . . . . : Xx
   Основной DNS-суффикс  . . . . . . :
   Тип узла. . . . . . . . . . . . . : Гибридный
   IP-маршрутизация включена . . . . : Нет
   WINS-прокси включен . . . . . . . : Нет
   Порядок просмотра суффиксов DNS . : redmond.corp.microsoft.com

……

Opening the file in notepad gives

I have tried various fonts and Cyrillic script option in notepad but have not gotten complete Russian output.

Is there a combination that works for notepad?  Word can display these correctly but Office will not be available on all of their systems.

Greek has similar issues.

Thanks,

Regular readers probably know what is going on already.

Though the mention of fonts in the question may throw people off a skosh!

What is happening is that the console writes its output in the OEM code page, while Notepad reads non-Unicode files as if they were in the ANSI code page.
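A minimal sketch of that mismatch, using the first line of the ipconfig output quoted above. It assumes cp866 is the OEM code page and cp1251 the ANSI code page on a Russian system (those numbers are not stated in the question), and `misread` is a hypothetical helper name used only for illustration:

```python
def misread(text, written_as, read_as):
    """Encode text in one code page, then decode the same bytes in another."""
    # errors="replace" keeps the sketch from raising on any byte that the
    # reading code page leaves unassigned.
    return text.encode(written_as).decode(read_as, errors="replace")


sample = "Настройка протокола IP для Windows"

# Assumption: console output is written as cp866 (OEM),
# while Notepad interprets the bytes as cp1251 (ANSI).
seen_in_notepad = misread(sample, "cp866", "cp1251")

# Reading the bytes back in the code page they were written in round-trips.
seen_in_console = misread(sample, "cp866", "cp866")
```

The ASCII parts ("IP", "Windows") survive in `seen_in_notepad` because both code pages agree on the ASCII range; only the Cyrillic letters come out wrong, and no choice of font can undo that.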

Personally, the part of the message that excited me most was the fact that more companies that build lots of Windows images are moving into building other language images. :-)

To be honest, it makes me wonder whether adding one more encoding choice to the Notepad load list and save list for the CP_OEMCP would make sense.

I mean, given how many commonly used command line tools will have output in these code pages, how much would it really cost to add?

I think they really ought to add this. For the sake of this very scenario.
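Until Notepad offers such a choice, one possible workaround is to re-encode the redirected output before opening it. The sketch below is only an illustration: the function names are hypothetical, and the default of cp866 is an assumption that fits a Russian system (a Greek system would need its own OEM code page passed in). It writes UTF-8 with a byte order mark, which Notepad recognizes.

```python
from pathlib import Path


def oem_bytes_to_utf8_bom(data: bytes, oem_encoding: str = "cp866") -> bytes:
    """Decode OEM-encoded bytes and re-encode them as UTF-8 with a BOM."""
    text = data.decode(oem_encoding)
    # "utf-8-sig" prepends the EF BB BF signature when encoding.
    return text.encode("utf-8-sig")


def reencode_file(src: str, dst: str, oem_encoding: str = "cp866") -> None:
    """Read an OEM-encoded text file and write a UTF-8 (with BOM) copy."""
    # Working on raw bytes leaves the original line endings untouched.
    Path(dst).write_bytes(oem_bytes_to_utf8_bom(Path(src).read_bytes(), oem_encoding))
```

Something like `reencode_file("ipconfig.txt", "ipconfig-utf8.txt")` would then give a copy that opens correctly. When the script runs on Windows itself, Python 3.6 and later also accept the codec name `"oem"`, which follows whatever the system's OEM code page happens to be.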

A fuller work item, adding it to the detection list (described in this blog post), would be a bit more problematic, though.

I'm going to see if any of the regular readers want to guess why detection based on simple "stupid byte tricks" (apologies to David Letterman!) would be complicated (I'll answer too in the comments, eventually)....
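As a starting point for guessing, here is one observation, offered as an assumption about where the trouble begins rather than as the answer itself: a validity check, the kind that works well for spotting UTF-8, tells you nothing when choosing between two single-byte code pages. The sketch assumes cp866 and cp1251 as the Russian OEM and ANSI code pages, and `decodes_cleanly` is a hypothetical helper:

```python
def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if every byte sequence in data is valid in the encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# First line of the quoted ipconfig output, as the console would write it
# (assumption: OEM code page cp866).
oem_bytes = "Настройка протокола IP для Windows".encode("cp866")

results = {
    enc: decodes_cleanly(oem_bytes, enc)
    for enc in ("utf-8", "cp866", "cp1251")
}
```

Here `results` rules out UTF-8 but accepts both cp866 and cp1251, so any detection would have to judge whether the decoded text looks plausible, not merely whether it is valid.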


Yuri Khan on 20 Aug 2010 11:05 AM:

Every professional troubleshooter worth their salt would have a console-based text file viewer that supported the OEM code page. There is absolutely no reason to view ipconfig output in notepad.

Yuhong Bao on 20 Aug 2010 2:47 PM:

Yep, when I discussed the UTF-8 BOM problem in another thread, I mentioned this issue too.

Evan on 20 Aug 2010 4:32 PM:

I don't think it should be a matter of adding encodings to the load list, as it would become awkward if you don't know the encoding from the start, and you need to reöpen the file endlessly until you find the right encoding, but more of an encoding menu option similar to IE or Firefox, this way, one can see what the file looks like in various encodings, without being required to use a different app for viewing it in a different encoding, ie. Notepad and Internet Explorer.

Doug Ewell on 24 Aug 2010 8:46 AM:

> detection based on simple "stupid byte tricks"

As it turns out, it isn't that hard to detect the encoding of known Cyrillic text (of non-trivial length) based on some fairly simple tables. The trap, of course, is how far into the weeds this effort goes when the text isn't Cyrillic after all.

Alkaline Champ on 14 Sep 2010 8:35 AM:

A KB article of it was published: support.microsoft.com/.../2360394

