When is a backslash not a backslash?

by Michael S. Kaplan, published on 2005/09/17 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/09/17/469941.aspx


The character in question is U+005c, the REVERSE SOLIDUS, also known as the backslash or '\'. It is the path separator on Windows, and it is encoded at 0x5c in all of the ANSI code pages.

Since path separators are a pretty important requirement, the title of this post may seem a little scary -- how could it not be a backslash, a reverse solidus?

Well, on Japanese code page 932, 0x5c is the YEN SIGN, and on Korean code page 949, 0x5c is the WON SIGN.

Which is not to say that 0x5c does not act as a path separator -- it still does. And which is also not to say that the Unicode code points for the Yen and the Won (U+00a5 and U+20a9) do act as path separators -- because they do not.
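A minimal sketch of that distinction, using Python's standard library rather than the Windows API itself: `PureWindowsPath` models Windows path-splitting rules on any platform, and Python's `cp932` and `cp949` codecs stand in for the code pages. The variable names and the `describe` helper are illustrative only.

```python
import unicodedata
from pathlib import PureWindowsPath

BACKSLASH, YEN, WON = "\u005c", "\u00a5", "\u20a9"

def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# In Python's codecs, byte 0x5c decodes to U+005C under both code pages,
# whatever glyph a Japanese or Korean font would draw for it.
decoded_932 = b"\x5c".decode("cp932")
decoded_949 = b"\x5c".decode("cp949")

# Only U+005C splits a Windows-style path; the currency signs are ordinary
# file name characters as far as path parsing is concerned.
parts_backslash = PureWindowsPath("dir" + BACKSLASH + "file.txt").parts
parts_yen = PureWindowsPath("dir" + YEN + "file.txt").parts
parts_won = PureWindowsPath("dir" + WON + "file.txt").parts

for ch in (BACKSLASH, YEN, WON):
    print(describe(ch))
print(parts_backslash, parts_yen, parts_won)
```

The three characters are distinct code points, and only the first one produces a two-component path.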

Of course the natural round trip mapping between U+005c and 0x5c happens on all code pages, and both U+00a5 and U+20a9 have one-way 'best fit' mappings to 0x5c on their respective code pages. The need to share 0x5c technically went away with Unicode, when the characters were encoded separately.
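The difference between the round trip and the one-way mapping can be sketched as follows. This is an illustration under stated assumptions, not the actual Windows conversion tables: `BEST_FIT` and `encode_best_fit` are hypothetical names, and the table holds only the two mappings described above. The table is spelled out explicitly because Python's own codecs do not necessarily apply Windows best-fit behavior.

```python
# Hypothetical best-fit table holding only the two one-way mappings
# named in the post; the real Windows tables are much larger.
BEST_FIT = {
    "cp932": {"\u00a5": b"\x5c"},  # YEN SIGN -> 0x5c on code page 932
    "cp949": {"\u20a9": b"\x5c"},  # WON SIGN -> 0x5c on code page 949
}

def encode_best_fit(text: str, codec: str) -> bytes:
    """Encode text, applying the best-fit table before the normal codec."""
    table = BEST_FIT.get(codec, {})
    out = bytearray()
    for ch in text:
        out += table[ch] if ch in table else ch.encode(codec)
    return bytes(out)

# Round trip: U+005C -> 0x5c -> U+005C.
round_trip = "\u005c".encode("cp932").decode("cp932")

# One way: U+00A5 -> 0x5c, but decoding 0x5c gives back U+005C, not U+00A5.
yen_bytes = encode_best_fit("\u00a5100", "cp932")
yen_back = yen_bytes.decode("cp932")

# The same holds for the won sign on code page 949.
won_bytes = encode_best_fit("\u20a9100", "cp949")
won_back = won_bytes.decode("cp949")

print(round_trip == "\u005c", yen_bytes, yen_back == "\u00a5100")
```

Once a yen or won sign has passed through the code page, the distinction from the backslash is lost, which is what makes the mapping one-way.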

However, the issue is not simply that the old code pages had no room while Unicode has plenty, so that customers would instantly move away from path separators that do not look like backslashes.

In practice, after many years of code page based systems in Japan and Korea using their respective currency symbols as the path separators, it is believed customers were simply used to this appearance. And there was therefore little interest in changing that appearance (when the system settings were Japanese or Korean) to anything but those symbols.

To support this expectation, Japanese and Korean fonts, whenever the default system locale is set to Japanese or Korean, respectively, will display the currency symbol rather than the backslash when U+005c is shown.

But whether or not this is really what customers want is still an open question. Andrew Tuck of PSS here at Microsoft noted:

When one of my customers from Korea was visiting here, I asked him if it bothered him that the backslash doesn’t appear as a backslash. It did bother him, and he believes it bothers most of his countrymen. However, he was fatalistic about it, "What can we do to change it. It’s been this way for a long time. We are used to it."

Hardly a glowing recommendation, is it?

And as Norman Diamond noted in his comments on this very blog (in this post), there are plenty of people in Japan who may not care for the convention, either.

Of course there is no 'right' answer here, and I would imagine that you would find plenty of people who would be unhappy with such a change, just as there are those who would be unhappy with the status quo. Which perhaps explains why the status quo seems to be as it is -- those people who would like a change are resigned to the idea that it may never happen. And so they are now used to it....

 

This post brought to you by "\", "¥", and "₩" (U+005c, U+00a5, and U+20a9, a.k.a. REVERSE SOLIDUS, YEN SIGN, and WON SIGN)


# koji on 15 Oct 2005 8:19 AM:

IIRC, if you go back a little further, ASCII defines 0x5c as one of the "localizable" code points, and that is why several countries have several different glyphs here.

DOS 2 made a mistake by choosing such a localizable code point as the path separator. Well, I don't think anyone can blame it for that, though.

Whether we should fix this glyph or not is as good an open question as whether we should fix the path separator to "/", which is not a localizable code point in ASCII.

And, although both are good questions, I don't think anyone could fix either.

# not given on 11 Nov 2007 6:02 PM:

Was a workaround ever found for this issue? Is there a way to keep Japanese language support in Windows and have the backslash ( \ ) display correctly in address fields instead of the yen symbol?

It works correctly when typing in forums, but not in the address fields. Why is this?

# Michael S. Kaplan on 11 Nov 2007 6:09 PM:

It is all about the font selected and sometimes the technology doing the rendering -- and for every person who considers one behavior to be a bug, there is another who thinks the other is a bug.

Which essentially makes it unfixable, at least for everyone....

# Mike on 5 Jan 2008 6:26 PM:

I got the same thing, after playing Clannad the primary fonts seem to be MS Gothic lol, yet it still types as \ here.

# Michael S. Kaplan on 14 Feb 2008 10:42 AM:

# spitzak on 22 Jan 2009 10:30 PM:

The reason 0x5c still prints a Yen is not because of Windows paths, but because of Windows *text* files. There must be vast numbers of Japanese Windows text files where 0x5c is used *as* a Yen symbol. You can't make them all suddenly display backslash. And there is not likely any intelligent way to figure out whether a Yen or backslash is intended and translate the documents.

# Anonymous on 22 Apr 2009 4:04 PM:

I recently set up Japanese language support on my Vista machine - it was a surprise to see cmd.exe rendering paths wrong.  My fix?  Use xterm from the cygwin toolkit as your terminal.  It looks better than the Windows terminal as well.

Also, >>spitzak :

It would be an easy enough fix, actually.  Just add another backwards compatibility option "Render \ as ¥ (Use for pre-[year] text documents)" to the menu available when alt-clicking a file.  That way, people who needed the old style would be able to use it and the general population would be able to see actual '\' characters where they belong.

# Russ on 9 Jun 2009 6:17 PM:

Since '\'s are contained in text files where the codepage isn't defined one way or the other, the only sane thing to do going forward is to define it as a '\' and not a Won or Yen. The currencies differ by a factor of about 10. Since we now regularly send files around this thing called the "Internet", it should not be guessed that it is a Won or Yen. Stop the insanity from going any further! Why wasn't this changed when Unicode support was added to Windows?

If the '\' is contained in a document where the codepage is defined, then sure, make it the appropriate Yen or Won symbol and if converted to Unicode, save it as the appropriate Unicode symbol.

# Pat on 18 Jun 2009 11:14 AM:

The last post is totally relevant.  This really should be fixed.  For people in IT who need to make screenshots of file paths and related documentation this is a real pain.

# Michael S. Kaplan on 18 Jun 2009 1:33 PM:

People in IT who need to make screenshots need to be more respectful of decades-long differences in their target markets. And if they want to support customers in other countries they need not insist that the other countries change to make the IT life easier....

Also, people in IT making documentation for Japan and Korea should have the documentation in Japanese and Korean anyway -- if not then their yen/won/solidus issues are *not* why their docs are not appreciated!

# Josh on 27 Jan 2010 9:26 PM:

just another example of how incredibly worthless Windows is.


referenced by

2013/10/31 At long last, explaining the yen/won/backslash bug plausibly

2007/10/18 Trying to get people to use Unicode? Lock and load, baby!

2007/03/28 A yen for Yen may be left unsatiated

2006/05/26 Two chickens in every pot, and an ASCII in every code page

2006/03/17 On the fuzzier definition of a 'Unicode application' on Win9x....

2005/11/01 I WON to talk about the YEN

2005/10/14 The mission of GIFT
