Double Bite Character Set

by Michael S. Kaplan, published on 2009/04/07 10:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2009/04/07/9534846.aspx

It was just this last Sunday that long time reader Yuhong Bao wrote over in the Suggestion Box:

BTW, someone mistyped DBCS as Double Bite Character Sets:

http://www.microsoft.com/downloads/details.aspx?FamilyID=0e56788b-32e8-459d-b9c9-b9155a4836b4

I was wondering if DBCS is really that painful.

Interesting question, no? :-)

In case someone fixes the bug, here is the screenshot from Update for Windows XP (KB961503):

The KB article (961503: You cannot input characters as expected by using a non-English Input Method Editor in Windows Live Messenger on a Windows XP-based computer) does not have this problem, as it does not mention either BYTE or BITE. It is just the download page.

There are those who would claim that DBCS does at times bite.

So perhaps it is amusing to think of this as a Freudian slip of some sort.

Though I suspect it to be more likely either a simple typo or one of those "typed by someone who does not understand after being briefed by someone who does" kind of situations....

Or maybe it is an unexpected example of someone making the mistake I described in We're back and we're embarrassing ourselves? (aka Making your localizer's life easier, Part 2) about how people try to spell out acronyms to avoid confusion:

And the guidelines themselves often fail to assist: for example, in documentation on the first occurrence of an acronym one is expected to spell out the acronym. But if one finds GDI confusing one is unlikely to find GRAPHICS DEVICE INTERFACE to be the magical road to understanding. In fact, the guidelines can often increase confusion!

Even if the person doing it did not know what it really stood for!

Thinking back to the days of

Single byte character set (SBCS)
Multibyte character set (MBCS)
Double byte character set (DBCS)

I really do remember times that I really felt like saying (or even screaming) THIS BITES so there is a part of me that is inspired by the "Freudian slip" explanation. :-)

Using Unicode is always easier. Always.

So maybe we should all treat the B in SBCS, MBCS, and DBCS as BITE rather than BYTE. :-)

The Unicode characters were released from their original contracts when SiaO went on hiatus; only time will tell if the Characters Union (AFL-CIO) is willing to negotiate new contracts for the characters it represents...

Erik F on 7 Apr 2009 8:11 PM:

I don't think that the Unicode characters would want to sponsor this post anyways because this was about MBCS characters (and you know how the two don't get along!)

Michael S. Kaplan on 8 Apr 2009 12:40 AM:

Of course they would -- they love sticking it to MBCS, any chance they get!

Yuhong Bao on 11 Apr 2009 11:16 PM:

"In case someone fixes the bug"

It did get fixed, look at the current download page.

Yuhong Bao on 11 Apr 2009 11:37 PM:

"There are those who would claim that DBCS does at times bite."

For example, look at the bug itself. Any chance it could be related to DBCS troubles?

Michael S. Kaplan on 13 Apr 2009 2:43 AM:

Possibly, but not necessarily. Someone would have to dig in there; it is more likely to be keyboard special handling code that does not take the IME into account. In this case DBCS is often the "ghetto" way of referring to CJK....

Yuhong Bao on 23 Apr 2009 6:13 PM:

You mistyped "DLL Help" as "DLL Hell" in another blog post:

http://blogs.msdn.com/michkap/archive/2007/09/10/4850360.aspx

Michael S. Kaplan on 27 Apr 2009 5:49 AM:

It is not mistyped -- the database had that other name when it was an internal support tool; the "nicer" name was a part of it being made public. Most people who used it before use the old name....

Yuhong Bao on 11 Oct 2010 7:54 PM:

Note that some DBCS character sets are less painful than others. For example, some DBCS character sets (like the Windows codepages) use part of the ASCII range for the trail byte, which includes characters like backslash and brackets, while others (like the EUC series) stick with the extended ASCII range.
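
To make the trail byte point concrete, here is a minimal Python sketch. It assumes Windows code page 932 as the example Windows codepage and EUC-JP as the example from the EUC series; the character "表" and the helper name has_ascii_trail are illustrative choices only, not something taken from the post.

```python
# Illustrative sketch: compare the trail byte of one character in two encodings.
# "表" (U+8868) is chosen as an example; cp932 and euc_jp are Python's standard
# codec names for Windows code page 932 and EUC-JP.
ch = "表"

sjis = ch.encode("cp932")
euc = ch.encode("euc_jp")


def has_ascii_trail(encoded: bytes) -> bool:
    """Return True if a two-byte sequence has a second byte below 0x80."""
    return len(encoded) == 2 and encoded[1] < 0x80


print("cp932 :", sjis.hex(), "ASCII-range trail byte:", has_ascii_trail(sjis))
print("euc_jp:", euc.hex(), "ASCII-range trail byte:", has_ascii_trail(euc))

# A naive byte-level search for a backslash (0x5C) matches inside the cp932
# bytes even though the text contains no backslash character at all.
print("backslash byte in cp932 bytes: ", b"\\" in sjis)
print("backslash byte in euc_jp bytes:", b"\\" in euc)
```

Code that scans bytes for path separators or escape characters without tracking lead bytes can misfire on the first encoding and not on the second, which is one way DBCS can bite.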
