Since nobody @#%&*! owns en-US…

by Michael S. Kaplan, published on 2011/06/14 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/06/14/10174165.aspx


Long-time readers may recall "Who owns English, exactly?" from back in November 2006, where I talked about the philosophical issues underlying who was the "owner" of transliterations and/or translations of native language names into English.

Everything I said there was of course true, and the issues are of course very complicated.

But there is an additional factor to be considered here too, one much wider than the "corner case" of translated/transliterated language names.

The factor can be summed up in a simple question:

Who the @#%&*! owns en-US?

I mean, it's not like we have an English Language Academy we are beholden to, and there is no special ANSI standard we claim to be compliant with.

The original locale was put together by people in the US, sure. And English Windows has never suffered from not enough beta testers.

But all of that is hardly official, is it?

This occurred to me the other day when someone asked me via mail:

You have the value for LOCALE_IPOSITIVEPERCENT for the US locales as 0, shouldn’t this be 1?

Now the LOCALE_IPOSITIVEPERCENT lctype documentation delineates the choices pretty clearly:

Value   Format
0       Number, space, percent; for example, # %
1       Number, percent; for example, #%
2       Percent, number; for example, %#
3       Percent, space, number; for example, % #
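
As a rough illustration of what those four values mean in practice, here is a minimal Python sketch. It does not call the Windows API. The names POSITIVE_PERCENT_PATTERNS and format_positive_percent are hypothetical, made up for this example, and the templates simply mirror the table above.

```python
# Hypothetical lookup table mirroring the four documented
# LOCALE_IPOSITIVEPERCENT values; this is not how Windows implements it.
POSITIVE_PERCENT_PATTERNS = {
    0: "{n} %",   # number, space, percent
    1: "{n}%",    # number, percent
    2: "%{n}",    # percent, number
    3: "% {n}",   # percent, space, number
}


def format_positive_percent(number, pattern_value):
    """Lay out an already-formatted number with a percent sign."""
    if pattern_value not in POSITIVE_PERCENT_PATTERNS:
        raise ValueError(f"unknown positive percent pattern: {pattern_value!r}")
    return POSITIVE_PERCENT_PATTERNS[pattern_value].format(n=number)


if __name__ == "__main__":
    # Show the same number under each of the four patterns.
    for value in sorted(POSITIVE_PERCENT_PATTERNS):
        print(value, repr(format_positive_percent(50, value)))
```

With this sketch, the question in the mail comes down to whether the US locales should produce the output for 0 or the output for 1.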

Now I have lived in the USA my entire life, though I have traveled abroad a lot (enough to make filling out an SF-86 too troublesome to ever bother trying to get cleared!).

If you showed me a formatted string like

%50

or

% 50

I'd know it was wrong. But looking at

50%

vs.

50 % 

I could go either way; I see both of them, all the time.

To be honest I don't like either of them; I think

50 %

or

50 %

using a THIN SPACE (U+2009) or a HAIR SPACE (U+200A) in between is better than any of the other choices -- between the original two "not wrong" choices, one is too close and the other is too far.

But we don't have either as an option.
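
For anyone who wants to compare the narrower separators side by side, a small Python sketch along these lines may help. The helper percent_with_separator is hypothetical; the character names come from Python's standard unicodedata module, and how wide each space actually renders depends on the font.

```python
import unicodedata

# Candidate separators between the number and the percent sign.
SEPARATORS = ["\u0020", "\u2009", "\u200A"]


def percent_with_separator(number, separator):
    """Hypothetical helper: join number and percent sign with a separator."""
    return f"{number}{separator}%"


if __name__ == "__main__":
    for sep in SEPARATORS:
        # Print the code point, its Unicode name, and the formatted result.
        label = f"U+{ord(sep):04X} {unicodedata.name(sep)}"
        print(f"{label}: {percent_with_separator(50, sep)}")
```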

And it's not like we have an authority we can ask.

Since nobody @#%&*! owns en-US....


Michael S. Kaplan on 14 Jun 2011 8:26 AM:

At least the blog censors/filters didn't catch that @#%&*! I used!

Joshua on 14 Jun 2011 8:31 AM:

Back-compatibility here: en-US has to fit in 7bit ASCII so hair space cannot be used.

Michael S. Kaplan on 14 Jun 2011 8:34 AM:

The percent formatting was not introduced until Windows 7. :-)

But I doubt people want to add these constants, so....

Richard on 14 Jun 2011 8:58 AM:

But since England obviously owns "en" - since we're the people who invented the @#%&*! language! - that must mean we own any unclaimed "en-*" cultures, right?! :o)

Skip on 14 Jun 2011 9:40 AM:

Well FWIW, I don't think I've ever once seen the n% with a space included, so on my last .net project I ended up having to "fix" that everywhere I displayed percentages.   I always just assumed it was a bug in the .net layer, I guess it actually inherits it from lower down though.

Michael S. Kaplan on 14 Jun 2011 12:28 PM:

Well, it's only inherited in .Net 4.0 running on Win7....

Henry on 15 Jun 2011 2:48 AM:

Aren't they breaking? I'd like to have some non-breaking space U+00A0, U+202F, or (why not?) U+FEFF (U+2060) within numbers and their units.

Jan Kučera on 17 Jun 2011 2:35 PM:

Speaking of the Czech language, both 50 % and 50% are correct here. However, the meaning is different: the former is a standard number, while the latter is an adjective. Their usage is therefore different and they cannot be exchanged. I wonder how this fits in a global per-locale setting - it would probably make sense if it could be used as a number formatting parameter...

