Confusion about HttpRequest.UserLanguages

by Michael S. Kaplan, published on 2005/09/13 00:01 -07:00, original URI: http://blogs.msdn.com/michkap/archive/2005/09/13/464393.aspx


Shaun mentioned to me:

I just noticed that HttpRequest.UserLanguages says that it returns the user languages in preferential order but does not specify the format in which it returns the languages.

A novice programmer might expect it to return English, Spanish, etc., but this is wrong. It returns the culture abbreviation, like the one used with CultureInfo (en-US, etc.). It would be really nice if the documentation specified this. Surprisingly, this hasn't changed even in the 2.0 release.

Looking at the HttpRequest.UserLanguages topic, I see what you mean -- that is not mentioned. But I do not know if people would really be expecting to get language names, because that is not too terribly useful. If you pass the returned name to create a CultureInfo, though, you can pick up the EnglishName or the NativeName or any of the other properties, including ones that will provide actual functionality for parsing or formatting....

But I will make a suggestion to the doc. folks, perhaps they could mention some of these ideas....


# AndrewSeven on Tuesday, September 13, 2005 8:53 AM:

In my experience it is not a reliable setting: users can enter arbitrary values, the setting they have in IE may not really be what they would like on the site, you might not support the language they have chosen, and you can't set it if they change languages.

That being said, it can be used to select the initial language to render the "language selection" page in. :)
-A

# Jeff Parker on Tuesday, September 13, 2005 11:23 AM:

I have to agree with Andrew, this is typically not a reliable setting. I do a lot of multicultural apps, web apps are more my specialty, and in all my years, even back in classic ASP, this was never a reliable setting. You get 2-character culture codes, 3-character culture codes, and 4-character culture codes, and then sometimes multiple combinations of each. There is no real reliable way of doing this. For example, I collect this as data to analyze later, and this is an actual language string that came across today: en-us,de;q=0.7,nl-be;q=0.3. So now I have English US, which I support; something in German (they aren’t picky about which, but I also support German); an ISO character set, which I do not analyze, so it really does me no good as a culture; then Dutch Belgium, which I also support; and then another ISO character set.

What I do is present them with the language that is listed first, provided it is supported by .NET (I do get some culture codes not supported by 1.1 on occasion) and provided I support that culture myself. I then write that culture to a cookie and also store it in a session variable, and I let them change the cookie at any time through the web interface. From then on I read from the cookie. If the cookie does not exist, I go through the process again of taking whatever is first in line. If for some reason there is no culture, which happens as well, or none of the cultures supplied is one I can support, then they get English US by default and can pick another from the list. The point is that you can never rely on the culture string alone; you need to give the user a choice on your website.
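
The flow Jeff describes could be sketched roughly as follows. This is only an illustration in Python rather than ASP.NET: `choose_culture`, `SUPPORTED`, and `DEFAULT` are hypothetical names, the cookie value is passed in as a plain string, and the supported set is made up. It assumes each browser entry may carry a trailing `;q=...` parameter, which is simply stripped here.

```python
# Hypothetical set of cultures the site supports (lowercased for comparison).
SUPPORTED = {"en-us", "de", "nl-be"}
# Fallback when nothing usable is supplied: English (US), as in the comment above.
DEFAULT = "en-us"

def choose_culture(cookie_value, user_languages, supported=SUPPORTED, default=DEFAULT):
    """Pick a culture: stored cookie first, then browser languages, then the default."""
    # 1. A choice the user made earlier (stored in the cookie) always wins.
    if cookie_value and cookie_value.lower() in supported:
        return cookie_value.lower()
    # 2. Otherwise take the first listed browser language that the site supports.
    for entry in user_languages or []:
        tag = entry.split(";", 1)[0].strip().lower()  # drop any ";q=..." parameter
        if tag in supported:
            return tag
    # 3. No cookie and nothing supported in the header: fall back to the default.
    return default

print(choose_culture(None, ["en-us", "de;q=0.7", "nl-be;q=0.3"]))
```

Whatever this returns would then be written back to the cookie, and the site would still offer a language picker so the user can override the guess.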

# Michael S. Kaplan on Tuesday, September 13, 2005 11:28 AM:

Well, it may not be a perfect setting. In fact, it is obviously not one. But it is a start, and considering what a pain it is to parse the HTTP accept language for the small gain it can give, having something parse it for you makes it that much easier, right?

Remember, it is either based on their settings, or on what they explicitly change it to. And neither is bad as a first guess....

# Mihai on Tuesday, September 13, 2005 12:30 PM:

As a programmer I would not expect something human-readable. Because it is about the Internet, I would expect it to be covered by an RFC.
And it is: RFC 2616 (http://www.faqs.org/rfcs/rfc2616.html)
("Hypertext Transfer Protocol -- HTTP/1.1")
Good read:
3.4 Character Sets
3.10 Language Tags
14.2 Accept-Charset
14.3 Accept-Encoding
14.4 Accept-Language

Jeff--14.4 in RFC 2616 also explains the q=... thing (which is about "quality", not about ISO charsets). It also gives some hints on what you should return.
In general, the first match should be ok.
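
To make the q=... part concrete, here is a minimal sketch of parsing an Accept-Language value into (tag, quality) pairs. It is written in Python purely for illustration; `parse_accept_language` is a hypothetical helper, not part of any framework, and treating an unparseable q as 0 is an assumption of this sketch rather than something the RFC prescribes.

```python
def parse_accept_language(header):
    """Return (language_tag, quality) pairs, highest quality first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # RFC 2616 section 14.4: a missing q means q=1
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # assumption: treat a malformed q as "not acceptable"
        entries.append((tag.strip(), quality))
    # sorted() is stable, so tags with equal quality keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)

print(parse_accept_language("en-us,de;q=0.7,nl-be;q=0.3"))
# prints [('en-us', 1.0), ('de', 0.7), ('nl-be', 0.3)]
```

Read this way, Jeff's string lists three languages in descending order of preference: US English, then German, then Belgian Dutch.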

# Jeff Parker on Tuesday, September 13, 2005 12:32 PM:

True, it definitely isn't a perfect setting, but it does give us something to go on. One thing I would prefer seeing in the documentation, though, is a set of best practices for dealing with that string. I think that would be more helpful. There are a lot of various opinions scattered about, and I have my own as well. Maybe IE 7 will help with this as it sends the string.

Up to about the year 2000, pretty much everything I programmed was English US. After 2000 rolled around, international support seemed to become basically a must. I can't tell you exactly where, when, or why, but every single thing I had learned about multicultural apps in my own experimentation and playing was finally going to be used, and now I cannot build an app without it. The culture support put into the .NET Framework and the release of this book http://www.microsoft.com/MSPress/books/5717.asp have been a great help and advancement in doing so, but things like the browser still cause problems that force us web developers to use some good old-fashioned tricks, like me storing a cookie, or http://www.wikipedia.org/ just having you choose when you go in.

# Jeff Parker on Tuesday, September 13, 2005 4:50 PM:

Thanks Mihai,

That helped explain a lot of it. All this time I had thought it was a character setting, not a quality setting. I mean, I never thought about the quality of the language. I knew the quality preferences on audio and so on, but never thought it was a quality on the language.

# Mihai on Friday, September 16, 2005 2:12 AM:

Just in case someone stumbles on this at some point:

I say in my post "In general, the first match should be ok." I should add (to make it clear) that this happens for locales with the same quality (if missing, then it is 1, max).

So, if I have fr;q=.2,de;q=.7 one should return German, because it has higher priority.

Also, if one asks for fr-CA, one might try to offer fr-CA, then fr, then any fr-?? (fr-FR for instance), before selecting the next locale.
The RFC is more detailed, go there for the clarity I do not have :-)
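
A rough sketch of the matching order described above, in Python for illustration only. `pick_locale` is a hypothetical helper; it takes the preferences as already-parsed (tag, quality) pairs and the site's locales as a plain list. The fallback order (exact tag, then bare language, then any regional variant) follows this comment, not a rule quoted from the RFC.

```python
def pick_locale(preferences, supported):
    """preferences: list of (tag, quality); supported: list of tags the site offers."""
    # Compare case-insensitively, but return the site's own spelling of the tag.
    by_lower = {tag.lower(): tag for tag in supported}
    # Highest quality first; the stable sort keeps header order among equal qualities.
    ranked = sorted(preferences, key=lambda pref: pref[1], reverse=True)
    for tag, quality in ranked:
        if quality <= 0:
            continue  # q=0 marks a language as not acceptable
        wanted = tag.lower()
        primary = wanted.split("-", 1)[0]
        # 1. exact match (fr-CA), 2. bare language (fr)
        for candidate in (wanted, primary):
            if candidate in by_lower:
                return by_lower[candidate]
        # 3. any regional variant of the same language (fr-FR, fr-BE, ...)
        for lowered, original in by_lower.items():
            if lowered.startswith(primary + "-"):
                return original
    return None  # nothing matched; the caller falls back to its default

print(pick_locale([("fr", 0.2), ("de", 0.7)], ["fr-FR", "de-DE"]))  # prints de-DE
print(pick_locale([("fr-CA", 1.0)], ["en-US", "fr-FR"]))            # prints fr-FR
```

Only after all three attempts fail for one requested locale does the loop move on to the next one in the list.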
