●●●●●●●●●●●●●● isn't complex, no matter what the underlying language is

by Michael S. Kaplan, published on 2011/06/28 07:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2011/06/28/10180787.aspx

This blog you are reading was originally a very different one, based on people with questions about several forum posts like this one, this one, this one, and this other one. It was gutted and re-written after a large thread that took place yesterday ended up seeming a lot more on topic....

The question was clearly written, so I will give credit for that, at least:

Hi,

We have a UI page that takes a password input. To increase the entropy we would like to ensure that the password has characteristics such as

a) Greater than x characters (currently 8)
b) Has at least one upper case
c) Has at least one number

We can use the GetStringType* APIs but I am not sure if for languages such as Japanese etc. if the API will help us do what we would like to, for example not sure if all languages have upper cases, numbers etc.

What will you suggest to enforce a high entropy password that will work globally?

Thanks!

My first thought went back to when a similar question came up a little over three years ago (as previously described in You want to know what's weak? Strong password rules, that's what's weak!).

But pretty self-aware, if you catch my drift....

It is quite easy to look at such "criteria" as being rather biased toward languages with case in them, and the question about Japanese makes things all the more complicated, truth be told. As one colleague responded in part:

As you look at passwords in a global perspective, it seems like there are several things that could be considered besides case. For instance, suppose you expect passwords to be comprised of characters that are somehow related—e.g. they can all be generated by the same keyboard layout (seems like a good assumption), then it seems pretty significant that an Amharic input method can generate one order of magnitude more possible values for each information element than can an English input method, and a CJK input method can generate over two orders of magnitude more possible values. In other words, even though those writing systems do not have case, the set of likely keys will be 10^2^n and 10^3^n (for password length n) greater than for passwords generated using an English input method. In other words, while you don’t have case adding to the entropy, you have different factors that might potentially add much more robustness than case can provide.

Of course, there are other factors for something like CJK: since each information element can be interpreted as a word, it seems like there’d be a certain user propensity to create passwords that are mnemonic phrases—and that will reduce the entropy.
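
To put rough numbers on the colleague's point, here is a minimal Python sketch. The repertoire sizes are assumptions picked only to mirror "one order" and "two orders of magnitude"; they are not measured values for any real input method, and the function names are made up for illustration. The result is an upper bound that only holds if every character is chosen uniformly at random, which is exactly what the mnemonic-phrase caveat says people will not do.

```python
import math

# Assumed repertoire sizes, for illustration only (not measured values).
REPERTOIRE = {
    "English keyboard": 95,       # printable ASCII
    "Amharic input method": 950,  # assumed: roughly 10x larger
    "CJK input method": 9500,     # assumed: roughly 100x larger
}


def max_entropy_bits(repertoire_size, length):
    """Upper bound on entropy (in bits) of a uniformly random password."""
    return length * math.log2(repertoire_size)


def keyspace_ratio(size_a, size_b, length):
    """How many times larger the keyspace of repertoire A is than B."""
    return (size_a / size_b) ** length


def report(length=8):
    """Return one summary line per assumed repertoire."""
    return [
        f"{name}: at most {max_entropy_bits(size, length):.1f} bits"
        for name, size in REPERTOIRE.items()
    ]
```

Under these assumptions, a tenfold larger repertoire multiplies the keyspace by 10^n for a password of length n, and a hundredfold larger one by 100^n.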

Okay, this might be overboard, though.

Because one of the most interesting facts about password fields (once you ignore the fact that no one should ever be asking for passwords except through CredUI these days!) is very relevant to Amharic. And Japanese. And CJK.

Password fields turn off the IME.

I feel I should repeat that a little louder, as it seems vaguely important!

Password fields turn off the IME.

Of course this specific fact that undercuts almost every single example given doesn't attack the basic argument's soundness.

By opening up the repertoire and suggesting more reasonable additional potential ways to add the required level of complexity, one can remove the more provincial aspects of this whole policy that to me represents the most famous of all the dumb interview questions: writing a RegEx expression to validate the password string....

For example -- one can suggest multiple scripts, like with Japanese Katakana and Hiragana.

Or one can suggest both full width and half width letters, like with Korean Jamo.

Note that when they use the English keyboard for other languages (like English to get the Pinyin for Chinese), case is often not applicable there either. So requiring case often makes minimal sense, culturally.
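
As a rough illustration of why an uppercase rule says nothing useful about kana, and of what "multiple scripts" or "mixed widths" could look like as signals, here is a small Python sketch. Everything in it is an assumption for illustration: the function names are invented, and guessing the script from the first word of the Unicode character name is a crude stand-in for real script data, not how any Windows API works.

```python
import unicodedata


def has_uppercase(password):
    """The classic rule: at least one character that is uppercase."""
    return any(ch.isupper() for ch in password)


def scripts_used(password):
    """Crude script guess: first word of each character's Unicode name,
    after dropping a leading HALFWIDTH/FULLWIDTH qualifier."""
    scripts = set()
    for ch in password:
        words = unicodedata.name(ch, "").split()
        if words and words[0] in ("HALFWIDTH", "FULLWIDTH"):
            words = words[1:]
        if words:
            scripts.add(words[0])
    return scripts


def widths_used(password):
    """East Asian Width classes present (e.g. 'W' wide, 'H' halfwidth,
    'F' fullwidth, 'Na' narrow)."""
    return {unicodedata.east_asian_width(ch) for ch in password}
```

A password that mixes Hiragana and Katakana fails `has_uppercase` but reports two scripts, and one that mixes wide and halfwidth Katakana reports two width classes, which is the kind of variety the case rule never sees.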

And before you suggest that I have knocked Japanese and Korean and Chinese out of the running, I'll point out one of the most important misunderstood features for Japanese and Korean (and even Chinese, though Chinese does it differently) -- the additional non-IME keyboard that is kind of "installed" and gets used when the IME itself is disabled.

You can't put random Kanji/Hanja/Han in there, but you can certainly get the underlying Kana/Jamo/English that you would be typing.

It gets even more interesting as you consider passwords in Indic languages/scripts and typing in strings of characters that can be a lot closer to meaningless given how some strings simply will never look right if you were to look at them rendered.

Though as luck would have it, ●●●●●●●●●●●●●● isn't complex -- no matter what the underlying language is.

And I'll add one basic clarifying assumption: people will generally type a password using a single keyboard -- so you have to decide the meaning of complex as it applies to a given keyboard.

Or you can keep the current lame rules that tend to assume it's English or English-like.

In the long run, I think we need to mix it up a bit: we need to add some additional entries to the recommended Password Complexity Requirements or even better the more recently updated Passwords must meet complexity requirements:

Passwords must contain characters from three of the following five categories:

Uppercase characters of European languages (A through Z, with diacritic marks, Greek and Cyrillic characters)
Lowercase characters of European languages (a through z, sharp-s, with diacritic marks, Greek and Cyrillic characters)
Base 10 digits (0 through 9)
Nonalphanumeric characters: ~!#$%^&*_-+=`|\(){}[]:;"'<>,.?/ @
Any Unicode character that is categorized as an alphabetic character but is not uppercase or lowercase. This includes Unicode characters from Asian languages.

Now look, the article makes it clear that they are using GetStringTypeW to make their decisions, which means that a whole lot of Unicode is available here -- and the current restrictions seem tailored to support some Japanese and Korean just as well as "Western" cased scripts. So we are already part way there....
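
For a feel of how the "three of five categories" rule plays out, here is a minimal Python sketch. It is an approximation under stated assumptions, not what Windows does: it uses Python's unicodedata general categories in place of GetStringTypeW, treats any Lu/Ll character as upper/lower rather than only "European" ones, and takes Lo and Lm as the "alphabetic but not uppercase or lowercase" bucket. The function names are invented for illustration; the special-character set is copied from the list quoted above.

```python
import unicodedata

# Nonalphanumeric characters as quoted in the policy text above.
SPECIALS = set("~!#$%^&*_-+=`|\\(){}[]:;\"'<>,.?/ @")


def categories_present(password):
    """Return which of the five policy categories appear in the password."""
    found = set()
    for ch in password:
        cat = unicodedata.category(ch)
        if cat == "Lu":
            found.add("upper")        # assumption: any uppercase letter
        elif cat == "Ll":
            found.add("lower")        # assumption: any lowercase letter
        elif ch in "0123456789":
            found.add("digit")        # base 10 digits 0 through 9
        elif ch in SPECIALS:
            found.add("special")
        elif cat in ("Lo", "Lm"):
            found.add("other_alpha")  # caseless letters, e.g. kana, Han
    return found


def meets_policy(password, required=3):
    """True if the password draws on at least `required` categories."""
    return len(categories_present(password)) >= required
```

In this sketch a kana-only password lands in a single category, while the same kana plus a digit and a punctuation mark reaches three, the same count "Passw0rd" gets from upper, lower and digit.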

So now let's just take it the rest of the way and open up this list to even more possibilities!

Raymond Chen - MSFT on 28 Jun 2011 7:49 AM:

Be careful what you wish for: blogs.msdn.com/10171014.aspx

Imagine being a telephone support person and doing an Amharic+CJK password reset over the phone.

Jeffrey L. Whitledge on 28 Jun 2011 10:11 AM:

As soon as they said, "To increase the entropy we would like to [reduce the total space of allowable passwords]" my brain exploded.

Joshua on 28 Jun 2011 11:23 AM:

I wonder who else noticed that article hasn't been scored yet.

Michael S. Kaplan on 28 Jun 2011 12:08 PM:

Scored?

Joshua on 29 Jun 2011 9:41 AM:

> Scoring will be as follows:

But since you forgot about it.

Michael S. Kaplan on 29 Jun 2011 11:48 AM:

Oh that. I gave up when I gave up on the whole scoring thing, but one side was winning and they were on a roll at that point anyway....
