In SQL Server, your ranges also need to ACCENT-uate the positives!

by Michael S. Kaplan, published on 2007/12/08 10:16 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/12/08/6669846.aspx

You might have started to sense a pattern developing here with the last few posts in this series:

In SQL Server, the distance between A and Z is wider than you might think
In SQL Server, A-Z, A-z, a-Z, and a-z may not mean the same thing
In SQL Server, the wrong range can make you seem insensitive to one's width! (aka Do my V's look fat?)

That last post pointed out two characters that would be missing if you have a width sensitive collation.

And now this post is going to talk about accent sensitivity and its impact.

This will add over two dozen characters like the following ones:

ẑ U+1e91 LATIN SMALL LETTER Z WITH CIRCUMFLEX
Ż U+017b LATIN CAPITAL LETTER Z WITH DOT ABOVE
ź U+017a LATIN SMALL LETTER Z WITH ACUTE
ᴢ U+1d22 LATIN LETTER SMALL CAPITAL Z
Ȥ U+0224 LATIN CAPITAL LETTER Z WITH HOOK
ɀ U+0240 LATIN SMALL LETTER Z WITH SWASH TAIL

and more! All of them have some sort of "Z-ish" quality about them, and none of them will be handled by the ranges defined thus far according to the documentation for SQL Server if your application is run in the context of an accent-sensitive collation.
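If you want to poke at these characters yourself, here is a minimal Python sketch. To be clear about what it is and is not: it does not reproduce SQL Server's collation logic, and the helper names (`in_ascii_letter_range`, `base_letters`, `describe`) are made up for this illustration. It only uses the Unicode character database to show that none of the characters quoted in this post sit inside the plain A-Z or a-z code point ranges, and that a compatibility decomposition finds a base Z for some of them but not for others.

```python
import unicodedata

# Code points quoted in this post (the list is illustrative, not exhaustive).
Z_LIKE = [0x1E91, 0x017B, 0x017A, 0x1D22, 0x0224, 0x0240, 0x24B5]


def in_ascii_letter_range(ch: str) -> bool:
    """True if ch lies in the plain code point ranges A-Z or a-z."""
    return "A" <= ch <= "Z" or "a" <= ch <= "z"


def base_letters(ch: str) -> str:
    """ASCII letters that remain after compatibility decomposition (NFKD).

    Characters with no decomposition mapping (for example U+1D22) yield "".
    """
    decomposed = unicodedata.normalize("NFKD", ch)
    return "".join(c for c in decomposed if in_ascii_letter_range(c))


def describe(code_point: int) -> tuple:
    """Return (U+XXXX label, Unicode name, in A-Z/a-z?, base letters)."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        unicodedata.name(ch),
        in_ascii_letter_range(ch),
        base_letters(ch),
    )


if __name__ == "__main__":
    for cp in Z_LIKE:
        print(describe(cp))
```

The takeaway from the sketch is modest: a range check based on code points misses every one of these, and decomposition alone is not enough to catch them all either, which is why how they are ordered relative to Z is a matter for the collation rather than something to guess at from the code points.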

Oops.

You get the point - this odd combination of characters actually used in languages, phonetic symbols, and circled/parenthesized letters has a fundamental identity that users would expect to sort somewhere not far from the Z by default.

SQL Server does that well -- how about your application with your LIKE range? :-)

And just in case you thought I was done, it gets worse tomorrow....

This post brought to you by ⒵ (U+24b5, aka PARENTHESIZED LATIN SMALL LETTER Z, the last Z-like thing in the Latin script....)



referenced by

2007/12/11 In SQL Server, there is the rest of Unicode (aka the SiaO Incompleteness Theorem)

2007/12/10 In SQL Server, different collations implies different ranges (aka Not every table has its THORN)

2007/12/09 In SQL Server, the alphabet does not end at Z!
