by Michael S. Kaplan, published on 2008/11/25 10:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/11/25/9140152.aspx
So it all started in a conversation with some of the folks from the SQL Server team when I was at PASS.
What did they expect? Send 600 Microsoft employees to an event and you are bound to run into some of them just in the act of walking around!
They were telling me about an interesting bug in a couple collations in SQL Server 2008 -- bugs that only impact non-Unicode columns for those couple of collations.
Apparently they got a couple code page values wrong.
This was "Azeri Serious Bug".
Once I was back in front of my computer, I ran the following query:
SELECT
name,
COLLATIONPROPERTY(name, 'CodePage') As CodePage,
CONVERT(binary(4), COLLATIONPROPERTY(name, 'LCID')) As LCID,
CONVERT(binary(4), COLLATIONPROPERTY(name, 'ComparisonStyle')) As ComparisonStyle,
description
FROM
::fn_helpcollations()
WHERE
name LIKE '%Azer%' OR name LIKE '%Turki%'
in SQL Server 2005:
Azeri_Cyrillic_90_BIN 1251 0x0000082C 0x00000000 Azeri-Cyrillic-90, binary sort
Azeri_Cyrillic_90_BIN2 1251 0x0000082C 0x00000000 Azeri-Cyrillic-90, binary code point comparison sort
Azeri_Cyrillic_90_CI_AI 1251 0x0000082C 0x00030003 Azeri-Cyrillic-90, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_90_CI_AI_WS 1251 0x0000082C 0x00010003 Azeri-Cyrillic-90, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_90_CI_AI_KS 1251 0x0000082C 0x00020003 Azeri-Cyrillic-90, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_90_CI_AI_KS_WS 1251 0x0000082C 0x00000003 Azeri-Cyrillic-90, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Cyrillic_90_CI_AS 1251 0x0000082C 0x00030001 Azeri-Cyrillic-90, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_90_CI_AS_WS 1251 0x0000082C 0x00010001 Azeri-Cyrillic-90, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_90_CI_AS_KS 1251 0x0000082C 0x00020001 Azeri-Cyrillic-90, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_90_CI_AS_KS_WS 1251 0x0000082C 0x00000001 Azeri-Cyrillic-90, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Azeri_Cyrillic_90_CS_AI 1251 0x0000082C 0x00030002 Azeri-Cyrillic-90, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_90_CS_AI_WS 1251 0x0000082C 0x00010002 Azeri-Cyrillic-90, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_90_CS_AI_KS 1251 0x0000082C 0x00020002 Azeri-Cyrillic-90, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_90_CS_AI_KS_WS 1251 0x0000082C 0x00000002 Azeri-Cyrillic-90, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Cyrillic_90_CS_AS 1251 0x0000082C 0x00030000 Azeri-Cyrillic-90, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_90_CS_AS_WS 1251 0x0000082C 0x00010000 Azeri-Cyrillic-90, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_90_CS_AS_KS 1251 0x0000082C 0x00020000 Azeri-Cyrillic-90, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_90_CS_AS_KS_WS 1251 0x0000082C 0x00000000 Azeri-Cyrillic-90, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_90_BIN 1254 0x0000042C 0x00000000 Azeri-Latin-90, binary sort
Azeri_Latin_90_BIN2 1254 0x0000042C 0x00000000 Azeri-Latin-90, binary code point comparison sort
Azeri_Latin_90_CI_AI 1254 0x0000042C 0x00030003 Azeri-Latin-90, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_90_CI_AI_WS 1254 0x0000042C 0x00010003 Azeri-Latin-90, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_90_CI_AI_KS 1254 0x0000042C 0x00020003 Azeri-Latin-90, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_90_CI_AI_KS_WS 1254 0x0000042C 0x00000003 Azeri-Latin-90, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_90_CI_AS 1254 0x0000042C 0x00030001 Azeri-Latin-90, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_90_CI_AS_WS 1254 0x0000042C 0x00010001 Azeri-Latin-90, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_90_CI_AS_KS 1254 0x0000042C 0x00020001 Azeri-Latin-90, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_90_CI_AS_KS_WS 1254 0x0000042C 0x00000001 Azeri-Latin-90, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_90_CS_AI 1254 0x0000042C 0x00030002 Azeri-Latin-90, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_90_CS_AI_WS 1254 0x0000042C 0x00010002 Azeri-Latin-90, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_90_CS_AI_KS 1254 0x0000042C 0x00020002 Azeri-Latin-90, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_90_CS_AI_KS_WS 1254 0x0000042C 0x00000002 Azeri-Latin-90, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_90_CS_AS 1254 0x0000042C 0x00030000 Azeri-Latin-90, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_90_CS_AS_WS 1254 0x0000042C 0x00010000 Azeri-Latin-90, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_90_CS_AS_KS 1254 0x0000042C 0x00020000 Azeri-Latin-90, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_90_CS_AS_KS_WS 1254 0x0000042C 0x00000000 Azeri-Latin-90, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Turkish_BIN 1254 0x0000041F 0x00000000 Turkish, binary sort
Turkish_BIN2 1254 0x0000041F 0x00000000 Turkish, binary code point comparison sort
Turkish_CI_AI 1254 0x0000041F 0x00030003 Turkish, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Turkish_CI_AI_WS 1254 0x0000041F 0x00010003 Turkish, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Turkish_CI_AI_KS 1254 0x0000041F 0x00020003 Turkish, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Turkish_CI_AI_KS_WS 1254 0x0000041F 0x00000003 Turkish, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Turkish_CI_AS 1254 0x0000041F 0x00030001 Turkish, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Turkish_CI_AS_WS 1254 0x0000041F 0x00010001 Turkish, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Turkish_CI_AS_KS 1254 0x0000041F 0x00020001 Turkish, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Turkish_CI_AS_KS_WS 1254 0x0000041F 0x00000001 Turkish, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Turkish_CS_AI 1254 0x0000041F 0x00030002 Turkish, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Turkish_CS_AI_WS 1254 0x0000041F 0x00010002 Turkish, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Turkish_CS_AI_KS 1254 0x0000041F 0x00020002 Turkish, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Turkish_CS_AI_KS_WS 1254 0x0000041F 0x00000002 Turkish, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Turkish_CS_AS 1254 0x0000041F 0x00030000 Turkish, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Turkish_CS_AS_WS 1254 0x0000041F 0x00010000 Turkish, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Turkish_CS_AS_KS 1254 0x0000041F 0x00020000 Turkish, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Turkish_CS_AS_KS_WS 1254 0x0000041F 0x00000000 Turkish, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
and in SQL Server 2008:
Azeri_Cyrillic_100_BIN 1251 0x0000082C 0x00000000 Azeri-Cyrillic-100, binary sort
Azeri_Cyrillic_100_BIN2 1251 0x0000082C 0x00000000 Azeri-Cyrillic-100, binary code point comparison sort
Azeri_Cyrillic_100_CI_AI 1251 0x0000082C 0x00030003 Azeri-Cyrillic-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_100_CI_AI_WS 1251 0x0000082C 0x00010003 Azeri-Cyrillic-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_100_CI_AI_KS 1251 0x0000082C 0x00020003 Azeri-Cyrillic-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_100_CI_AI_KS_WS 1251 0x0000082C 0x00000003 Azeri-Cyrillic-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Cyrillic_100_CI_AS 1251 0x0000082C 0x00030001 Azeri-Cyrillic-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_100_CI_AS_WS 1251 0x0000082C 0x00010001 Azeri-Cyrillic-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_100_CI_AS_KS 1251 0x0000082C 0x00020001 Azeri-Cyrillic-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_100_CI_AS_KS_WS 1251 0x0000082C 0x00000001 Azeri-Cyrillic-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Azeri_Cyrillic_100_CS_AI 1251 0x0000082C 0x00030002 Azeri-Cyrillic-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_100_CS_AI_WS 1251 0x0000082C 0x00010002 Azeri-Cyrillic-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_100_CS_AI_KS 1251 0x0000082C 0x00020002 Azeri-Cyrillic-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_100_CS_AI_KS_WS 1251 0x0000082C 0x00000002 Azeri-Cyrillic-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Cyrillic_100_CS_AS 1251 0x0000082C 0x00030000 Azeri-Cyrillic-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Cyrillic_100_CS_AS_WS 1251 0x0000082C 0x00010000 Azeri-Cyrillic-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Cyrillic_100_CS_AS_KS 1251 0x0000082C 0x00020000 Azeri-Cyrillic-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Cyrillic_100_CS_AS_KS_WS 1251 0x0000082C 0x00000000 Azeri-Cyrillic-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_100_BIN 1254 0x0000042C 0x00000000 Azeri-Latin-100, binary sort
Azeri_Latin_100_BIN2 1254 0x0000042C 0x00000000 Azeri-Latin-100, binary code point comparison sort
Azeri_Latin_100_CI_AI 1254 0x0000042C 0x00030003 Azeri-Latin-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_100_CI_AI_WS 1254 0x0000042C 0x00010003 Azeri-Latin-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_100_CI_AI_KS 1254 0x0000042C 0x00020003 Azeri-Latin-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_100_CI_AI_KS_WS 1254 0x0000042C 0x00000003 Azeri-Latin-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_100_CI_AS 1254 0x0000042C 0x00030001 Azeri-Latin-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_100_CI_AS_WS 1254 0x0000042C 0x00010001 Azeri-Latin-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_100_CI_AS_KS 1254 0x0000042C 0x00020001 Azeri-Latin-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_100_CI_AS_KS_WS 1254 0x0000042C 0x00000001 Azeri-Latin-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_100_CS_AI 1254 0x0000042C 0x00030002 Azeri-Latin-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_100_CS_AI_WS 1254 0x0000042C 0x00010002 Azeri-Latin-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_100_CS_AI_KS 1254 0x0000042C 0x00020002 Azeri-Latin-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_100_CS_AI_KS_WS 1254 0x0000042C 0x00000002 Azeri-Latin-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Azeri_Latin_100_CS_AS 1254 0x0000042C 0x00030000 Azeri-Latin-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Azeri_Latin_100_CS_AS_WS 1254 0x0000042C 0x00010000 Azeri-Latin-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Azeri_Latin_100_CS_AS_KS 1254 0x0000042C 0x00020000 Azeri-Latin-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Azeri_Latin_100_CS_AS_KS_WS 1254 0x0000042C 0x00000000 Azeri-Latin-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Turkish_BIN 1254 0x0000041F 0x00000000 Turkish, binary sort
Turkish_BIN2 1254 0x0000041F 0x00000000 Turkish, binary code point comparison sort
Turkish_CI_AI 1254 0x0000041F 0x00030003 Turkish, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Turkish_CI_AI_WS 1254 0x0000041F 0x00010003 Turkish, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Turkish_CI_AI_KS 1254 0x0000041F 0x00020003 Turkish, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Turkish_CI_AI_KS_WS 1254 0x0000041F 0x00000003 Turkish, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Turkish_CI_AS 1254 0x0000041F 0x00030001 Turkish, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Turkish_CI_AS_WS 1254 0x0000041F 0x00010001 Turkish, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Turkish_CI_AS_KS 1254 0x0000041F 0x00020001 Turkish, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Turkish_CI_AS_KS_WS 1254 0x0000041F 0x00000001 Turkish, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Turkish_CS_AI 1254 0x0000041F 0x00030002 Turkish, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Turkish_CS_AI_WS 1254 0x0000041F 0x00010002 Turkish, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Turkish_CS_AI_KS 1254 0x0000041F 0x00020002 Turkish, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Turkish_CS_AI_KS_WS 1254 0x0000041F 0x00000002 Turkish, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Turkish_CS_AS 1254 0x0000041F 0x00030000 Turkish, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Turkish_CS_AS_WS 1254 0x0000041F 0x00010000 Turkish, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Turkish_CS_AS_KS 1254 0x0000041F 0x00020000 Turkish, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Turkish_CS_AS_KS_WS 1254 0x0000041F 0x00000000 Turkish, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Turkish_100_BIN 1254 0x0000041F 0x00000000 Turkish-100, binary sort
Turkish_100_BIN2 1254 0x0000041F 0x00000000 Turkish-100, binary code point comparison sort
Turkish_100_CI_AI 1254 0x0000041F 0x00030003 Turkish-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Turkish_100_CI_AI_WS 1254 0x0000041F 0x00010003 Turkish-100, case-insensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Turkish_100_CI_AI_KS 1254 0x0000041F 0x00020003 Turkish-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Turkish_100_CI_AI_KS_WS 1254 0x0000041F 0x00000003 Turkish-100, case-insensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Turkish_100_CI_AS 1254 0x0000041F 0x00030001 Turkish-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Turkish_100_CI_AS_WS 1254 0x0000041F 0x00010001 Turkish-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Turkish_100_CI_AS_KS 1254 0x0000041F 0x00020001 Turkish-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Turkish_100_CI_AS_KS_WS 1254 0x0000041F 0x00000001 Turkish-100, case-insensitive, accent-sensitive, kanatype-sensitive, width-sensitive
Turkish_100_CS_AI 1254 0x0000041F 0x00030002 Turkish-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-insensitive
Turkish_100_CS_AI_WS 1254 0x0000041F 0x00010002 Turkish-100, case-sensitive, accent-insensitive, kanatype-insensitive, width-sensitive
Turkish_100_CS_AI_KS 1254 0x0000041F 0x00020002 Turkish-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-insensitive
Turkish_100_CS_AI_KS_WS 1254 0x0000041F 0x00000002 Turkish-100, case-sensitive, accent-insensitive, kanatype-sensitive, width-sensitive
Turkish_100_CS_AS 1254 0x0000041F 0x00030000 Turkish-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive
Turkish_100_CS_AS_WS 1254 0x0000041F 0x00010000 Turkish-100, case-sensitive, accent-sensitive, kanatype-insensitive, width-sensitive
Turkish_100_CS_AS_KS 1254 0x0000041F 0x00020000 Turkish-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-insensitive
Turkish_100_CS_AS_KS_WS 1254 0x0000041F 0x00000000 Turkish-100, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive
but all of the code page values looked right!
Hmmm.
I went back and asked for more information, and then I got the clues I should have seen the first time:
And with this information I stumbled on a fact that I did not realize even though it was right in front of me.
The fn_helpcollations function wasn't returning deprecated collations!
It would have helped for me to look at the SQL Server 2008 version of the topic, which had the following note:
Deprecated collations that are no longer supported or installed in SQL Server 2008 cannot be queried with fn_helpcollations.
Aha!
though the older collation is actually still there; you just can't enumerate it. I was able to use it in T-SQL just fine for several operations.
And then I ran the following query:
SELECT
Name,
COLLATIONPROPERTY(Name, 'CodePage') As CodePage,
CONVERT(binary(4), COLLATIONPROPERTY(Name, 'LCID')) As LCID
FROM
(VALUES ('Azeri_Latin_90_BIN'),
('Azeri_Latin_100_BIN'),
('Azeri_Cyrillic_90_BIN'),
('Azeri_Cyrillic_100_BIN'),
('Turkish_BIN'),
('Turkish_100_BIN')
) AS Collations(Name);
which returned the following results (incorrect ones that do not match the SQL Server 2005 values in red):
Azeri_Latin_90_BIN 1251 0x0000042C
Azeri_Latin_100_BIN 1254 0x0000042C
Azeri_Cyrillic_90_BIN 1254 0x0000082C
Azeri_Cyrillic_100_BIN 1251 0x0000082C
Turkish_BIN 1254 0x0000041F
Turkish_100_BIN 1254 0x0000041F
And there's the bug they were talking about.
Well, that and the documentation bug that claims that these deprecated collations aren't installed. I think what they were trying to say was that they were not available in setup to choose as the default collation, and it is just bbeing described in a weird way -- since clearly the old ones are still round (and would have to be or no database using them could be upgraded properly!).
and if you use Unicode columns then this bug won't matter.
But if you use CHAR, VARCHAR, or TEXT columns with this collation, your results will be wrong -- very very wrong....
Let's look at the code pages themselves to see how wrong, and where. First code page 1254:
and then code page 1251:
and you can see that while the Latin Azeri will be missing some characters, the Cyrillic Azeri will be missing pretty much all the ones you need.
Okay, a serious bug -- one that has to be fixed.
The easy fix is to change to the newer Azeri collations, but this can be complicated by anything that tries to map the characters through Unicode first since the data they have on the mappings is wrong here!
Not to mention users who notice the problem and just use the wrong collation that works correctly -- fixing the bug could even break a few people....
Now you may (if you are a regular reader) remember 100% roundtrip ASCII? 100% roundtrip ANSI?, which talks about missing characters in the two code pages.
In this case, code page 1254 is missing seven bytes:
0x81 0x8D 0x8E 0x8F 0x90 0x9D 0x9E
which if they were on the Cyrillic code page would be:
- U+0403 -- Ѓ (aka CYRILLIC CAPITAL LETTER GJE)
- U+040c -- Ќ (aka CYRILLIC CAPITAL LETTER KJE)
- U+040b -- Ћ (aka CYRILLIC CAPITAL LETTER TSHE)
- U+040f -- Џ (aka CYRILLIC CAPITAL LETTER DZHE)
- U+0452 -- ђ (aka CYRILLIC SMALL LETTER DJE)
- U+045c -- ќ (aka CYRILLIC SMALL LETTER KJE)
- U+045b -- ћ (aka CYRILLIC SMALL LETTER TSHE)
While code page 1251 is just missing one byte:
0x98
which if they were in the Latin code page would be:
- U+02dc -- ˜ (aka SMALL TILDE)
Once again, it is indeed the Cyrillic code page that suffers the most, though in the case of improperly stored data it might have been the easiest to detect being on the wrong code page, if these letters were being used in Cyrillic Azeri (which according to pages like this one they don't really seem to be!).
In the end, this bug becomes just another incident where due to the fact that Unicode is what is being emphasized, bugs in non-Unicode code paths slip through.
This might be another good reson to support Unicode, everyone!
I'll probably talk more about deprecation here, another day. This was an interesting and not-wrong (in my opinion) design direction that deserves some real understanding and elucidation....
This blog brought to you by the eight above mentioned non-Azeri Unicode characters that have been pulled into some degree of Azeriness...
# Michael Ratanapintha on 25 Nov 2008 7:27 PM:
I guess I am being a bit stupid here, but I was wondering what the "::" operator in your T-SQL code does. Is it some kind of scope resolution operator, as the C++ analogy indicates? Or does it do some other function?
# Michael S. Kaplan on 25 Nov 2008 9:40 PM:
I don't think that makes you stupid. :-)
This is something I was in the first UDF sample and have just used ever since, I assume it is an analogous scope resolution operator. I just seem to do it out of habit....