by Michael S. Kaplan, published on 2006/06/02 23:31 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/06/02/615571.aspx
This function TERTIARY_WEIGHTS() has been in Microsoft SQL Server for a while.
Now you have perhaps read what I have had to say about tertiary weights previously in posts like this one or this one or this one (the latter is especially topical to some of what I will point out in this post -- note that lowercase is given less weight than uppercase!).
According to SQL Server's Books Online in SQLS 2005:
Returns a binary string of weights for each character in a non-Unicode string expression defined with an SQL tertiary collation.
Syntax
TERTIARY_WEIGHTS( non_Unicode_character_string_expression )
Arguments
non_Unicode_character_string_expression
Is a string expression of type char, varchar, or varchar(max) defined on a tertiary SQL collation. For a list of these collations, see Remarks.Return Types
TERTIARY_WEIGHTS returns varbinary when non_Unicode_character_string_expression is char or varchar, and returns varbinary(max) when non_Unicode_character_string_expression is varchar(max).Remarks
TERTIARY_WEIGHTS returns NULL when non_Unicode_character_string_expression is not defined with an SQL tertiary collation. The following table shows the SQL tertiary collations.Sort order ID SQL collation
33 SQL_Latin1_General_Pref_CP437_CI_AS
34 SQL_Latin1_General_CP437_CI_AI
43 SQL_Latin1_General_Pref_CP850_CI_AS
44 SQL_Latin1_General_CP850_CI_AI
49 SQL_1xCompat_CP850_CI_AS
53 SQL_Latin1_General_Pref_CP1_CI_AS
54 SQL_Latin1_General_CP1_CI_AI
56 SQL_AltDiction_Pref_CP850_CI_AS
57 SQL_AltDiction_CP850_CI_AI
58 SQL_Scandinavian_Pref_CP850_CI_AS
82 SQL_Latin1_General_CP1250_CI_AS
84 SQL_Czech_CP1250_CI_AS
86 SQL_Hungarian_CP1250_CI_AS
88 SQL_Polish_CP1250_CI_AS
90 SQL_Romanian_CP1250_CI_AS
92 SQL_Croatian_CP1250_CI_AS
94 SQL_Slovak_CP1250_CI_AS
96 SQL_Slovenian_CP1250_CI_AS
106 SQL_Latin1_General_CP1251_CI_AS
108 SQL_Ukrainian_CP1251_CI_AS
113 SQL_Latin1_General_CP1253_CS_AS
114 SQL_Latin1_General_CP1253_CI_AS
130 SQL_Latin1_General_CP1254_CI_AS
146 SQL_Latin1_General_CP1256_CI_AS
154 SQL_Latin1_General_CP1257_CI_AS
156 SQL_Estonian_CP1257_CI_AS
158 SQL_Latvian_CP1257_CI_AS
160 SQL_Lithuanian_CP1257_CI_AS
183 SQL_Danish_Pref_CP1_CI_AS
184 SQL_SwedishPhone_Pref_CP1_CI_AS
185 SQL_SwedishStd_Pref_CP1_CI_AS
186 SQL_Icelandic_Pref_CP1_CI_ASTERTIARY_WEIGHTS is intended for use in the definition of a computed column that is defined on the values of a char, varchar, or varchar(max) column. Defining an index on both the computed column and the char, varchar, or varchar(max) column can improve performance when the char, varchar, or varchar(max) column is specified in the ORDER BY clause of a query.
Hmmm.... this does not sound much like any of the tertiary stuff I was talking about. And did you notice how many CI (case insensitive) collations are listed there?
Let's try it out:
SELECT TERTIARY_WEIGHTS('ABCDEFGHIJKLMNOPQRSTUVWZYZ' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)
0x0101010101010101010101010101010101010101010101010101
Hmmm... well, it is not using Microsoft's weights (remember our aversion to embeded 0x01 bytes!) but if uppercase with an ignore case flag returns a low number, perhaps there is some common ground.
Let's try the lowercase letters:
SELECT TERTIARY_WEIGHTS('abcdefghijklmnopqrstuvwxyz' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)
0x0202020202020202020202020202020202020202020202020202
Hmmm... never mind. :-)
Clearly it is only capturing case differences, and since it will not work with either Unicode types or unusual letters like U+028f (ʏ, a.k.a. LATIN LETTER SMALL CAPITAL Y), all of the numbers will be 0x01 or 0x02. Let's make sure:
SELECT TERTIARY_WEIGHTS('AbCdEfGhIjKlMnOpQeStUvWxYz' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)
0x0102010201020102010201020102010201020102010201020102
Ok, that looks a bit like proof, right? :-)
So this is a function that will return information about case distinctions even in case insensitive sorts, to help with index performance of non-Unicode SQL compatibility collation columns. Though how it helps is quite unclear at this point.
And to wind down on a weird note, you can try the following query:
SELECT TERTIARY_WEIGHTS('W' COLLATE SQL_Scandinavian_Pref_CP850_CI_AS),
TERTIARY_WEIGHTS('w' COLLATE SQL_SwedishStd_Pref_CP1_CI_AS),
TERTIARY_WEIGHTS('W' COLLATE SQL_SwedishStd_Pref_CP1_CI_AS)
0x03 0x02 0x01
I am afraid to ask what it is about the letter W in Scandinavia that would cause TERTIARY_WEIGHTS() to return a 0x03 here.
Trivia -- does anyone know how to get a 0x04? Or another byte value higher than that? :-)
Perhaps I am even more afraid of asking how on earth anything one does with this function can help performance. Does anyone have any thoughts here?
I would probably recommend giving this function a miss, in the meantime (a statement I will happily retract if someone can explain a purpose or use for this function that makes sense!).
This post brought to you by ʏ (U+028f, a.k.a. LATIN LETTER SMALL CAPITAL Y)
# Steve Kass on 20 Jul 2006 6:12 PM:
# Michael S. Kaplan on 26 Jul 2006 3:25 PM:
# John Cowan on 6 Dec 2007 12:34 PM:
I'm no SQL Server mavin, but clearly what is special about W in Scandinavia is that it's just a glyphic variant of V, mostly used in proper names nowadays. The old royal dynasty of Sweden (1523-1654) is usually spelled Vasa now, but the rye-crisps are still Wasa.
referenced by