by Michael S. Kaplan, published on 2006/06/02 23:31 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/06/02/615571.aspx
This function TERTIARY_WEIGHTS() has been in Microsoft SQL Server for a while.
Now you have perhaps read what I have had to say about tertiary weights previously in posts like this one or this one or this one (the latter is especially topical to some of what I will point out in this post -- note that lowercase is given less weight than uppercase!).
According to SQL Server's Books Online in SQLS 2005:
Returns a binary string of weights for each character in a non-Unicode string expression defined with an SQL tertiary collation.
TERTIARY_WEIGHTS( non_Unicode_character_string_expression )
Is a string expression of type char, varchar, or varchar(max) defined on a tertiary SQL collation. For a list of these collations, see Remarks.
TERTIARY_WEIGHTS returns varbinary when non_Unicode_character_string_expression is char or varchar, and returns varbinary(max) when non_Unicode_character_string_expression is varchar(max).
TERTIARY_WEIGHTS returns NULL when non_Unicode_character_string_expression is not defined with an SQL tertiary collation. The following table shows the SQL tertiary collations.
Sort order ID SQL collation
TERTIARY_WEIGHTS is intended for use in the definition of a computed column that is defined on the values of a char, varchar, or varchar(max) column. Defining an index on both the computed column and the char, varchar, or varchar(max) column can improve performance when the char, varchar, or varchar(max) column is specified in the ORDER BY clause of a query.
Hmmm.... this does not sound much like any of the tertiary stuff I was talking about. And did you notice how many CI (case insensitive) collations are listed there?
Let's try it out:
SELECT TERTIARY_WEIGHTS('ABCDEFGHIJKLMNOPQRSTUVWZYZ' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)
Hmmm... well, it is not using Microsoft's weights (remember our aversion to embeded 0x01 bytes!) but if uppercase with an ignore case flag returns a low number, perhaps there is some common ground.
Let's try the lowercase letters:
SELECT TERTIARY_WEIGHTS('abcdefghijklmnopqrstuvwxyz' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)
Hmmm... never mind. :-)
Clearly it is only capturing case differences, and since it will not work with either Unicode types or unusual letters like U+028f (ʏ, a.k.a. LATIN LETTER SMALL CAPITAL Y), all of the numbers will be 0x01 or 0x02. Let's make sure:
SELECT TERTIARY_WEIGHTS('AbCdEfGhIjKlMnOpQeStUvWxYz' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)
Ok, that looks a bit like proof, right? :-)
So this is a function that will return information about case distinctions even in case insensitive sorts, to help with index performance of non-Unicode SQL compatibility collation columns. Though how it helps is quite unclear at this point.
And to wind down on a weird note, you can try the following query:
SELECT TERTIARY_WEIGHTS('W' COLLATE SQL_Scandinavian_Pref_CP850_CI_AS),
TERTIARY_WEIGHTS('w' COLLATE SQL_SwedishStd_Pref_CP1_CI_AS),
TERTIARY_WEIGHTS('W' COLLATE SQL_SwedishStd_Pref_CP1_CI_AS)
0x03 0x02 0x01
I am afraid to ask what it is about the letter W in Scandinavia that would cause TERTIARY_WEIGHTS() to return a 0x03 here.
Trivia -- does anyone know how to get a 0x04? Or another byte value higher than that? :-)
Perhaps I am even more afraid of asking how on earth anything one does with this function can help performance. Does anyone have any thoughts here?
I would probably recommend giving this function a miss, in the meantime (a statement I will happily retract if someone can explain a purpose or use for this function that makes sense!).
This post brought to you by ʏ (U+028f, a.k.a. LATIN LETTER SMALL CAPITAL Y)
# Steve Kass on 20 Jul 2006 6:12 PM:
# Michael S. Kaplan on 26 Jul 2006 3:25 PM:
# John Cowan on 6 Dec 2007 12:34 PM:
I'm no SQL Server mavin, but clearly what is special about W in Scandinavia is that it's just a glyphic variant of V, mostly used in proper names nowadays. The old royal dynasty of Sweden (1523-1654) is usually spelled Vasa now, but the rye-crisps are still Wasa.
go to newer or older post, or back to index or month or day