*by Michael S. Kaplan, published on 2006/06/02 23:31 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/06/02/615571.aspx*

This function TERTIARY_WEIGHTS() has been in Microsoft SQL Server for a while.

Now you have perhaps read what I have had to say about tertiary weights previously in posts like **this one** or **this one** or **this one** (the latter is especially topical to some of what I will point out in this post -- note that lowercase is given less weight than uppercase!).

According to SQL Server's Books Online in SQLS 2005:

Returns a binary string of weights for each character in a non-Unicode string expression defined with an SQL tertiary collation.

Syntax

TERTIARY_WEIGHTS( non_Unicode_character_string_expression )

Arguments

non_Unicode_character_string_expression

Is a string expression of type char, varchar, or varchar(max) defined on a tertiary SQL collation. For a list of these collations, see Remarks.

Return TypesTERTIARY_WEIGHTS returns varbinary when non_Unicode_character_string_expression is char or varchar, and returns varbinary(max) when non_Unicode_character_string_expression is varchar(max).

Remarks

TERTIARY_WEIGHTS returns NULL when non_Unicode_character_string_expression is not defined with an SQL tertiary collation. The following table shows the SQL tertiary collations.Sort order ID SQL collation

33 SQL_Latin1_General_Pref_CP437_CI_AS

34 SQL_Latin1_General_CP437_CI_AI

43 SQL_Latin1_General_Pref_CP850_CI_AS

44 SQL_Latin1_General_CP850_CI_AI

49 SQL_1xCompat_CP850_CI_AS

53 SQL_Latin1_General_Pref_CP1_CI_AS

54 SQL_Latin1_General_CP1_CI_AI

56 SQL_AltDiction_Pref_CP850_CI_AS

57 SQL_AltDiction_CP850_CI_AI

58 SQL_Scandinavian_Pref_CP850_CI_AS

82 SQL_Latin1_General_CP1250_CI_AS

84 SQL_Czech_CP1250_CI_AS

86 SQL_Hungarian_CP1250_CI_AS

88 SQL_Polish_CP1250_CI_AS

90 SQL_Romanian_CP1250_CI_AS

92 SQL_Croatian_CP1250_CI_AS

94 SQL_Slovak_CP1250_CI_AS

96 SQL_Slovenian_CP1250_CI_AS

106 SQL_Latin1_General_CP1251_CI_AS

108 SQL_Ukrainian_CP1251_CI_AS

113 SQL_Latin1_General_CP1253_CS_AS

114 SQL_Latin1_General_CP1253_CI_AS

130 SQL_Latin1_General_CP1254_CI_AS

146 SQL_Latin1_General_CP1256_CI_AS

154 SQL_Latin1_General_CP1257_CI_AS

156 SQL_Estonian_CP1257_CI_AS

158 SQL_Latvian_CP1257_CI_AS

160 SQL_Lithuanian_CP1257_CI_AS

183 SQL_Danish_Pref_CP1_CI_AS

184 SQL_SwedishPhone_Pref_CP1_CI_AS

185 SQL_SwedishStd_Pref_CP1_CI_AS

186 SQL_Icelandic_Pref_CP1_CI_ASTERTIARY_WEIGHTS is intended for use in the definition of a computed column that is defined on the values of a char, varchar, or varchar(max) column. Defining an index on both the computed column and the char, varchar, or varchar(max) column can improve performance when the char, varchar, or varchar(max) column is specified in the ORDER BY clause of a query.

Hmmm.... this does not sound much like any of the tertiary stuff I was talking about. And did you notice how many **CI** (case insensitive) collations are listed there?

Let's try it out:

SELECT TERTIARY_WEIGHTS('ABCDEFGHIJKLMNOPQRSTUVWZYZ' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)

0x0101010101010101010101010101010101010101010101010101

Hmmm... well, it is not using Microsoft's weights (remember our aversion to embeded 0x01 bytes!) but if uppercase with an ignore case flag returns a low number, perhaps there is some common ground.

Let's try the lowercase letters:

SELECT TERTIARY_WEIGHTS('abcdefghijklmnopqrstuvwxyz' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)

0x0202020202020202020202020202020202020202020202020202

Hmmm... never mind. :-)

Clearly it is only capturing case differences, and since it will not work with either Unicode types or unusual letters like U+028f (ʏ, a.k.a. LATIN LETTER SMALL CAPITAL Y), all of the numbers will be 0x01 or 0x02. Let's make sure:

SELECT TERTIARY_WEIGHTS('AbCdEfGhIjKlMnOpQeStUvWxYz' COLLATE SQL_Latin1_General_Pref_CP437_CI_AS)

0x0102010201020102010201020102010201020102010201020102

Ok, that looks a bit like proof, right? :-)

So this is a function that will return information about case distinctions even in case insensitive sorts, to help with index performance of non-Unicode SQL compatibility collation columns. Though how it helps is quite unclear at this point.

And to wind down on a weird note, you can try the following query:

SELECT TERTIARY_WEIGHTS('W' COLLATE SQL_Scandinavian_Pref_CP850_CI_AS),

TERTIARY_WEIGHTS('w' COLLATE SQL_SwedishStd_Pref_CP1_CI_AS),

TERTIARY_WEIGHTS('W' COLLATE SQL_SwedishStd_Pref_CP1_CI_AS)

0x03 0x02 0x01

I am afraid to ask what it is about the letter **W** in Scandinavia that would cause TERTIARY_WEIGHTS() to return a 0x03 here.

**Trivia **-- does anyone know how to get a 0x04? Or another byte value higher than that? :-)

Perhaps I am even more afraid of asking how on earth anything one does with this function can help performance. Does anyone have any thoughts here?

I would probably recommend giving this function a miss, in the meantime (a statement I will happily retract if someone can explain a purpose or use for this function that makes sense!).

*This post brought to you by* **ʏ** *(U+028f, a.k.a. LATIN LETTER SMALL CAPITAL Y)*

# **Steve Kass** on 20 Jul 2006 6:12 PM:

Here's a clue for your trivia question: SELECT TERTIARY_WEIGHTS('AaÀàÁáÂâÃãÄäÅå' COLLATE SQL_Latin1_General_CP850_CI_AI)

Steve Kass

Steve Kass

# **Michael S. Kaplan** on 26 Jul 2006 3:25 PM:

Indeed, and that is a great example -- can you explain why? :-)

# **John Cowan** on 6 Dec 2007 12:34 PM:

I'm no SQL Server mavin, but clearly what is special about W in Scandinavia is that it's just a glyphic variant of V, mostly used in proper names nowadays. The old royal dynasty of Sweden (1523-1654) is usually spelled Vasa now, but the rye-crisps are still Wasa.

*referenced by*