Sorting it All Out: Search Archives

Sorted descending by date published.


  1. He had the strength of an OX[IA], I tell you (2007/02/18)

    … EPSILON WITH OXIA and GREEK SMALL LETTER UPSILON WITH OXIA), are not in the Jet, SQL Server, or Windows <= Server 2003 collation tables. Thus they are equal as they have no weight. Their equality has nothing to do with the idea of them both being c…

  2. Why I think System.String.IndexOf(Char) sucks (2007/02/17)

    …Kaplan on 18 Feb 2007 2:41 PM: Well, actually, we use a hybrid approach: 1) For most purposes we use your #1 2) For NLS collation functions that take an LCID, we do #2 plus (we include other constructs like sort elements). # Dean Harding on 18 Feb 2…

  3. Giving a character a new identity (by giving it some secondary weight) (2007/02/17)

    …gh at least one other person also confirmed this to be a "bug", the truth is that it is entirely by design, and the way collation works on Windows and the .NET Framework. The trailing character at the end of the string is U+ff9e, a.k.a. HALFWIDTH KA…

  4. On distinctions that are primarily with [and without] difference (2007/02/15)

    …no one actually complained about the plan or even about the sorts for several years that it was available in the beta). Collation has a lot more to do with user's intuitive expectations than any kind of reasonable argument one might make about the l…

  5. So how does that Naqittaut keyboard work, exactly? (2007/02/04)

    …. There was a real push to support more, but given that the Canadian Syllabics subrange in Unicode had no weight in the collation tables, the results of trying to use it would have been less than optimal. So the keyboard was floated with various inf…

  6. lo lo lo lo lo-LA (2007/02/03)

    …ood job for it was Windows 2000, if you ask me. But Lao (a.k.a. Laotian, a.k.a. ລາວ) did not fare as well. I mean, some collation weights were added so Lao would at least have some weight, unlike some others. But it was about as linguistically meani…

  7. Be careful what you wish for (just in case it comes true!) aka When a Cedilla needs to be a Comma Below (and vice versa) (2007/01/26)

    …uff like this and this may be. Because it affects text processing as a whole. Although with that said, the NLS Romanian collation tables on Vista will create the equivalences so that you will get the right results. Therefore: ș (U+0219, LATIN SMALL …

  8. If I were Australian I might say that Steve Jobs and the Klingon emporer were mates? (2007/01/26)

    …d since Klingon in its own script was rejected (I may have mentioned it in passing here or here), it seems like Klingon collation would probably be distracting.... But in the end, having the Klingon locale on the list of UI languages for a Mac does …

  9. With SQL Server (and SQL itself) comes the illogic of 'trailing spaces' (and the myth of fixed width) (2007/01/15)

    There is a problem with the notion of both trailing spaces and fixed width in SQL Server, when you are using Windows collations. I bring this up since I have a meeting this week where people will be talking about what are the differences between …

  10. The city elders won't give this string weight, either (aka On being consistently dead wrong, aka Ordinal or bust?) (2006/12/26)

    …etween the title of this post and the one it alludes to (this one), the answer lies. These characters have no weight in collation, in either Windows or the .NET Framework, until Vista. Though as a bonus in managed code, if you are running on Vista, …

  11. Open it all up, get out of the way, and then what happens? (2006/12/12)

    … You can remove the OTLS support pieces of the comment and insert proofing tools support or language tagging support or collation support and you will cover the feelings of people when they are told their language can now be supported, which they fi…

  12. SQL and the CLR: Part 1 (the things we can make work well) (2006/12/07)

    …ould start the series at some point. :-) The first and most obvious thing to avoid is any of the SQL Server 2005 *_BIN2 collations. These "new in SQL Server 2005" collations that I have mentioned previously have really only one benefit -- if one is …

  13. SQL and the CLR: Part 0 (2006/12/04)

    …you ask me. I mean, just speaking from the standpoint of internationalization and focusing in particular on locales and collation, there are a ton of things that simply don't match. Like when Aldo Donetti asked me: HI Michael, I’d have to figure out…

  14. So when is Esperanto coming? (2006/11/30)

    … 30 Nov 2006 9:51 AM: "one can get all the language support one desires" Last time I checked fairly obvious things like collation weren't included in this, so it falls a long way short of what most language users would "desire" or really even tolera…

  15. Subsets of subsets of subsets of subsets of subsets (2006/11/30)

    …upports without Windows only locales). There is that weird subset/superset of locales supported by SQL Server for their collation support (subset because they folded many of them together and also because they did not update for Vista or even Server…

  16. About that second presentation I did (2006/11/21)

    …But interestingly, of the two: Keyboards on Win32: Beyond the Keyboard Layout Creator Sorting It All Out: More Words on Collation The keyboard presentation was not my favorite; the one I enjoyed most was the sorting talk. :-) Just before the talk so…

  17. Another satisfied customer! (2006/11/19)

    …ut people using new Vista locales (e.g. Tibetan or Sinhalese) without setting user locale properly and cursing our poor collation support since it was never actually set correctly! So the easier it is for the settings to be right, the better!)…

  18. Even the characters with no weight can be given weight in their own special way (2006/11/19)

    …). Now although these issues are weighed quite heavily, the truth is that all of these six characters have no weight in collation, a factoid that affects all versions of Windows prior to Vista, SQL Server, Access/Jet, and so on. A change here to add…

  19. Korean + French == ???? (2006/11/17)

    …o a CSTR_GREATER_THAN. Can you guess why? Hint #1: They ask me "why is my Korean text in random order?" Hint #2: French collation: When diacritical becomes diabolical Yes, that is right -- On all locales, the circled version of Hangul is considered …

  20. The problem of string comparisons, WORD sorts, and the minus that is treated like the hyphen (2006/11/16)

    …ter reason than ever to keep in mind the importance of linguistically meaningful strings when one is using the NLS/NLS+ collation support? Clearly the intent of the word sort is to handle an entirely different class of usage of U+002d -- basically i…

  21. Maybe it is the name that is 'Undesirable' ? (2006/11/12)

    …ng to be defined by its standards (in that case, no PUA and no unpaired surrogates, in addition to being unknown to the collation data). The same conditions apply to the managed CompareInfo.IsSortable method, by the way.... And then yesterday in Kee…

  22. Some people feel really insecure about the size of their [string] members (2006/11/10)

    …ngle sort element -- the sort of thing that no Win32 function returns directly but indirectly the unmanaged and managed collation functions/methods use and which I have also talked about previously. The simple truth is that I find the blog post that…

  23. Where will Michael be? (2006/11/01)

    …0pm ET) -- at the 30th Internationalization and Unicode Conference in Washington D.C. Sorting It All Out: More Words on Collation -- all new words, many of which focus on Vista and others on interesting linguistic issues! I'll also be doing some boo…

  24. If you are more sensitive, you'll pick up on more problems (2006/10/30)

    …post is not about relationship advice!) Regular reader Dean Harding pointed out a few days ago when I talked about When collations collide?: Yeah, this is a real problem when you're developing an application that can be installed on somebody else's …

  25. SQL Server: compatibility collations vs. Window collations (2006/10/29)

    The other day when I talked about When collations collide, John Ingres commented: We've been looking at the implications of moving our database from SQL_Latin1_General_…

  26. When collations collide? (2006/10/27)

    …are the scalar variable "@column1". Bart was very quick to point out where the issue is documented: See the “Identifier Collation” topic in BOL: The collation of an identifier depends on the level at which it is defined. Identifiers of instance-leve…

  27. When _wcsnicmp can't hack it, CompareStringW delivers (2006/10/23)

    …s string length, ignore case, choose an appropriate locale, and work in different versions of Windows -- the master NLS collation function, CompareStringW! And lstrcmpiW is just a wrapper around CompareStringW anyway, so if you almost liked the beha…

  28. Is Font.FontFamily localized? (2006/10/21)

    …n), but still I would not use that for string comparisons (since there is no guaranteed match between the name and the collation).…

  29. It's LIFO (last-in, first-out) in Hebrew (2006/10/12)

    … at all, now is it? :-) The answer is pretty much a purely technical one with no real linguistic basis. You see, from a collation standpoint, abcd always comes before abcd_ (where the blank is filled with any letter). And therefore in a language lik…

  31. Who is the Hacek Girl? (2006/10/12)

    …a day, not including spam). Like just the other day, I got a question from someone: Our customer hit an issue in Slovak collation where 'c' and 'č' compared differently with ignore namespace. We found that 'č' is an exception for the Slovak locale th…

  32. Where the hell did Replacement Locales come from? (2006/10/09)

    …ault locale. However, specific groups like ESENT (Jet Blue) and SQL Server and Office had dependencies on items such as collation support and code pages. It was somewhat heartening that most of the items came up in that initial conversation I had wi…

  33. Stuffing the ballot box? (2006/10/08)

    …in links in random email!). There are a whole bunch of tags there, from Int'l Programming to Keyboards to Linguistic to Collation/Casing to Locales/Cultures to Unicode/Standards and maybe others that are there pretty much because of me (and clicking…

  34. Will I C U at the IUC? (Redux) (2006/10/04)

    …:-) The talks I will be doing: Keyboards on Win32: Beyond the Keyboard Layout Creator Sorting It All Out: More Words on Collation Both are talks that have grown out of a combination of info such as the original talks I have done on both subjects (bo…

  35. Wild[card] thing, You make my CHAR sing (2006/10/04)

    …that really possible? Or is this really a pseudo-stealth "feature" created by SQL Server's implementation of linguistic collation that nobody ever noticed before? And either way, how on earth can this be "fixed" in the future, or even documented? I …

  36. What would it mean to internationalize StrCmpLogicalW? (2006/10/02)

    … algorithms (in FindNLSString and/or FindNLSStringEx) might also be important since historically these three aspects of collation have been kept in sync from a functionality standpoint whenever it makes sense. And given the needs of actual applicati…

  37. Why don't all the half forms sort right? (2006/09/25)

    … see three different forms then you can look at that Unicode FAQ link) So that part is easy enough. And one part of the collation story on Windows -- the fact that both ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER both are characters that intentional…

  38. The modern solution to the problem of Traditional Spanish in Vista (2006/09/23)

    …s 2000 beta data was at one point snapshot'ed by Jet 4.0, though the way they and later SQL Server flattened non-unique collations hides the fact that they inherited a Traditional Spanish Mexican sort. Lucky them, lucky us. :-) Now the original chan…

  39. The puLLi suppresses the inherent vowel. Or does it? (2006/09/20)

    …the proper conclusions. Remember when I first was talking about my earlier experience with the Virama and its effect on collation back in April of 2005 when I posted And then there is the virama....? In that post I noted how the expected effect of t…

  40. Put in on my Tab, please (2006/09/19)

    …tically. How does one voice or semi-voice a tab? :-) To some extent you could consider it a side effect of the way that collation is implemented to achieve the results I discussed in Knock knock! Who's there? Kana! Kana Who?, but those voice and sem…

  41. Tamil is an abugida. It is not an abugi-DOH!, if you know what I mean (2006/09/17)

    …e difference between logical order and visual order. There are huge benefits to logical ordering, whether one considers collation or many other kinds of text processing, and certainly when it comes to input, everyone I have talked to so far consider…

  42. Every character has a story #23: U+00ad (SOFT HYPHEN) (2006/09/02)

    …he text stream, will be there even if it is usually invisible and ignored. Now I say usually because in operations like collation on Windows, the SOFT HYPHEN will be ignored (it is given no weight) but it will also break compressions. In practice th…

  43. And if your language starts playing a different TUNE (2006/08/31)

    … Unicode's PUA (Private Use Area) in an attempt to make Tamil into a simple script (rather than a complex one), to build collation support directly into the order of the code points in the encoding, to encourage ISV's like Adobe to support Tamil. The…

  44. If you wanted to get it done with the font... (2006/08/30)

    …xpect to be capitalized not as ΡΎΘΜΙΣΗ ΣΉΜΑΤΟΣ but instead as ΡΥΘΜΙΣΗ ΣΗΜΑΤΟΣ and I pointed out that the NLS casing and collation tables, which pride themselves on features like reversibility and weighed equivalences, simply did not have this type o…

  45. The myth of cross-product compatibility (2006/08/26)

    …mpat, and being weird, I made a statement about the irony of finding a bug that could have been found in many different COLLATIONS supported in SQL Server in a Unicode-only locale and a binary sort -- one of the specific collations that has nothing …

  46. Decimal vs. hexadecimal LCIDs, backcompat, and being weird (2006/08/21)

    … a question which VSTS International Program Manager Aldo Donetti forwarded on to me: I have a test that enumerates all collations in SqlServer and verifies that the string generator can, well, generate strings for each of them. I hit a weird case f…

  47. Sometimes, uppercasing sucks (2006/08/18)

    …eptions to just about every rule you can throw at it, it's got to be named Dutch. Capitalisation, accents, hyphenation, collation, word order, heck the Dutch even violate the metric system with their own definition of pounds and ounces (not compatib…

  48. Did software developers ever learn their ABC's? (2006/08/15)

    …5:23 PM: Pity .NET doesn't support Java's notion of a RuleBasedCollator. That way, if you don't agree with the built-in collation rules, you can actually roll your own.…

  49. Collation != case, still (2006/08/08)

    … simple search for U+00df indicates. And most importantly, since Casing and IgnoreCase are still not the same thing and Collation != Case (a.k.a. Collation <> Case), for now this is how casing will work on Microsoft platforms -- what Unicode refers …

  50. The Cantonese IME (not for input of characters from Canton, Ohio) (2006/07/27)

    …can I say besides welcome to the future. :-) In Vista, from the time when it was just Longhorn, there has been enhanced collation support for all of the CJK locales. The stroke count sorts and Mandarin pronunciation (both Pinyin and Bopomofo) sorts …

  51. We're off on the road to Korea! We certainly do get around... (2006/07/22)

    …one could perhaps argue that the whole LVT -- leading/vowel/trailing -- mechanism used in discussions about Jamo/Hangul collation is an artifact of implementations -- and that the reason that ᄀ (U+1100) and ᆨ (U+11a8) look the same is that they are …

  52. Your data will be released, unharmed (2006/07/19)

    …like å, and ড় can often look like ড়. Normalization does not destroy language through the equivalence, and neither does collation. Both technologies are simply working to make sure that no matter what is the preferred normalization form in which dat…

  53. A new version of Unicode is released, and it's all five by five (2006/07/18)

    …s that are the foundation for all modern computer programs - including the latest data for Unicode security mechanisms, collation, and locales. For the first time, the Unicode Collation Algorithm (UCA) is released in parallel with the UCD - both UCA…

  54. Appreciation, embarrassment, and redirecting thanks (2006/07/17)

    …ook at this post and think that 2 and 2 make 22, rather than 4!).... I answered a few questions that Kieran asked about collation while she was putting the data together for Bengali, and I checked the data in when it was done. I also think I added a…

  55. The fallacy of comparing out of context (2006/07/09)

    …rticle about the subject) Well, this post is not about that. Instead, it is about an analogous practice in the world of collation, one that functions like CompareString can unfortunately support the propagation of. If you call the function and speci…

  56. Feature ideas don't always turn out to be good ones (2006/06/18)

    … good idea after all. As an example, as a part of the work in Vista to support the Unicode 5.0 repertoire in the default collation table, a lot of weights had to be added to many different code points. In one particular case, I was looking at how Uni…

  58. Unicode and SQL Collations have nothing to do with each other (2006/06/06)

    …esponded to Alexey Sadomov's query about unusual behavior in the SQL_SwedishStd_Pref_CP1_CI_AS SQL Server compatibility collation where Unicode columns and non-Unicode columns returned different results, and I did have two theories about the behavio…

  59. What's the encoding, again? (2006/06/04)

    …he noticed the other day: Notice on this page http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/nls_Collation.asp that the characters in parentheses are being mis-displayed as windows-1252 pairs, rather than being UTF-8 decoded: …

  60. What the @!#$% is the TERTIARY_WEIGHTS() function for? (2006/06/02)

    …: Returns a binary string of weights for each character in a non-Unicode string expression defined with an SQL tertiary collation. Syntax TERTIARY_WEIGHTS( non_Unicode_character_string_expression ) Arguments non_Unicode_character_string_expression I…

  61. Je, for sure, from Sweden. (2006/06/02)

    …he contact link): Hi, Michael. I've encountered with a problem in sql server related with SQL_SwedishStd_Pref_CP1_CI_AS collation and found your article (http://blogs.msdn.com/michkap/archive/2006/04/25/583307.aspx). The problem is that sql server …

  62. It is only of SECONDARY importance (2006/06/02)

    …n rather short order -- there are a lot of letters. Graham Asher talks about this issue in his document entitled Better Collation Rule Markup: a critique of Locale Definition Markup Language (he points it out for Turkic languages), though he actuall…

  63. Keeping out the undesirables? (2006/05/31)

    … This function takes a string and essentially gives you a judgment about whether this string is one you can pass to the collation functions in the NLS API and expect to have something along the lines of reasonable, supportable results. The process i…

  64. Avoiding SQL Server collation woes in the TempDB (2006/05/30)

    Kimberly L. Tripp has a great post entitled Changing Database Collation and dealing with TempDB Objects. Definitely worth a read -- this is a problem I have been bitten by in the past and have bee…

  65. Custom Case Mappings? (2006/05/26)

    …a bit confusing. :-) Clearly the intent is to try to show some of the ways that cultures customize behavior surrounding collation and casing. And the target audience seems to primarily be people who are native English-speaking folk who have not had …

  66. SQL Server's cross-version collation support (2006/05/25)

    …y thought I had the wrong take in posts like this one and this one when I talk about the differences between SQL Server collations. The way he posed the question had to do with the way CompareStringEx, LCMapStringEx, and FindNLSStringEx have a reser…

  67. Is it punctuation, symbol, or diacritic? (2006/05/24)

    …gs with existing APIs? This is an excellent question, one that is deeply embedded in some of the designs of the default collation table, in both Vista and in prior versions of Windows (the latter matches the .NET Framework).... Before we are done we…

  68. 'Which comes first?' vs. 'Are they equal?' (2006/04/29)

    Remember how I have talked in the past about the difference between two different purposes for collation (comparison vs. identity, or alternately the difference between CompareString and EqualString)? (if not you can follow …

  69. The disunification of Norwegian and Danish sorting (2006/04/27)

    …embers? What does it mean for the Norwegian/Danish sort in SQL Server 7.0? What does it mean for the DANISH_NORWEGIAN_* collations in SQL Server 2000 and 2005? What does this mean for Windows Vista? What does it mean for future versions of Access, J…

  70. Why do we call w 'double u' -- doesn't it look more like a 'double v' ? (2006/04/25)

    …andards. (Among other things, he gave us some data about how the Cyrillic script versions of Sami work to help with our collation efforts. I'll talk more about this another day....) One of the interesting things he mentioned was something that Raymo…

  71. If at first you don't succeed, there's probably still a bug (2006/03/30)

    … potentially incorrect issue with the implementation, what can we find out here? Well, if you look at How to track down collation bugs from late last year, you will notice a couple of things: Most of the points in that post do not apply, and Point #…

  72. Only ONE WCHAR per dead key (2006/03/28)

    …vious posts, they're needed for the Bulgarian written language, see: 1.(see the comments) How does Microsoft assign new collation weights? http://blogs.msdn.com/michkap/archive/2005/09/12/463483.aspx 2.(see the whole post) Can I get my characters in…

  73. Is it a bug? (2006/03/27)

    …ore the alphanumeric symbols. Nick Lamb on 27 Mar 2006 9:17 AM: A self-consistent, reproducible result from a localised collation function is a bug if and only if it conflicts with the reasonable expectations of the users of that locale in the conte…

  74. I need my SPACE, symbolically speaking (2006/03/25)

    …not a post about anyone breaking up with me and telling me that they need their space) In Microsoft's implementation of collation, we have several different categories of characters, and rules for dealing with each category. One of the interesting c…

  75. Pretending the vowels aren't there (2006/03/20)

    …) It might be a useful potential addition to consider for TR30, though it is probably better in most cases to handle in collation so that it is not a destructive operation to the points that are present.... # Ilya Konstantinov on 20 Mar 2006 12:45 PM…

  76. Practical Uses for Replacement Cultures/Locales (2006/03/20)

    …ffirm, yes, MLBT + Vista is good stuff. I will be interested to see what MLBT does to help people understand and create collation rules. # Michael S. Kaplan on 21 Mar 2006 6:24 AM: Sigh, not what I said, Nick. But this is not the sort of feature tha…

  77. Just in case you thought I was blathering. :-) (2006/03/18)

    …of SQL Server 2005 SP1. As mentioned by Bob Myers: The most critical issue is related to report processing and database collation. Model-based queries were incorrectly returning some entity key grouping values that differed only by case. Because cas…

  78. Casing and IgnoreCase are still not the same thing.... (2006/03/15)

    … a few bugs, mainly in documentation, to help add to the confusion.... To start with, as I have pointed out previously, Collation != Case. And more importantly in this case, ignoring case in a linguistic comparison is not exactly the same as either …

  79. Traditional versus modern sorts (2006/03/12)

    Although the model for collation would be simpler if it never changed, the fact is that changes do happen, so it is important to capture that change. I …

  80. Don't forget to test the demos (2006/03/09)

    …e 'MICHKAP3\AdminЀ' already exists in the current database. Basically, SQL Server is validating the entry using its own collation rules -- whether it is doing some sort of optimization here to avoid the performance hit of a call to do the Windows au…

  81. Everybody's doing the wraparound.... (2006/03/07)

    …? I was of course talking about fonts there. This time I am going to take a slightly different approach, and talk about collation. I will give the string, the code points, and the sort key. We'll start simply, with one letter: e U+0065 0e 21 01 01 0…

  82. Collation is slutty (2006/02/28)

    Yes, according to the Slut-O-Meter, 'collation' has a 57.08% sluttiness rating. And that is before THIS post makes it into the index; it may be worse soon.... Sorting…

  83. Where are the IE plain text fonts? (2006/02/19)

    …irst really knowledgeable language contact that our (at the time) new locale data owner was able to work with to produce collation results for an Indic language (in this case Malayalam). Believe me, when you are finding your way in a new role, such a…

  84. Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER) (2006/02/15)

    …change the meaning or interpretation of the characters. They are thus supposed to be ignored in things like the Unicode Collation Algorithm and outright stripped in things like StringPrep. The problem is that sometimes they do convey semantic meanin…

  85. Every character has a story #18: U+06cc and U+064a (ARABIC LETTER FARSI YEH and ARABIC LETTER YEH) (2006/02/14)

    …nt, both characters have been in Unicode since at least version 1.1, released in June of 1993. Looking at the Microsoft collation tables, they are close to each other but not quite equal, something that maybe ought to be changed for the sake of peop…

  86. Collation can actually be linguistic (2006/02/12)

    I pointed out in the post Some sort of order to collation that it is easy to dismiss linguistic issues when one is thinking about collation. As Steven Pinker pointed out in The …

  87. Is RtlCompareUnicodeString used correctly? (2006/02/09)

    …hael S. Kaplan on 9 Feb 2006 8:48 AM: They do work for me, maybe something was down? Well, it is as useful as the *_BIN collations in SQL Server (which actually have the same order!). # Dean Harding on 9 Feb 2006 5:29 PM: Yeah, I would have imagined…

  88. The answer is dependent on the question (aka Turkish Delight) (2006/02/04)

    …e Conference (the one entitled Tales of Incorrect String Comparisons) actually talks about this issue and several other collation-type problems. I would highly recommend it to anyone who finds this type of thing to be interesting. Several cool demos…

  89. We're drowning in LIPs! (2006/02/03)

    …or an extra diacritic showing that there is no vowel). Now as I pointed out in the post Does Bengali sorting work?, the collation support for Bengali is not a part of XP SP2. And Malayalam is in the same boat there (you have to wait for Vista to get…

  90. Another interview question (2006/02/02)

    …e ch in Spanish or dzs in Hungarian -- I have called these "sort elements" as they are treated like a single letter for collation purposes. So it is not just the canonical equivalence cases, which can, as you mention, be normalized out.... # Michael S.…

  91. Handling multilingual data in SQL Server (2006/01/29)

    … the best match given a search string in some arbitrary language. The column, of course, is Unicode data type with some collation. Is there an optimal default collation? What is recommendation for table design for efficiency -- collation, indexes? A…

  92. What do U+0223 and U+0657 have in common? (2006/01/18)

    …ourse #5 is why they both kind of caught my eye, well that and a sixth and seventh point: Neither one has any weight in collation in any released version of Windows. Both of them have weight in collation in Windows Vista. Vista keeps looking better …

  93. Falling over the edge of a conceptual collation cliff (2006/01/15)

    With a blog title like Sorting It All Out I make no bones about my somewhat obsessive interest in collation. Well, this is unapologetically one of those interesting collation posts. :-) You may recall if you are a regular reade…

  94. Collation != Case (a.k.a. Collation <> Case) (2006/01/11)

    …Extended block (U+1F00 -- U+1FFB). This result is the same irrespective of whether the accent-sensitive or -insensitive collation is used. It also (only tested on SQL Server 2000) gets the same results for Greek, Cyrillic_General, Ukrainian, and Mac…

  95. 'Acceptable' Japanese sort order? (2006/01/03)

    …eferenced -- they solve the multiple pronunciations issue by simply including all the pronunciations, but is harder for collations since the item can only be sorted in one place.... # Rosyna on 4 Jan 2006 6:22 PM: Ah yes. Those hosers do have it eas…

  96. Sorting multilingual data (2006/01/01)

    …o so now.... The rule is simple -- the person who is looking at the data has a specific expectation, and it is a single collation, the one that they know. They do not expect data in a different language to sort like a user of that other language wou…

  97. Popularity hurts objectivity (2005/12/31)

    …ing potential limitations/issues/bugs in language/locale-specific formatting, keyboards, locale data, calendars, and/or collation for Georgian, Armenian, Latvian, Japanese, Korean, Macedonian, and others. And although I am not a linguist, I do have …

  98. What's a secondary distinction? (2005/12/29)

    …ar ago that I talked about how They ask me "why is my Korean text in random order?" It is a pretty important concept in collation to have items collate with multiple levels. What is interesting about this concept is how it is so hard to describe t…

  99. What Unicode version do you support? (2005/12/23)

    … very different shipping cycles). If they are looking for UCA support, the answer is Microsoft does not use the Unicode Collation Algorithm so they are definitely asking the wrong question (though they have made some changes to be a little more like…

  100. New in Windows Vista: OrdinalIgnoreCase for Win32 (2005/12/22)

    …hat we can get the answer for Windows and the .NET Framework, all we need is a faster way to make the SQL Server binary collations have this admittedly un-natural comparison behavior as well! This post brought to you by "Ʃ" (U+01a9, a.k.a. LATIN CAP…

Showing results 201-300.