Sorting it All Out: Search Archives

Sorted descending by date published.


  1. Backcompat is the father of the NLS APIs (2005/03/06)

    …e and hopefully the former is often. But in any case in order to not break clients who have to depend on sort keys, the collation version APIs (IsNLSDefinedString and GetNLSVersion) were added in Windows Server 2003. In that way any client can use a…

  2. "POSIX" style locale support on Windows? (2005/03/05)

    …es that want to do things like SELECT * FROM tab ORDER BY french_str, english_str where french_str might have a default collation order of fr_FR and english_str might have a default collation order of en_US. The overhead of repeatedly calling setlo…

  3. An interview question (by popular demand) (2005/02/28)

    …then make it progressively more complex: So for example, I tell them about what I do and about how one of my foci is collation. I then have them write a simple string comparison function. Take two strings and return -1, 0, or 1 depending on wheth…

  4. Tech Ed 2005 in Orlando? Moi? Perhaps.... (2005/02/25)

    …d best practices to exploit them. Databases for the World: Optimizing Search in Multilingual Data Sets Using SQL Server Collation for Sorting and Indexing Track(s): Database Administration Many SQL customers have poor understanding of how collation…

  5. Theory vs. practice for Korean text collation (2005/02/25)

    …d "Old Hangul", something that products like Windows 2000 and SQL Server 2000 support. And then starting in Windows XP, collation of Old Hangul as a scenario is supported, which is interesting since of course (as I said before) we do not yet have a …

  6. GetHashCode vs. Compare vs. SortKey (2005/02/18)

    … But the number of actual duplicates is probably pretty small, even as compared to strings that would be "duplicates" by collation criteria (different normalized form, different case in a case insensitive situation, etc.). Plus I don't think the Comp…

  7. Can a codepage be changed? How about which codepage a locale points to? (2005/02/06)

    …wanathan on 27 Aug 2005 10:57 PM: Back in May of 2004, Quan Nguyen sent a message to Dr. International about Vietnamese collation... Yuhong Bao on 19 Feb 2011 12:57 AM: "The 'IBM" method" IBM's codepages are much more precise than MS's codepages. Fo…

  8. You can't ignore diacritics when a language does not give them diacritic weight (2005/02/05)

    This post is about a not entirely intuitive fact that will be seen in the implementation of collation in Microsoft products. It affects the results of both CompareString and LCMapString in Windows, the results of using th…

  9. GEOID -- The LCIDs maligned little brother.... (2005/02/04)

    … my locale? Well, which locale do you mean?" The default user locale is meant to impart the user's choice in format and collation preferences, NOT their location. So IE should be using the GEO settings for that! Some will freak out at this point -- …

  10. What makes a string meaningful? (2005/02/03)

    …at have little clear semantic or pragmatic meaning and focus on the strings that have code points not defined by the MS collation tables (sometimes not even by Unicode!) as discussed in The jury will give this string no weight. Some developers might…

  11. CompareString prefers meaningful strings (2005/02/02)

    …... A < B, B < C, C < A or even: A < B, B < A I will admit that both are not so great. But you have to understand how the collation data is created and what it represents. The goal is to give a way to sort every part of the Unicode BMP (basic multili…

  12. SQL Server has its own version of .NET "ordinal" comparisons (2005/01/23)

    …son in functions like strcmp and wcscmp -- basically a binary comparison. SQL Server has its own version of this in its collation mechanisms, known as Binary collations. Of the three, I have to say SQL Server has the best name. So let's give them 10 …

  13. Not all characters are created equal: take SYMBOLS, for example (2005/01/19)

    Although collation on Windows gives a weight to every single code point, there are times that this does not really have an intuitive mean…

  14. The jury will give this string no weight (2005/01/18)

    (the title was inspired by a decade and a half of Law & Order on NBC, then A&E, and now TNT!) I don't want to knock collation on Windows, because I think it rocks. It covers a lot of territory, and it gets the job done (and done well) in a lot o…

  15. Every character has a story #3: U+0138 (LATIN SMALL LETTER KRA) (2005/01/12)

    …d be wrong here, too). But let us get back to LATIN SMALL LETTER KRA. The next odd thing about it is where it is placed in collation. Ken Whistler of Sybase pointed out the reason for this back in 2002 on the Unicode List in response to a question from…

  16. International Domain Names? The sign on the door says 'Gone Phishing'.... (2005/01/11)

    …at it sorts after the letter "z". Which I guess is a pretty big clue for someone who spends as much time thinking about collation as I do, but for normal people may only possibly reach the level of a call to PSS. Damn.) It made me think of a worry t…

  17. Encodings in Strings are Evil Things (Part 7.1) (2005/01/10)

    …05 3:35 PM: Hrm. One of my planned hedges was to provide a set of predicates for std::sort that would implement various collation algorithms (starting with a functor that just called your Win32 functions with a specific locale) -- perhaps a likewise…

  18. What is up with number sorting? (2005/01/05)

    …about unrelated issues with GDI and rendering based on native digits in a locale, but if so that has nothing to do with collation. # Larry Osterman on 10 Jan 2005 2:41 PM: Btw, Valorie's been on me for literally years now about the fact that OE does…

  19. How do sort keys work? (2004/12/30)

    … looking at direct string comparison versus sort key calculation/comparison was like looking at the "retail" version of collation vs. the "Wholesale" one. Only some people in the crowd felt it was an illuminating analogy, and I once again learned th…

  20. Comparison confusion: INVARIANT vs. ORDINAL (2004/12/29)

    …dded and an Ordinal member was added to the CompareOptions enumeration. Selecting it would ignore all of those cultural collation features and give you a binary sort that would also, incidentally, not vary. The only remaining problem at this point i…

  21. What is a neutral culture? What is a neutral locale? (2004/12/29)

    …ers who can be as cynical as I am!). Unfortunately, that weird LCID-esque fallback behavior still basically happens for collation and encoding via the culture's CompareInfo and TextInfo objects. Which is not such a big deal, and it is really necessa…

  22. Is it Hangul? or Hangeul? or Han'gŭl? or what? (2004/12/26)

    …s solution in Windows or the .NET Framework since there are no APIs that would make good use of transliterated forms and collation is not really set up to support it either. A final piece of the puzzle is what happens in North Korea. Essentially, the…

  23. An 'Access Hero', sorta (2004/12/21)

    …hese days, I write code in C, C++, and C#. But I still use Access to hold the half-gigabyte source data that we use for collation (there is no database tool for ease of transportability between machines, working offline, and querying in the world, an…

  24. IMEs? They have it easy.... (2004/12/20)

    …choice and it will be entered into the application. And here is where we get to the part that makes me (as the owner of collation support in Windows and the .NET Framework) jealous, that makes me say the IME folks have it easy. Because if that first …

  25. IMEs? They have it easy.... (2004/12/20)

    …choice and it will be entered into the application. And here is where we get to the part that makes me (as the owner of collation support in Windows and the .NET Framework) jealous, that makes me say the IME folks have it easy. Because if that first …

  26. Putting your ducks in a row (2004/12/18)

    Collation fascinates me. It has fascinated me for most of the last eight years, since the first time I saw Appendix D of the firs…

  27. Whats up with the Korean (Unicode) sort? (2004/12/14)

    …rean Unicode sort embarrassing? I could not find it defined anywhere, except maybe I found a vague hint to the 'Unicode collation' setting that was used in SQL Server 7.0, which could be Korean. Is that it? Me: No, that's not what it is. Though SQL …

  28. The letters in a language.... (2004/12/10)

    …e english language or not? # Michael Kaplan on 18 Dec 2004 11:08 AM: Hey Jonathan! This is in fact the main reason that collation works the way it does on Windows -- it sorts all of Unicode according to a particular language. It makes the grouping m…

  29. Where is the locale? "Its Invariant." In where? (2004/12/08)

    …QL Server, or any of the other products that list locales for customers. So why is it there? Well, it all comes back to collation. Like everything else that is worthwhile in life. :-) In collation for Windows, there is a default table that gives the…

  30. They ask me "why is my Korean text in random order?" (2004/11/30)

    … Hanja do not have the notion of non-spacing characters (like ˆˇˉ˘˙˚˛˜˝ diacritics seen in Latin), so that part of the "collation weight space" is used for Hanja that most commonly have the same Hangul pronunciation. Telling the API or method to ign…

  31. Microsoft does not use the Unicode Collation Algorithm (2004/11/28)

    …owhere is this greater than the subject of this posting -- people who wonder why Microsoft does not support the Unicode Collation Algorithm. People notice that Windows seems to have a similar framework and they assume that both of them use the same …

Showing results 401–431.