Sorting it All Out: Search Archives

Regular expressions are allowed, but play nice. Sorted descending by date published.


  1. Some things were -SO- much more impressive to know before Google made them -SO- easy to learn (2015/08/28)

    Technically the same point applies to Bing, but seriously, who are we kidding here?s filed under Collation/Casing because if Google can co-opt my passion about collation with their Alphabet re-org, then clearly their passion i…

  2. Microsoft's normalization efforts predate Unicode's (2015/03/10)

    …han choosing a normalization form? Charles Well, the easy answer is that Microsoft's interest in both normalization and collation predates Unicode's, so real world backwards compatibility trumps theoretical conformance with an industrial standard li…

  3. Every rule has an exception that proves the rule, even in Unicode!!! (2013/11/18)

    …supported in Uniscribe and OpenType (until it was intentionally removed!) and which I think might still be supported in collation and which was never supported in the IME. And encoding #5 is what is currently supported in all the support Microsoft p…

  4. Everyone: Repeat after me... We need both cases of Cherokee *in* the Unicode BMP! (2013/11/05)

    …oposal, at least) in the SMP, rather than the BMP, which is where the current letters are). This is a mild nuisance for collation, since it will only work for the Cherokee user locale, and it will take an extra half day for the owning team to add po…

  5. This blog is about a love story. And a breakup. With Dell... (2013/08/20)

    … contract PM on a Dell Latitude over 15 years ago - my first feature PM role ever! And a lot of code and data involving collation. And hundreds of thousands of data updates and fixes involving locales. And almost every keyboard layout added to Windo…

  6. And at least 38.2% of Montenegrins can proudly cry "the version after next of Windows was my idea, and it's all about ME!" (2013/04/15)

    …eaking is Montenegrin (as opposed to the 41.6% who say it is Serbian that people are speaking!). The letters are: Latin collation order: Latin ABCČĆDDžĐEFGHIJKLLjMNNjOPRSŠŚTUVZŽŹ, Cyrillic АБЦЧЋДЏЂЕФГХИЈКЛЉМНЊОПРСШС́ТУВЗЖЗ́. Cyrillic collation order: Cyrillic А…

  7. Announcing The Unicode Standard, Version 6.2 (and a little about what MS is doing about it) (2012/09/27)

    …pheme cluster behaviors. The script categorizations for some characters are improved and better documented. The Unicode Collation Algorithm has been greatly enhanced for Version 6.2, with a major overhaul of its documentation. There have also been s…

  8. Moving sucks, something it has in common with my new mail slot. And my new office. (2012/09/26)

    …new mail slot. For reasons that surpass understanding, they order the mail slots by office number! Talk about misuse of collation! It makes no sense. I mean, the mail room puts a sticker on everything that they deliver. The sticker has two things on…

  9. Inconsistencies aren't as important when they're irrelevant (2012/09/11)

    …thms for using, displaying, sorting, and storing them. Though there were simpler times, too. I mean, before the Unicode Collation Algorithm defined in UTS #10, no way to sort the character was defined. Every company had their own way to do it themse…

  10. if you see a ZWNBSP in the Release Preview, don't be insensitive and comment it hasn't been eating enough lately! (2012/07/16)

    …hey really claim that they needed their space?. It works like this: The space character, U+0020, is given weight in the collation table. For that matter, so are U+205f (MEDIUM MATHEMATICAL SPACE), U+3000 (IDEOGRAPHIC SPACE), U+1680 (OGHAM SPACE MARK)…

  11. if you see a ZWNBSP in the Release Preview, don't be insensitive and comment it hasn't been eating enough lately! (2012/07/16)

    …hey really claim that they needed their space?. It works like this: The space character, U+0020, is given weight in the collation table. For that matter, so are U+205f (MEDIUM MATHEMATICAL SPACE), U+3000 (IDEOGRAPHIC SPACE), U+1680 (OGHAM SPACE MARK)…

  12. I know I said 'µ' but I didn't really mean 'µ'. I meant 'μ', you know? (2012/04/25)

    …s, which .NET uses by default, will uppercase convert U+00b5 to Μ, aka U+039c aka GREEK CAPITAL LETTER MU. However, the collation tables tell a different story¹, so the three characters are not as interchangeable as one might want: 0x00b5 10 11 2 2 …
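The casing relationship in this excerpt can be checked against the Unicode character data. The following is a minimal sketch in Python, whose `str.upper()` follows Unicode's default case mappings; it is an illustration of those mappings, not a reproduction of the .NET or Windows collation behavior the post discusses.

```python
import unicodedata

MICRO = "\u00b5"       # MICRO SIGN
SMALL_MU = "\u03bc"    # GREEK SMALL LETTER MU
CAPITAL_MU = "\u039c"  # GREEK CAPITAL LETTER MU

# Both lowercase-looking characters uppercase to the same capital letter.
micro_upper = MICRO.upper()
mu_upper = SMALL_MU.upper()

# Lowercasing the capital does not round-trip back to the micro sign.
capital_lower = CAPITAL_MU.lower()

# Compatibility normalization folds the micro sign into the Greek letter.
micro_nfkc = unicodedata.normalize("NFKC", MICRO)

for label, ch in [("upper(micro)", micro_upper),
                  ("lower(capital mu)", capital_lower),
                  ("NFKC(micro)", micro_nfkc)]:
    print(f"{label}: U+{ord(ch):04X}")
```

Casing treats the three characters as closely related, which is why the excerpt's point about the collation tables telling a different story matters.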

  13. I know I said 'µ' but I didn't really mean 'µ'. I meant 'μ', you know? (2012/04/25)

    …s, which .NET uses by default, will uppercase convert U+00b5 to Μ, aka U+039c aka GREEK CAPITAL LETTER MU. However, the collation tables tell a different story¹, so the three characters are not as interchangeable as one might want: 0x00b5 10 11 2 2 …

  14. The oft-repeated 'What version of Unicode do we/will we support?' question, Redux (2012/02/06)

    …ed products, obviously. Though looking just at the Developer Preview of Windows 8, character property and rendering and collation and font and keyboard and algorithm support seem to be aimed primarily at Unicode 6.0. There is obviously time prior to…

  15. Changing the world, 0.1 steps at a time! (2012/02/01)

    …ny with the Unicode Standard, and have updates for Version 6.1. These will be finalized in February: * UTS #10, Unicode Collation Algorithm * UTS #46, Unicode IDNA Compatibility Processing * You can check it out at the Unicode website, here! The res…

  16. Sometimes things are extended in the wrong direction.... (2012/01/31)

    SQL Server's code page, collation, casing, locale, and resource model are all direct attempts to extend the things that Windows provides in ways that mak…

  17. SharePoint and CJK Extensions A, B, C, D, and even E? (2011/12/12)

    So, the question I got the other day was: We are setting up SharePoint and want to know what collation to use. What support does SQL Server have for CJK Extensions A/B/C/D? Now that's an interesting question. If you think …

  18. The evolving Story of Locale Support, part 12: Logic dictates that we keep a sense of proportion about the RATIO (2011/12/09)

    …-- you can look in earlier versions of Windows to see the cost of getting the right glyph in the wrong font! Now from a collation standpoint, they are two different characters, and always have been. I had somebody ask me about that the other day, an…

  19. GetHashCode vs. Compare vs. SortKey (2011/12/01)

    … But the number of actual duplicates is probably pretty small, even as compared to strings that would be "duplicates" by collation criteria (different normalized form, different case in a case insensitive situation, etc.). Plus I don't think the Comp…
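To make the "duplicates by collation criteria" idea concrete, here is a rough Python sketch. The helper `comparison_key` is hypothetical and only approximates the idea with canonical normalization plus case folding; it is not the .NET `SortKey` or `GetHashCode` implementation. The point is that a hash must be computed from the same key the comparison uses, or equal strings land in different buckets.

```python
import unicodedata

def comparison_key(text: str) -> str:
    """Hypothetical key: canonical normalization followed by case folding."""
    return unicodedata.normalize("NFC", text).casefold()

precomposed = "R\u00e9sum\u00e9"     # uses the precomposed e-acute
decomposed = "re\u0301sume\u0301"    # lowercase, with combining acute accents

# The raw strings differ, but their comparison keys (and key hashes) agree.
same_raw = precomposed == decomposed
same_key = comparison_key(precomposed) == comparison_key(decomposed)
same_hash = hash(comparison_key(precomposed)) == hash(comparison_key(decomposed))

print(same_raw, same_key, same_hash)
```

Case folding can occasionally produce text that is no longer normalized, so a production key would need more care than this sketch takes.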

  20. No one ever doubts me when I tell them SharePoint ate my spec! (2011/12/01)

    …ague of mine from the outside who was talking about SharePoint. It actually started with a question -- a question about collations. She figured I might know something about them.... Now since SharePoint is hosted by SQL Server, I was able to help he…

  21. The evolving Story of Locale Support, part 8: [Finally] taking care of some [more] languages in Pakistan (2011/11/15)

    …ort order from the default table, intended primarily for the Arabic language. Even though we knew there was a different collation. The white paper had a purer intent than that: a love of language, and a desire to see the right thing done with the la…

  22. The evolving Story of Locale Support, part 0 (The introduction) (2011/10/31)

    …d World-Ready features, but my view of some of the basic facets of locales and keyboards and formatting and parsing and collation does not allow me to re-purpose the way I talk about these things. It may be a lack of creativity on my part, but it wo…

  23. Every rose has it's Þ.... (2011/10/13)

    …t of context and so on more than half a decade ago). In Vista, bringing FindNLSString brought the full power of Windows collation to the Find/Replace capabilities of Notepad. So all of the various Unicode canonical forms will always be equal and so …

  24. You don't have to hang out with Mary-Kate/Ashley to feel exposed by an Olson connection! (2011/10/10)

    …le in the industry tend to bemoan a common pattern whereby we at Microsoft always have to do our own thing. We do it in collation. We do it in locales. We do it in time zones. For the most part, the reason we do our own thing is that there is nothing…

  25. Locale vs. UI Language, and the downside of flexibility (2011/09/30)

    …format/collate information (SQL Server works under slightly different core principles -- with the split being codepages/collation vs. UI language/formatting). You can probably see some of the confusion in the question relating to what is a part of t…

  26. The SQL app that works fine until you have to support Chinese.... (2011/08/25)

    … "?". For now our SQL server version is US running under Windows server 2008 US. We use default SQL server settings for Collation. As I have never work with Chinese characters yet in SQL I have no idea how to set SQL server properly. Hope you could …

  27. I've got your number! Here's how... (2011/08/22)

    …r Database.... You still have to divine your own "interlaced differentiation", of course. And the really good news? The collation support on Windows already does the work to collate them together! Of course if you aren't using Windows, you can't get…

  28. Once more into the UCS-2 breach, SQL Server marches (2011/07/01)

    … Server that readily support the semantic of supplementary characters as specific linguistic elements.. Like the *_BIN2 collations I mentioned in SQL Server has its own version of .NET "ordinal" comparisons and elsewhere. And in several of the colla…

  29. There's no "I" in IDN, part 5: Stephen Colbert's job is not in any jeopardy (2011/06/29)

    … finger for ignoring the most important distinction in this case -- the implicit Width Insensitive nature of all _C*_A* collations in SQL Server, which could have been simulated by adding a StringComparison.IgnoreWidth to the first call, had their n…
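Width insensitivity means fullwidth and halfwidth forms compare equal to their ordinary counterparts. A rough way to see the effect in Python is compatibility normalization (NFKC). This is an approximation chosen for illustration: NFKC folds more than width (ligatures and superscripts, for example), and it is not how SQL Server or .NET implement width-insensitive comparison.

```python
import unicodedata

def fold_width(text: str) -> str:
    """Hypothetical helper: approximate width folding with NFKC."""
    return unicodedata.normalize("NFKC", text)

fullwidth_sql = "\uff33\uff31\uff2c"   # FULLWIDTH LATIN CAPITAL LETTERS S, Q, L
halfwidth_ka = "\uff76"                # HALFWIDTH KATAKANA LETTER KA

folded_sql = fold_width(fullwidth_sql)
folded_ka = fold_width(halfwidth_ka)

print(folded_sql, f"U+{ord(folded_ka):04X}")
```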

  30. Whither アイヌ・イタㇰ (Ainu)? (2011/06/07)

    … The BCP-47 tag for Ainu in this case would likely be ain-JP, and a custom locale could be created for it quite easily. Collation may already work (Kana works in all locales -- and the Katakana Phonetic Extensions are covered with the rest of the Ka…

  31. Every character has a story #33: GREEK ANO TELEIA (U+0387) (2011/05/20)

    …tations must often make assumptions that may be invalid for some text. At Microsoft, it has some interesting issues: In collation: It was not even in the collation tables at all until Vista (meaning it had no weight and couldn't be found), and once th…

  32. One of my colleagues is the "Pseudo Man" (a rich source of puns in conversation!) (2011/04/11)

    …or Office". They built it atop Turkmen (Turkmenistan), aka tk-TM. This led to some interesting consequences with the collation of strings due to the way Turkmen sorting works -- a fact that was reported as a bug several times during both Vista an…

  33. Rupee! Rupee! Let down your CHAR! (2011/03/22)

    …appy people were looking!). Now there are a few things that weren't updated -- like the data behind GetStringTypeW, the collation data, the name that would show up in Character Map, and so forth -- there is really only so much that can be done in be…

  34. Are they different? It's all Greek to me! (2011/03/01)

    …at they actually represent slightly different traditions of showing the same Polytonic Greek text, and when it comes to collation, the benefit of finding the different traditions when looking for Greek words that use the different traditions is very…

  35. the importance of symbols and punctuation (2011/02/17)

    …ay be interesting or important. We could all do with a little bit less of that. It's one of the reasons the support for collation in Windows is so unsuited for search - because its only choices are to only see with the distinctions or to ignore all …

  36. Māori is a language from one of my favorite places in the world! (2011/01/26)

    …al, and “wh” which is nowadays mostly pronounced like an English /f/. MICROSOFT-SPECIFIC INFORMATION: The Windows Māori collation tables include not only the commonly used ā ē ī ō ū, but also the less commonly used alternatives (à è ì ò ù and ä ë ï …

  37. I agree with you 100%. But we're both wrong (according to the spec) (2010/12/22)

    …lightly different reasons. :-) Michael S. Kaplan on 23 Dec 2010 4:34 AM: Also, Java does not give U+fffd zero weight in collation, so you have to do the last two tests with some other character that Java treats like nothing is there.... Cheong on 27…

  38. You can't ignore crap and hope it won't cause problems... (2010/12/16)

    …ation team. BUT.... And this is a huge BUT, in my opinion! If you look at what Unicode does here in its UTS 10: Unicode Collation Algorithm, in its allkeys.txt:: 0000 ; [.0000.0000.0000.0000] # [0000] NULL (in 6429) 0001 ; [.0000.0000.0000.0000] # […
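The `allkeys.txt` line quoted in this excerpt has a simple shape: code point(s), a semicolon, one or more bracketed collation elements, then a comment. The sketch below parses that shape in Python. The function name `parse_allkeys_line` and the returned tuple layout are assumptions made for illustration; only the line format comes from the excerpt and the published file.

```python
import re

# Code points, then ';', then one or more [.wwww.wwww...] or [*wwww...] elements.
LINE_RE = re.compile(r"^([0-9A-Fa-f ]+?)\s*;\s*((?:\[[.*][0-9A-Fa-f.]+\])+)")
ELEMENT_RE = re.compile(r"\[([.*])([0-9A-Fa-f.]+)\]")

def parse_allkeys_line(line):
    """Return (code_points, elements), or None for comments and directives."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    code_points = tuple(int(cp, 16) for cp in match.group(1).split())
    elements = []
    for marker, weights in ELEMENT_RE.findall(match.group(2)):
        is_variable = marker == "*"  # '*' marks a variable collation element
        elements.append((is_variable, tuple(int(w, 16) for w in weights.split("."))))
    return code_points, elements

sample = "0000 ; [.0000.0000.0000.0000] # [0000] NULL (in 6429)"
parsed = parse_allkeys_line(sample)
print(parsed)
```

An entry whose weights are all zero, like the NULL line above, is completely ignorable, which is the behavior the post is contrasting with Windows.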

  39. I swear the Romanian bug is fixed; it was fixed 4.5 years ago! (2010/12/13)

    …ters. EVER. There are updates to fonts to display them, you can use MSKLC to create keyboards for them. But if you want collation and casing to work, you have to either: settle for the cedilla-below letters, or upgrade to Vista or Server 2008 or Win…

  40. Perhaps not evil, but certainly getting hella snarky (2010/11/23)

    … of the typical examples in this space where Microsoft does one thing and the industry does something else: Microsoft's collation support vs. the Unicode Collation Algorithm; Microsoft's time zone support vs. the Olson data; Microsoft's locale suppo…

  41. I [will have] told you so! Well, perhaps too late (all things considered)... (2010/11/09)

    …. The Blog you are reading now had just a few blogs in it. And I wrote a blog titled Microsoft does not use the Unicode Collation Algorithm. The year was 2008. Thousands of blogs had been added to this Blog since that earlier blog. And I wrote a blo…

  42. I can't see Nunavut from my house, either; but they can see the Inuktitut LIP from their computer! (2010/10/13)

    … nonsense see here for example). Despite the similarities between Inuktitut and Greenlandic, they do not share a common collation. In the very first Inuktitut (Latin) locale in Windows, a version of collation that matched the Inuktitut syllabary was…

  43. Unicode 6.0.0 is [virtually] released! (2010/10/12)

    …html. Have at it.... :-) But note the information I started this blog with: for new characters and scripts, there is no collation or font or property support yet. We don't have the ability to predict the future! Pavanaja U B on 12 Oct 2010 7:25 AM: …

  44. Unicode 6.0.0 is [virtually] released! (2010/10/12)

    …html. Have at it.... :-) But note the information I started this blog with: for new characters and scripts, there is no collation or font or property support yet. We don't have the ability to predict the future! comments not archived…

  45. Kurdish with an Arabic Accent? (2010/10/01)

    …bic with a Kurdish Accent?. However, this blog has nothing to do with accent and is really focused on issues of script and collation. It was a little over 16 months ago in my The Whey doesn't get a locale, either blog that I responded to the It is Time…

  46. If case conversion were harder, people would do it less (2010/09/26)

    It was just last night that I got the following mail: Hi. I have seen your blogs regarding invariant culture, collation, etc and they are very, very useful. Thanks! Also, if you don’t mind, I do have a question :-). Do you know why the res…

  47. Olive, the other reindeer, gets to Sort it all Out too.... (2010/09/13)

    …andards. (Among other things, he gave us some data about how the Cyrillic script versions of Sami work to help with our collation efforts. I'll talk more about this another day....) It indirectly has to do with a particular issue covered in that A&P…

  48. Refusing to ignore some particular character's width isn't [always] an act of discrimination… (2010/09/07)

    …death, practically. I felt like I had to, really. Because this feature (the WORD sort) is the default in the linguistic collation functions, but is not very well understood. By anyone. Now another point I have made in the past like in #11 of How to …

  49. On Bengali sorting (where an old part of my personal life mirrors an even older bug in Windows) (2010/09/02)

    …the correct language; And still other people expect the text to be sorted correctly in Access or SQL Server even if the collation was not set correctly. Now with both examples clearly in Excel, I assumed that problem #2 was involved, or ma…

  50. It would be like spelling it Anerica or something. (2010/08/17)

    Now in the past I have talked about Microsoft's relationship with the Unicode Collation Algorithm, in blogs such as: Microsoft does not use the Unicode Collation Algorithm in 2004 Microsoft still does not us…

  51. I swear the Latvian bug is fixed; it was fixed 4.5 years ago! (2010/07/13)

    … happening for quite literally years. People were complaining about it for years, I mean. It was to do with the Latvian collation. In Windows, in SQL Server, in .Net, in Office applications in general, in Access in particular. Which basically means …

  52. Kazakh it to me, aka On being small and unique (2010/06/29)

    … about behavior. In sorting, not my personal behavior! As I have mentioned, most people forget that I haven’t owned the collation code in Windows for years now. Though I suppose the fact that I keep answering the questions anyway kind of means it’s …

  53. Call it Reversible Error, aka Yes it has no weight; it was supposed to have no weight! (2010/06/11)

    .Net globalization's collation lives in an unusual world. It was originally architected by a developer in Windows based on the detailed design doc wri…

  54. WORD SORT...Why'd it have to be...WORD SORT? (2010/06/10)

    … give one the choice between word sorts and string sorts -- it is always a word sort, except when it is a _BIN or _BIN2 collation, in which case the hyphen and such will be in a string sort-ish kind of situation (along with everything else that chan…

  55. Vietnam or Viet Nam or Việt Nam or ??? (2010/06/04)

    …rticular they mean good rendering support of the text, with appropriate diacritics and tone marks. They didn't find full collation support (across all Unicode normalization forms) to be as important though they were amused at some of my previous cove…

  56. The report of the need for a Uyghur hotfix may be an overstatement (2010/05/28)

    …ng a new keyboard with the correct letter. Not perfect to my way of thinking, mind you, since they chose not to add the collation equivalencies I suggested. But they may have actually considered that and gotten feedback to not do it. So that may hav…

  57. The inappropriate nature of getting the Feh out of Uighur, Windows 7 edition (2010/05/25)

    …k like: And there you have it, they fixed it. They didn't take my advice about the equating those two characters in the collation tables, to help with the migration of any data that might have been typed by anyone in Vista (or from anyone who upgrad…

  58. I can't actually see Chennai (or any other city in Tamil Nadu) from *my* house....so let's talk LIPs! (2010/05/09)

    …part of the Gov team. My suggestions were- Make all products open architecture wherein end-user can create his own LIP, collation table, etc. and use them. This concept is prevalent in Linux Regards, Pavanaja Jan Kučera on 13 May 2010 1:43 PM: ...I …

  59. Dude! Not so Lao'd! (2010/05/06)

    …ect here. Perhaps I should give a little background, though. If you go back to almost the very beginning of the Unicode Collation Algorithm until version TR10-11 released in January of 2004 around the time of Unicode 4.0, it contained text like the …

  60. [Unicode Announcement] Call for Participation: IUC 34, Oct 18-20, 2010 (2010/04/30)

    …so you know.... I might see if I can submit a talk or three myself, perhaps on some of the things I am doing related to collation, or language, or Tamil, or Bidi, or localizability, or keyboards, or one of the many other things I find myself doing t…

  61. "Does my buttload look too big for that stream?" (from the Tales of the "That's what she said!" files) (2010/04/28)

    …orces outside of us to see the work done. Forces inside Tibet, for instance.... Note that the current implementation of collation in Windows does not allow a compression (i.e. a UCA contraction) of more than eight UTF-16 code units, which means that…
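The limit in this excerpt is stated in UTF-16 code units, not characters, so supplementary characters count twice. A small Python sketch of that counting follows; `fits_compression_limit` is a hypothetical helper, and the value 8 is taken from the excerpt rather than from any Windows header.

```python
MAX_UNITS = 8  # limit quoted in the excerpt above

def utf16_code_units(text: str) -> int:
    """Count UTF-16 code units; characters outside the BMP take two."""
    return len(text.encode("utf-16-le")) // 2

def fits_compression_limit(sequence: str) -> bool:
    """Hypothetical check that a sequence stays within the quoted limit."""
    return utf16_code_units(sequence) <= MAX_UNITS

tibetan_ka = "\u0f40"  # TIBETAN LETTER KA, a single BMP code unit
print(utf16_code_units(tibetan_ka * 8), fits_compression_limit(tibetan_ka * 9))
```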

  62. Look out Maharashtra, the Marathi LIP is now available! (2010/04/26)

    …ns done by Cathy or me or both of us: But even though we both cited this difference, it was not captured in the Windows collation tables until Vista (in all prior versions the Indic tables were always combined and only Hindi among the Devanagari sc…

  63. You can't get this particular bit of proverbial toothpaste back into the tube (2010/04/20)

    …icrosoft claims to support Unicode, how can it not put the equivalence between Unicode Normalization Forms C and D into collation for Korean? At first I was not sure how to respond. I mean, I feel like I had answered this question before in blogs li…
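The Form C versus Form D question for Korean is easy to see with a single Hangul syllable, which decomposes canonically into conjoining jamo. A minimal Python sketch follows; `canonically_equal` is a hypothetical helper showing the equivalence the question asks about, not how Windows collation compares the two forms.

```python
import unicodedata

syllable = "\ud55c"  # HANGUL SYLLABLE HAN (Form C)
decomposed = unicodedata.normalize("NFD", syllable)  # three conjoining jamo

def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: equal once both sides are in the same normalization form."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

print([f"U+{ord(ch):04X}" for ch in decomposed])
print(syllable == decomposed, canonically_equal(syllable, decomposed))
```

A code-point comparison sees two different strings; only a comparison that accounts for canonical equivalence treats them as the same text.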

  64. Sing. Sing a song. Sing it Lao'd (just in case the sort's still wrong!) (2010/04/17)

    …-- it may also be right under the "hundreds of wrong answers that give you a right behavior" principle that table based collations can sometimes bring to the mix, which are unsatisfying to linguists since hundreds of wrongs shouldn't make a right, b…

  65. OS level support for non-Gregorian calendars? (2010/03/31)

    Regular reader and collation hero of days past Santhosh Pillai asked over in the Suggestion Box: Question: Does Windows 7 provide OS level support f…

  66. Learning to spell in Bengali (when one has a cool input method) (2010/03/23)

    …anguage created using Open Type 'ligq' features. Otherwise, it gracefully falls back to Latin-1, which is readable too. Collation happens easily (I extended JavaScript sort() for it). This is a much simpler solution than the struggle we are going th…

  67. Thus the problems resist solution, and the workarounds are often inadequate (2010/03/15)

    …compressions -- Hungarian goulash? and Hungarian is even more complicated than I thought). I mean, the need to consider collation differently depending on whether one is asking the which comes first? question or the are they equal? question is a har…

  68. Coloring outside the lines in the a-ness of the Hungarian Technical Sort (2010/03/09)

    … Hungarian double compressions, either The uppercase letters come before the lowercase ones, unlike most other language collations on Microsoft products But there is a fourth characteristic that I did not mention: All of the letters with diacritics …

  69. Burn Windows Burn (aka If we want to unsay *this* one, we cannot say "Mu") (2010/03/06)

    …ally what is happening is that the letter Μ (under Vista) is being sorted *after* the small letter μ Now all linguistic collations on Windows and .Net other than the Hungarian technical sort (described in Technically it *is* a hungarian sort) sort w…

  70. The inappropriate nature of getting the Feh out of Uighur, Windows 7 edition (2010/02/25)

    …k like: And there you have it, they fixed it. They didn't take my advice about the equating those two characters in the collation tables, to help with the migration of any data that might have been typed by anyone in Vista (or from anyone who upgrad…

  71. Knock knock! Who's there? Kana! Kana Who? I Kana got something wrong! (2010/02/17)

    …his blog involved a huge discussion with a lot of different people about the terminological, technical, linguistic, and collationary features of Kana in Japanese, and was finally reviewed by people both here and in Japan. Unfortunately, it was kinda…

  72. The real problem(s) with all of these console "fallback" discussions (2010/02/15)

    …l en-us fallback is such a bad match for the rest of the world with different decimal separators/day-month order issues/collation support/etc. that you will likely create as much confusion without the question marks as with them. Perhaps this the fo…

  73. The two most important components of letting go and moving on are (1) letting go and (2) moving on (2009/11/07)

    It was Friday afternoon when Santhosh (Santhosh Pillai, aka THE Santhosh, the guy who helped us with the collation story for Malayalam way back when) was asking a question. The question was: Hi: Is there an updated version of this pag…

  74. What is impossible for Microsoft can be simply undesireable for Unicode (2009/09/14)

    Sometimes an implementation makes a certain feature impossible. Like the way Microsoft does collation, in particular the way its DEFAULT table is implemented (a flat DWORD table for everything 0x0000 to 0xFFFF) means that…

  75. Turkish Character in Directory Name Hung Windows NT 3.5, aka small enough to be internationally stupid (2009/08/12)

    …n the KB article). Then there was another one around the Windows XP timeframe. The Turkish casing results were added to collation and Internet Explorer was broken in some horrendous way. Well, not horrendous as compared with being unable to boot Win…

  76. Every character has a story #33: U+1e9e (CAPITAL SHARP S, Microsoft edition - Part 2) (2009/07/29)

    …verage user knows there is no such letter anyway. Never mind, the lack of a way to input the letter is okay.... Part 2: COLLATION Well, if you go back to blogs like Dere are qvestions? In zat case... and the one that started it all What the %#$* is …

  77. Every character has a story #32: U+1e9e (CAPITAL SHARP S, Microsoft edition - Part 1) (2009/07/28)

    … 0x10280 Carian 0x102A0 Lydian 0x10920 Mahjong Tiles 0x1F000 Domino Tiles 0x1F030 But what's more important is that the collation algorithms seem to process "ẞ" right. At least in explorer with filenames. Regards, Peter # Michael S. Kaplan on 28 Jul…

  78. In search of the Swedish Tipping Point.... (2009/02/18)

    …e a 'double v'?, where I talked about the Swedish Academy's change to the way the letters W and V were to be handled in collation, and the impact on Microsoft software when this change eventually makes it to the point where it needs to be integrated…

  79. The road to hell is paved with attempts at being compatible (2009/02/04)

    In one of the very first blogs I wrote, I pointed out that Microsoft does not use the Unicode Collation Algorithm. Believe it or not, at the time some people actually asked me whether I thought I might get in trouble for th…

  80. Where's Waldo^H^H^H^H^HMichael? (aka It probably wasn't worth the wait and yet you waited!) (2009/02/02)

    …st.... Third of all, this Blog has a flaw in it. It starts with the name. Sorting it all Out. Now taking my interest in collation and moving it to a multi-level pun is to my way of thinking kind of cool. Not as cool in my opinion as the WinFS Team B…

  81. Sorting "Collate" all out (2008/12/06)

    It was recently suggested to me that the terms COLLATION and COLLATE are confusing to people. Not people over here, I'd guess. And probably not so much the people who work in S…

  82. "If you haven't had sex is six months..." the IM read (2008/12/05)

    …IGITS_AS_NUMBERS kind of flag is the kind of thing I'd hypothesized if StrCmpLogicalW were folded into the built-in NLS collation functions, something I hypothesized way back in 2005 when I talked about What is up with number sorting? and related bl…
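The "digits as numbers" idea is that runs of digits compare by numeric value instead of character by character. Below is a rough Python sketch of such a sort key. The name `digits_as_numbers_key` is hypothetical, only ASCII digits are handled, and nothing here claims to match what `StrCmpLogicalW` or the NLS functions actually do.

```python
import re

def digits_as_numbers_key(text: str):
    """Hypothetical sort key: ASCII digit runs compare numerically."""
    parts = re.split(r"([0-9]+)", text)
    key = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd positions are the captured digit runs.
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part.casefold()))
    return key

names = ["file10.txt", "file2.txt", "file1.txt"]
plain_order = sorted(names)
numeric_order = sorted(names, key=digits_as_numbers_key)
print(plain_order)
print(numeric_order)
```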

  83. I İ ı i before Ѐ ѐ unless you ask Y y ʏ (2008/11/26)

    …mponents like NTFS (the file system) or the registry. At the lowest levels they happen to use an ordinal/binary kind of collation, while at the most user visible levels (e.g. Windows Explorer, RegEdit) they use a linguistic one. The rule is that cir…

  84. Azeri serious bug for non-Unicode SQL Server columns? (2008/11/25)

    …d to run into some of them just in the act of walking around! They were telling me about an interesting bug in a couple collations in SQL Server 2008 -- bugs that only impact non-Unicode columns for those couple of collations. Apparently they got a …

  85. From I SCOOT to IBOT, #5 of ?? (sometimes it is phase 3 that is ????) (2008/11/24)

    …lier version of SQL Server to a later one). Plus I got many questions about Unicode (especially both UTF-16 and UTF-8), collation support in 2000/2005/2008, and also rich text support in SQL Server reporting services (drawing some on my typography k…

  86. When features collide (aka Your LCID sucks, but sometimes the bug sucks more) (2008/11/14)

    …e in red. Basically, the NORM_LINGUISTIC_CASING flag feature added in Vista does not work if you use the name-based NLS collation API functions added in Vista. Not as bad as the whole IsSortable() == false? Well, sometimes it may be lying.... situat…

  87. Trying to ignore the small stuff is harder, if you're Arabic (2008/11/11)

    …I tried with Latin1_General_CI_AI and Arabic_CI_AI and got the same result. My questions : is there a way to add my own collation to a SQL 2005 server. Or is there a collation just ignoring *all* diacritics for every UNICODE character ? And why does…
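What the question asks for, ignoring all diacritics for every character, can be approximated outside the database by decomposing text and dropping nonspacing marks. The Python sketch below does that; `strip_marks` is a hypothetical helper, and it does not describe what the `_AI` collations in SQL Server do internally.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Hypothetical helper: decompose, then drop nonspacing marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

latin = strip_marks("r\u00e9sum\u00e9")
# KAF, TEH, BEH, each followed by a FATHA vowel mark.
arabic = strip_marks("\u0643\u064e\u062a\u064e\u0628\u064e")

print(latin, [f"U+{ord(ch):04X}" for ch in arabic])
```

Comparing the stripped forms treats vocalized and unvocalized Arabic text as equal, at the cost of also erasing distinctions some languages consider significant.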

  88. The sort order of the Language Bar (and Michael is in heaven on this one, other than...) (2008/11/03)

    …two in there. These days it does seem like the two subject areas I do the most advisory work in are still keyboards and collation, despite the fact that my actual job is so very different now. So it is interesting to get a question involving changin…

  89. What's the shape of the sort? (2008/11/01)

    …edge cases (since a SQL Server is involved) that you'd want SQL Server type comparisons, which would be dependent on the collation of the SQL Server, either linguistic or the SQL Server notion of binary, which has no case ignoring facility. Am I the …

  90. How the @#%&*! does CBS_SORT choose to sort it all out? (2008/10/29)

    …e customer? Many thanks in advance This question is interesting on multiple levels, actually. First of all, it is about collation, which is always interesting to me. :-) Then second it is interesting for the solution, which is of course to take the …

  91. UCS-2 to UTF-16, Part 5: What's on the Next Level? (2008/10/15)

    …r the third category which I named as sort elements, since these are the kinds of things that would also tend to impact collation operations. Though the fact that a "collation dude" like me would call them sort elements while a "font dude" like Pete…

  92. UCS-2 to UTF-16, Part 4: Talking about the ask (2008/10/06)

    …the second and third categories (and if you include SQL Server 2008, the first category as well) is well handled by the collation support used by the engine for almost all of its comparison operations, even the ones it should not! This schizophrenic…

  93. A lot of problems to enumerate... (2008/09/22)

    …RIANT", but "LOCALE_INVARIANT" b) comparing binary key strings like "{\rtf" does not make sense using CompareString and collational compare - such a comparison has no meaning for the binary stream that an RTF stream is. c) why CompareString does …

  94. Sorting the DPRK all Out (2008/09/19)

    …n a life time opportunity to know something that about language/script that he did not -- the North Korean/South Korean collation difference! :-) Many sources talk about the issue, though perhaps the clearest is in Chapter 9 (Information Processing …

  95. UCS-2 to UTF-16, Part 3: It starts with cursor movement (where MS simultaneously gets better and worse) (2008/09/18)

    …iders sort elements, where there is no easy function to call that finds those boundaries. The underlying data exists in collation algorithms (for example Microsoft's and the UCA's) and is used in order to define the sorting behavior of those element…

  96. What version of Unicode should we support? (2008/09/06)

    … to the planning process by which Microsoft or any company chooses what languages to add locale or rendering or font or collation or formatting or parsing or word breaking support for isn't like comparing apples and oranges. It's like comparing ap…

  97. Collation backstory? (2008/08/28)

    …k.... Peter O. asks: Hi Michael, I read your stuff ... and learn lots. I have been wanting to learn background stuff on collations as a general subject. Can you point me to a succinct but clear book? Thanks. P I don't actually know of any good books…

  98. Making SQL Server operations slower (without explicitly trying) (2008/08/26)

    …mentioned in Making SQL Server index usage a bit more deterministic. Basically you have the interesting case where some collations are basically identical. Like if you are dealing with Unicode columns and you use two collations that are only differe…

  99. The super-cool panel about Windows, .NET, and SQL Server -- now live! (2008/08/19)

    …volved with while I was over at TechEd earlier this summer. The title? Internationalization and SQL Server: Sorting out Collations between Windows and .NET The title might sound difficult to navigate, but it isn't -- just plug in your high quality p…

  100. You must have heard wrong, Jesse\ I don't know about tailoring\ But about the algorithm Jesse\ That is used by Microsoft... (2008/07/06)

    …Jesse Hallam asks: Hi Michael, I've been scouring the net for some discussion of how one goes about tailoring a default collation table. Specifically, how does one correctly re-weight the table? ICU does it, but doesn't do a particularly good job of…

Showing results 1—100.