Sorting it All Out: Search Archives

Sorted descending by date published.


  1. Some things were -SO- much more impressive to know before Google made them -SO- easy to learn (2015/08/28)

    Technically the same point applies to Bing, but seriously, who are we kidding here? …filed under Collation/Casing because if Google can co-opt my passion about collation with their Alphabet re-org, then clearly their passion is the one…

  2. You need to dot every İ, not dot any I, dot every i, not dot any ı, and cross every t in Turkish (2013/04/04)

    …d! I'll add a few blogs of mine for additional supporting material and fun stories: The [Upper]Case of the Turkish İ (or: Casing, the 2nd) My first (bloggers) geek dinner and the sequelae, at Orlando TechEd 2005 "It makes total sense when you explain…

  3. I know I said 'µ' but I didn't really mean 'µ'. I meant 'μ', you know? (2012/04/25)

    … had seen before. They were looking for µ, aka U+00b5 aka MICRO SIGN. And unhappy that regular expressions that were uppercasing the text couldn't find the character again later. Of course they were assuming it was μ, aka U+03bc, aka GREEK SMALL LETT…

  5. The Truth was unexpected at this time (2012/02/24)

    …r talking to Mark Zbikowski a couple of times when he was still at Microsoft. Once was to get his opinion on updating the casing table, which had by that point grown to over 200 letter pairs missing from it that were assigned in Unicode. And the othe…

  6. Sometimes things are extended in the wrong direction.... (2012/01/31)

    SQL Server's code page, collation, casing, locale, and resource model are all direct attempts to extend the things that Windows provides in ways that make sense fo…

  7. There's no "I" in IDN, part 10: Who needs IDN support? How much? When? (Part 2) (2011/09/21)

    …col doc: 5. If the NetBIOS name from the LMHOSTS file is less than 16 bytes in length, pad the name with spaces, and uppercasing all characters within the ASCII range which results in ComputerName. Note that the letters A-Z are changed by an UPPERCAS…

  8. It's like a lower class of Lowercase... (2011/04/03)

    … whose General_Category is already "Lm". This change for the General Category property implies some changes for dependent casing properties. In particular, in order to keep the derived Lowercase property values unchanged, each of the twelve character…

  9. Turkmen! (for both Turkmen and women of Turkeminstan) (2011/02/23)

    …was intentional or not, but despite being based on the Turkish alphabet, the Turkmen locale on Windows does not do Turkic casing. If this is wrong, someone should tell us so we can fix it some day. Enjoy! John Cowan on 23 Feb 2011 8:33 AM: No, it's c…

  10. I swear the Romanian bug is fixed; it was fixed 4.5 years ago! (2010/12/13)

    …There are updates to fonts to display them, you can use MSKLC to create keyboards for them. But if you want collation and casing to work, you have to either: settle for the cedilla-below letters, or upgrade to Vista or Server 2008 or Windows 7 or Ser…

  11. Math is hard, let's do keyboards...that do math. (2010/11/29)

    … results in this blog. I did notice a few bugs in MSKLC itself that have never (as far as I know) been reported: The "AutoCasing" behavior that hooks up CAPSLOCK and shift state doesn't work with supplementary characters; When typing U+####[#][#] char…

  12. Kurdish with an Arabic Accent? (2010/10/01)

    …hnology and of sketchy manual labor. Now in the Latin script, Kurmanji does not have Turkic rules about either letters or casing. So perhaps the English sort or something like it is good enough for them -- or perhaps they are just using the thing mos…

  13. If case conversion were harder, people would do it less (2010/09/26)

    …e tr-TR is not in the default collation table? Well, there is a little confusion here since the methods involved refer to casing yet the question is about collation. And as I have mentioned before, Collation != Case (a.k.a. Collation <> Case). Though…

  14. It would be like spelling it Anerica or something. (2010/08/17)

    …retend you knew what I was getting at all along. :-) It is leaving the data alone and not screwing with it. Thus not uppercasing or lowercasing text just because you wanted to ignore case (and losing the information) and not normalizing to Unicode No…

  15. [Pretty much] All the things you can't do with SGCAPS, and why (2010/07/07)

    …ps-lock mode. Whereas the sane thing to do would have been to have a setting [maybe a "2" vs "1" as it is now] to use the casing table rather than the shift state to get the caps lock version of a key. Maybe even do it across the whole keyboard, for …

  16. Kazakh it to me, aka On being small and unique (2010/06/29)

    …s usually in the other direction; these EIGHT characters may be the only four case pairs whose relationship exists in the casing table but not in the collation table (only for Kazakh). If it is really true maybe there should be a Kazakh-specific entr…

  17. It would save a lot of time if I just stopped being right, aka About scuba diving, aka HELLo! (2010/06/22)

    …about to ship, etc. Like the problem I discussed in I know I'll Never say Never... again, at least with the change in the casing table with ς, aka U+03c2 (GREEK SMALL LETTER FINAL SIGMA). As predicted, nothing happened. A cow made a satisfied grunt s…

  18. Knock knock! Who's there? Kana! Kana Who? I Kana got something wrong! (2010/02/17)

    …various flags I get the following NORM_IGNORECASE: U+30e3 < U+30e4 LINGUISTIC_IGNORECASE: U+30e3 < U+30e4 NORM_LINGUISTIC_CASING: U+30e3 < U+30e4 NORM_IGNORENONSPACE: U+30e3 = U+30e4 LINGUISTIC_IGNOREDIACRITIC: U+30e3 < U+30e4 So, at least on Win2k8,…

  19. The i of the Turk, and the Turkey test (2010/01/14)

    …know the history of why Unicode didn’t create separate characters for the Turkish i? The i is the only character that changes casing based on the language (Turkish/Azeri). I did a little searching online, but didn’t find any obvious references to the rat…

  20. Fastest? Perhaps. Wrongest? Certainly! (2009/10/22)

    ….. It is a matter of where/how you aim expectations. You see, starting in .Net 2.0 (and subsequent versions), "invariant" casing actually means "use the operating system casing results" and also "don't do those weird linguistic results". Thus just us…

  21. Turkish Character in Directory Name Hung Windows NT 3.5, aka small enough to be internationally stupid (2009/08/12)

    …either way (just like the one in the KB article). Then there was another one around the Windows XP timeframe. The Turkish casing results were added to collation and Internet Explorer was broken in some horrendous way. Well, not horrendous as compared…

  22. I know I'll Never say Never... again, at least (2009/07/30)

    So, thinking about consequences of the CASING piece of Every character has a story #33: U+1e9e (CAPITAL SHARP S, Microsoft edition - Part 2). And ignoring that we didn…

  23. Every character has a story #33: U+1e9e (CAPITAL SHARP S, Microsoft edition - Part 2) (2009/07/29)

    …et to part 3. Part 3: CASE The wheels are gonna come off the wagon a bit here, I will admit. Basically, it seems that the casing relationship between ẞ and ß was added, but only in the linguistic tables, and only to say that ẞ lowercases to ß. This m…

  24. Every character has a story #32: U+1e9e (CAPITAL SHARP S, Microsoft edition - Part 1) (2009/07/28)

    …and Mac OS) do. That is what the (German) users expect. # Michael S. Kaplan on 30 Jul 2009 1:10 PM: Microsoft does simple casing here too. I am willing to bet that within five years they will want simple (1 to 1) mappings to use the Capital Sharp S. …

  25. In search of the Swedish Tipping Point.... (2009/02/18)

    …009 10:56 PM: Hello Michael, A little off topic here! I spent practically the whole day trying to find the Windows default casing/collation tables that you talk about frequently on your posts. Where are these tables? Are they located in some file in th…

  26. I İ ı i before Ѐ ѐ unless you ask Y y ʏ (2008/11/26)

    …way one looks at I İ ı i U+0049 U+0130 U+0131 U+0069 and the fact that sometimes the first and fourth items there are the casing pair, and other times it is the first/third and second/fourth that are. You can imagine plugging those strings right into…

  27. When features collide (aka Your LCID sucks, but sometimes the bug sucks more) (2008/11/14)

    …entitled New in Vista Beta 1: more use of the word 'linguistic', which described (among other things) the NORM_LINGUISTIC_CASING flag -- a flag to do proper casing for Turkic languages. Turns out there is a problem getting these two features to work …

  28. What's the shape of the sort? (2008/11/01)

    …å (U+00e5, aka LATIN SMALL LETTER A WITH RING ABOVE), as I explain in an earlier blog (CompareString ignores case by lowercasing). But if on the other hand the code page in question is Windows code page 1255, then in all the above cases if you have t…

  29. The difference between Six Sigma and Sigma Diaresis is one must never fail; the other seems to do so by default (2008/09/29)

    …more to do with the fact that there was nothing there to make them shape more harmoniously together (ref: Sometimes, uppercasing sucks), and that maybe if the right glyphs were added here then everyone would be okay with the diacritics. You could the…

  30. What's in a name? (2008/08/08)

    …be forced to go by Degraw due to a system not remembering the case of letters in the name -- which sucks -- or worse titlecasing -- which also sucks). And then there are readers of this blog like Gé van Gasteren and Jeroen Ruigrok van der Werven, bot…

  31. Seeing the tears, my heart went out to her as I asked her "Why the Long S?" (2008/06/25)

    Over the last few years, quite a few of my blogs have mentioned the LCMAP_LINGUISTIC_CASING flag for LCMapString: December 2004: The [Upper]Case of the Turkish İ (or: Casing, the 2nd) December 2004: What does "lin…

  32. Reliving life in the time before ASCII? (2008/06/14)

    … of doing things in Unicode, whether one is thinking about normalization or combining classes or canonical equivalence or casing or properties -- really anything -- you can look at Unicode and see many of the wear marks as you see things done one way…

  33. On the fear of [consequences of ]rejection (2008/05/25)

    …ful assumption. A good example came through the other day: Reviewing the info on the Unicode website, we see different up-casing rules, based on locale (i.e. Greek, Hungary). We also know customers can install different MUI packs on Windows and chang…

  34. İn tıtlıng thıs ınclusıon ın re: the ınterests of Turkısh İSVs, am İ just tryıng to buıld İ's and ı's ınto the tıtle of thıs daıly contrıbutıon to SİAO (SıaO), amıgo? (2008/05/12)

    … -> Turkish_Turkey.1254 ığüşiöç ığüşiöç IĞÜŞİÖÇ IĞÜŞİÖÇ The issue boils down to the very simple fact that the C runtime's casing functions are being used underneath this code, and LCMapString is being called underneath that. They are passing the Turk…

  35. Kind of ironic how Germany seems so okay with Capital *Letter* punishment, huh? (2008/04/15)

    … otherwise, the invention of letters that do not actually exist is quite powerful, as is the decision to ignore intuitive casing behavior or make unrealistic case mappings. Unicode has been doing it for some time and they seem pretty popular. The who…

  36. Newer, stronger, more case pair stability! The world's first 5.1 million dollar character encoding standard! (2008/02/29)

    …see, this was done largely at the request of Microsoft. It was really due to the fact that both Unicode and Microsoft had casing stability rules that were not entirely compatible, a fact that could easily lead to future problems with Microsoft moving…

  37. The idea has to do more than just make sense to me (aka How S-Sharp are *you* feeling today?) (2008/02/24)

    …at in the collation tables, U+00df and U+1e9e would simply be case variants, with no real choice in the matter. And as to casing.... Now just because we make the relationship in casing does not mean we make it in collation. After all, as I have point…

  38. No upproblems with $UpCase when you do a upVista upgrade (2007/12/27)

    …desired, the question shows up in the Suggestion Box: You wrote ages ago (in August 2005) New in Vista Beta 1: Updated OS casing tables This mentions of $UpCase, the file containing upper case conversion table(s). Now consider, I have Windows XP inst…

  39. No way to get that script info I was looking for earlier (2007/12/05)

    …group that owns the class is the one most likely to be watching for Unicode updates! In truth, RegEx uses the .NET and OS casing tables and the property data, which is updated to Unicode 5.0 right now in .NET >= 2.0 and Vista, so this mixed behavior …

  40. All things being equal, your mom probably has an easier time with case insensitivity than not (2007/10/26)

    …and create case discrepancies, and it is very hard to train them not to do that. (I am learning how to enforce consistent casing through the tool chain, but it is all by being obsessive about it.) In the few cases where there is no single spelling th…

  41. Jokes that aren't really all that funny in the end (aka At least SQL Server isn't on our case) (2007/10/25)

    …ects the actual data.... It is for this reason that they have always carried their own data around for both collation and casing operations, and have since Sphinx (SQL Server 7.0) through two different collation models (Sphinx and post-Sphinx) and on…

  42. Getting the preserved case of a file wHeN tHe uSeRs tYpE wHaTeVeR tHe hElL tHeY wAnT tO (2007/10/24)

    …tXt” and the file on disk is stored as “foo.txt”. I can certainly open the file using “FoO.tXt” or “foo.txt” or any other casing I choose. But how do I get the original casing? The solution can be found in a call to System.IO.Directory.GetFiles(Strin…

  43. In Case you have problems that you might think are ǸȦȘȚȲ (2007/10/24)

    …jump drives that are 500mb each, and format them both with NTFS (one in XP and one with Vista -- the one with the updated casing table).... Now we will take those ten characters above and combine them in various five character blocks, creating files …

  44. Traditional to Simplified or vice-versa? According to Windows, you're on your own.... (2007/10/22)

    …ore sophisticated dictionary solution in Word that does a better job with the sometimes complicated mapping. Now although casing, width, and Kana mappings can all be done in place, this is not allowed for traditional->simplified Chinese mappings, eve…

  45. It looks like you are using Google anyway.... (2007/10/09)

    … but due to a side effect of the way Community Server takes care of tag nesting, you know, like if you click on Collation/Casing you get this link: http://blogs.msdn.com/michkap/archive/tags/Collation_2F00_Casing/default.aspx and then if inside that …

  46. What's wrong with what FxCop does for globalization, Part 1 (2007/10/01)

    … when sorting lists String.ToUpper – Use ToUpper rather than ToLower, and specify InvariantCulture in order to pick up OS casing rules Determining BiDi languages, for example to load correct icons – Continue to specify CurrentUICulture for these case…

  47. A&P of Sort Keys, part 6 (aka Relax, be calm, and deCOMPRESS if you are feeling out of sorts) (2007/09/16)

    … not true.... Also, as long as we are collecting COMPRESSION factoids, we have specific convention for how they work with casing, as I mentioned in A Microsoft convention for compressions in sorting -- they can be UU, LL, or UL, but not LU.... Oh, an…

  48. Head checks containing either comparison or case validation BITE (2007/09/05)

    …ct okay to BITE here? Well, no, it isn't. :-) So let's take some other scenarios, like accessing network shares where the casing table is different (remember that it is 5.0 in Vista as I mentioned e.g. here and that the share could be pointing to a U…

  50. Elegant? Beyond compare... (2007/08/31)

    …work with managed classes but others could welcome unmanaged way as well.) From some of your posts it is clear that lower-casing is better than upper-casing, since there are lower case characters without upper case equivalents. Also StringCompariso…

  51. Every character has a story #28: U+1e9e (CAPITAL SHARP S) (2007/08/24)

    …at in the collation tables, U+00df and U+1e9e would simply be case variants, with no real choice in the matter. And as to casing.... Now just because we make the relationship in casing does not mean we make it in collation. After all, as I have point…

  52. If this post really describes a bug, would I actually put it in the WYNN column? (2007/07/31)

    …re near the lowercase version. And our default table does not put them anywhere near each other, either. Of course in our casing table update in Vista, both characters are there and map to each other (prior to Vista they did not; this is one of the m…

  53. The difference between 'Dangeous Characters' and 'Dangerous Minds' is the lack of Michelle Pfeiffer (2007/06/12)

    …le problems with I would be, setup files starting with I, any registry entries that contain uppercased I, auto upper/lower casing in apps. Japanese 932 0x5c Characters - ソ十申暴構能 0x5f Characters (DBCS) - 雲契活神点農 0x7b Characters - ボ施倍府本宮 0x7d Characters - マ…

  54. Azeri zeriouz LIP releaze (2007/06/02)

    …ijani spelling rules in Azerbaijan, therefore "Bush" for example becomes "Buş" in newspapers. Like Turkish, the different casing rules for I/ı and İ/i are used that so frequently cause problems in code not smart enough to understand the language Clas…

  55. Every character has a story #26: CAPITAL SHARP S (might be encoded?) (2007/05/03)

    …roposed encoding solution will not affect existing data or implementations. Mind you, it will give rise to a long default casing chain : Capital Sharp S lowercases to Small Sharp S, which upper cases to "SS", which lowercases to "ss".…

  56. 'The 44' (*not* 'The 4400') (2007/04/26)

    …characters since it attempted to lowercase first. That RegexOptions.IgnoreCase is just a nightmare! Interestingly, the OS casing table combined with a non-invariant culture (which is not possible in the .NET Framework today) would have picked up many…

  57. No Regex in the Unicode room! (and no sex in the champagne room, either!) (2007/04/26)

    …etter and we are trying to match Ll. Ah, now it all came together. Well, if you are running on Vista and have the updated casing table then they will work. But otherwise, when you are not running on Vista, the casing table does not cover all of Unico…

  58. The nature of OrdinalIgnoreCase vs. intuitive expectations (2007/04/25)

    …e just upper/lower case, should the 2nd comparison return 0 as well? Thanks! Patrick Well, Patrick is right about the way casing in Turkic works (ref: The [Upper]Case of the Turkish İ (or: Casing, the 2nd)). But one of the core ideas in both Ordinal …

  59. In case of conflict, dump the indexes! (a.k.a. A day in the life of Jet) (2007/04/17)

    …sensitive, but due to a fortuitous and unintentionally well-timed bit of data/code theft, it fully supports proper Turkic casing (discussed previously). So, with a database created with Turkic collation, you can put U+0069 and U+0049 in the same colu…

  60. A checkpoint on search engine results contrast, ultimately leading to an irreverent bit of exposition (2007/03/17)

    … it is no doubt quite thrilling to know how deep the adoption of the AppLocale utility has been, I don't know whether showcasing AppLocale's effective use in this case is the most productive of possible exercises. Though I will quote Cygon's descript…

  61. But is it mnemonical? (2007/03/09)

    …obably have to share my thoughts on that at some point, I suppose.... Dean Harding on 10 Mar 2007 9:34 AM: Randomly upper-casing the First letter of Some words make You sound like some Ye Olde Manuscript. Juste got to Starte throwing in a Few extra E…

  62. The T's are crossed, but not all of the I's are dotted... (2007/03/06)

    …_CHAR flag, mapping the virtual key to a character The resulting string is then uppercased, though unfortunately the uppercasing is done without using the LCID attached to the KLID and the LCMAP_UPPERCASE | LCMAP_LINGUISTIC_CASING flags. This means t…

  63. He had the strength of an OX[IA], I tell you (2007/02/18)

    … U+03cd (GREEK SMALL LETTER UPSILON WITH TONOS), which provides a better route for support in both fonts and in collation/casing. Though I am not claiming that an OXIA is always a TONOS (I believe there are specific meanings in a linguistic framework…

  64. Getting all of the localized names of a font[.NET] (2007/01/22)

    …IgnoreCase comparisons to find the unique names (this is okay given the case insensitivity of font names following the OS casing rules, though is likely not completely necessary given the lack of font name localizations that only vary in case!) as I …

  65. If you are more sensitive, you'll pick up on more problems (2006/10/30)

    …gramming language cases, where (for example) it is not uncommon to see developers who use specific conventions like "lowercasing of internal variables/parameters vs. proper casing/camel casing for public properties/methods" where the "collision" one …

  66. When _wcsnicmp can't hack it, CompareStringW delivers (2006/10/23)

    …with _wcsnicoll one must either set the locale appropriately (it starts up with the "C" locale which only handles A-Z/a-z casing, which will never match lstrcmpiW), or else call the new _wcsnicoll_l, which allows you to pass the locale you wish to us…

  67. Stuffing the ballot box? (2006/10/08)

    … in random email!). There are a whole bunch of tags there, from Int'l Programming to Keyboards to Linguistic to Collation/Casing to Locales/Cultures to Unicode/Standards and maybe others that are there pretty much because of me (and clicking on them …

  68. NTFS and Unicode? (2006/09/24)

    …ne in Windows XP. # Michael S. Kaplan on 25 Sep 2006 11:05 AM: Nope, normalization still does not happen, and the updated casing table is limited to the BMP.... # WikiServerGuy on 25 Sep 2006 4:43 PM: I'm a bit surprised Wikipedia is used as a refere…

  69. Not just uppercasing or italicizing; bolding can cause problems too! (2006/09/14)

    Regular readers may recall the following five posts: Sometimes, uppercasing sucks A localizability problem is an application bug, or alternately: Ρύθμιση σήματος If you wanted to get it done with t…

  70. If you wanted to get it done with the font... (2006/08/30)

    It wasn't all that long ago that I was talking about how Sometimes, uppercasing sucks. And between that post and the follow-up one I was showing example strings like Ρύθμιση σήματος which a native woul…

  71. A localizability problem is an application bug, or alternately: Ρύθμιση σήματος (2006/08/22)

    …ext in it, to help highlight a bug that will be explained later in the post!) I had no idea when I posted Sometimes, uppercasing sucks that I'd find so many people who were unaware of how much of this sort of 'natural language processing' wasn't happ…

  72. Sometimes, uppercasing sucks (2006/08/18)

    Case differences in casing scripts (Latin, Cyrillic, Greek, Armenian, Ecclesiastical Georgian, Coptic, Glagolitic, etc.) ought to be easy. But it's n…

  73. Collation != case, still (2006/08/08)

    … de-DE locale specified: "\u00DF".ToUpper(CultureInfo.GetCultureInfo("de-DE")) does not return "SS", but "ß"? The Unicode casing file CaseFolding.txt has 00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S Is this a Window's limitation? (Which would not…

  74. Sometimes, ignoring case is stupid (2006/08/01)

    I talked about some of this once before in CompareString ignores case by lowercasing...., as you may recall. There are some people who think of ignoring case as required to keep things from sorting in the '…

  75. The name of the enum is KeysEx, dammit (2006/07/26)

    …here something else going on here? [Update 8:36am] At least (as Raymond proves), some developers still use the clues that casing provides -- although a bit of language knowledge can at times interfere! This post brought to you by ዽ (U+12fd, a.k.a. ETH…

  76. More on case insensitivity and its intuitivality (2006/06/05)

    …le more information here to paint the full picture.... On the surface this would suggest that Apple's OS only handles the casing of ASCII letters, but that would be misleading due to several related facts: They support Unicode names (cf: Apple TN2078…

  77. Intuitivosity (intuitivality?) of case insensitivity (2006/06/04)

    …t: a very large percentage of Windows C/C++ programs is probably written to include <windows.h>. MFC, at least, uses this casing. The Platform SDK, interestingly enough, does not: it's actually Windows.h. Chances of working on a case-sensitive system…

  78. Does it make sense to have 'Case' more accessible? (2006/05/28)

    …s the problem of how the text would be changed by the "Change Case" item -- upper? Or lower? Or reversed from the current casing? Obviously there would have to be separate operations -- multiple menu items for at least the upper and lower choices. Th…

  79. Custom Case Mappings? (2006/05/26)

    …ing. :-) Clearly the intent is to try to show some of the ways that cultures customize behavior surrounding collation and casing. And the target audience seems to primarily be people who are native English-speaking folk who have not had previous expo…

  80. Some strings need to feel validated (2006/04/17)

    …mail, view in Notepad, etc.), but before that happens I want to combine it with other strings (but no sorting, searching, casing, etc.). Thus, I don't want to immediately convert the byte[] to Base64 and then add another Base64 conversion on top of t…

  81. Would some keyboard code samples be helpful? (2006/03/22)

    …tion of some special keys of some international keyboards (Japan for example) programmatically. So there is always special casing involved (I hate those)…

  82. Just in case you thought I was blathering. :-) (2006/03/18)

    Making sure that you use consistent casing behavior is really important. A good example is a bug that is fixed in the CTP release of SQL Server 2005 SP1. As mention…

  83. Casing and IgnoreCase are still not the same thing.... (2006/03/15)

    … Case. And more importantly in this case, ignoring case in a linguistic comparison is not exactly the same as either uppercasing a string to make a binary comparison or lowercasing it. Perhaps the best way to look at it is to realize that ignoring ca…

  84. CompareString ignores case by lowercasing.... (2006/03/02)

    …areUnicodeString and CompareStringOrdinal) for the filesystem and the registry and environment variables and such by UPPERCASING prior to comparison. But in the past here when I have talked about linguistic comparisons via CompareString, and especial…

  85. Ā was unexpected at this time. (2006/01/30)

    …hird most "senior" employee working at Microsoft today. Even though I had actually conversed with him before (about those casing table and NTFS issues!), I did not want to waste his time or take advantage of the fact that he is by all reports and in …

  86. Collation != Case (a.k.a. Collation <> Case) (2006/01/11)

    …o show tertiary distinctions. There are many 'one way' case conversions in Unicode that would not be candidates for an OS casing table (which generally needs data that is round-tripable) but which are just fine in collation. There are also many compl…

  87. It may seem like a bug, but it is not.... (2005/12/14)

    …alIgnoreCase is not ignoring case in a language-specific way, so there is no way that language differences can affect the casing part of the operation. To make a further somewhat related point, currently in Microsoft products there is no casing opera…

  88. Getting on people's case (2005/10/30)

    …yone should just do whatever they want and then if they want case insensitivity come back talk to me about how to get the casing figured out.... But what I think is funniest about one of the VB stances on the whole matter is that they want case insen…

  89. How best to alter case (2005/10/18)

    …Comparing Unicode file names the right way, Mike Dunn asked the following question: When using LCMapString to do the uppercasing, what's the right LCID to use? LOCALE_INVARIANT? The CharUpperBuff docs say it "uses the language driver for the current …

  90. Comparing Unicode file names the right way (2005/10/17)

    …is not using Unicode normalization or a comparison method that tries to respect the equivalences it creates. For the uppercasing operation, you can use CharUpper, CharUpperBuff, or LCMapString with the LCMAP_UPPERCASE flag (and without the LCMAP_LING…

  91. Every character has a story #15: CAPITAL SHARP S (not encoded) (2005/09/25)

    …ubious new German character, without a demonstrated need, but with a horrendous downside potential for screwing up German casing implementations. My prediction is that the UTC is quite likely to turn this one down flat, without a single member in fav…

  92. The casing table case (.NET) (2005/09/22)

    …for ignoring case in comparisons matches the behavior of the OS. Is that really true? I was under the impression that the casing tables were defined on each partition, for NTFS. (and that you've blogged about this tiny difference before) I thought it…

  93. New in Vista Beta 1: Updated OS casing tables (2005/08/20)

    … Well, there is another set of tables that needed such an update -- tables that have not been updated in a long time. The casing tables. Over time this just gets worse, especially as most of the characters that have been added to Unicode have made th…

  94. Every character has a story #13: U+0241 and U+0294 (upper and lower case glottal stops) (2005/08/17)

    …lottal stop to glorious *un*cased status. Which is the way it is used in IPA and in other Canadian orthographies: as an uncasing character ignored e.g. in title casing. >...convince Europeans that "nothing" could be a *real* letter -- merely by makin…

  95. New in Vista Beta 1: more use of the word 'linguistic' (2005/08/02)

    …ch of people who read about the odd use of the word linguistic when I answered that other question (What does "linguistic casing" mean?). At the time I proposed that we could have called the LCMAP_LINGUISTIC_CASING flag LCMAP_UNICODE_SIMPLE_CASING an…

  96. Ignore this blog (and this post) (2005/07/14)

    …ay, but that kind of motivator is not generally one that drives attendance to the PDC, which usually seems more about showcasing technologies that are coming so people can understand the landscape, and the important upcoming stuff. On top of that, ev…

  97. LCMapString's *other* job (2005/06/24)

    …cular convention, though it is fascinating to contemplate doing so some day, in some kind of extension to the "linguistic casing" notion I'll talk about in a bit. Interesting feature idea, if it truly is the convention. :-) LCMAP_UPPERCASE -- Maps lo…

  98. Not so small as to be internationally stupid (2005/06/23)

    …Turkish 'i' today!" If you read here at all, then I probably don't have to explain why this irked me. :-) The issues with casing and Turkic I have become kind of a symbol to some people -- a symbol of writing code that is not stupid, from an internat…

  99. The dasBlog 'Turkish I' thing figured out (2005/06/05)

    …n Globalization. Now I have talked about the Turkish I issue in general before (cf: The [Upper]Case of the Turkish İ (or: Casing, the 2nd)), but I will post some more on this soon. But Scott's object lesson here is even cooler -- you can see how easy…

  100. The New String recommendations (2005/06/02)

    …ant (some of them were I daresay inspired by things I have been saying here about invariant versus ordinal and using uppercasing!). And I think at the core of those recommendations is a principle that applies to all code that is written, managed and …
