Notepad folds digits like origami birds

by Michael S. Kaplan, published on 2005/01/27 02:31 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/01/27/361413.aspx


Chris Walker mentioned to me yesterday something I did not know about Notepad -- that it uses the FoldString API with the MAP_FOLDDIGITS flag. This takes all of the digits in Unicode and folds them down into regular old zero to nine whenever you type something into the "Goto line" dialog (you can get to it by typing <CONTROL+G> or choosing "Edit|Go To..." from the menu).
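To make the idea of "folding" concrete, here is a minimal Python sketch that approximates it. It is not FoldString and makes no claim to match the API's mapping table: it assumes that "digit" means any character for which Python's unicodedata module reports a decimal digit value, and the helper name fold_digits is made up for this illustration.

```python
import unicodedata


def fold_digits(text: str) -> str:
    """Replace each Unicode decimal digit with its ASCII equivalent.

    Characters without a decimal digit value are passed through unchanged.
    """
    folded = []
    for ch in text:
        # decimal() returns the supplied default (None) for non-decimal characters
        value = unicodedata.decimal(ch, None)
        folded.append(str(value) if value is not None else ch)
    return "".join(folded)


# Arabic-Indic one, Extended Arabic-Indic two, Thai three
print(fold_digits("\u0661\u06f2\u0e53"))  # prints 123
```

A "go to line" handler could run its input through something like this before parsing the number, which is roughly the effect described above.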

Note that there is really no connection to origami, the art of paper folding, in this post. But I started playing with this and decided it was really kind of cool.  And then I was thinking about how Notepad looked like a big sheet of paper and how the piece of paper was folding digits, and how funny it was that paper would fold rather than being folded. Then I put http://origami.com/ into my browser and the title took on a life of its own....

Now this whole digit folding in Notepad is very much a "stealth" feature, since it is not in the online help or any documentation that I was able to find. At first this annoyed me but then I thought about how hard it would be to document this feature to people who do not realize that some languages have their own digits. Or that it would probably not be necessary to document it for those who do (since people who do use them would probably try it anyway and just be pleasantly surprised that it works!). It is not a feature that is going to specifically convince people to buy Windows and I cannot see anyone truly believing that Windows is more internationalized because of this feature. So perhaps the fact that it is not documented is not too bad (though I am happy to point it out now).

Obviously it is most useful to someone who uses some other language that has its own set of digits like Arabic or Thai, but you can play with it too by grabbing some of the digits below into the clipboard and watching the "Goto Line" functionality work its magic.

It is probably most fun for when you do not have the fonts available -- since you will just see square boxes for those entries. Yet the API does not discriminate based on your machine's available fonts, so neither does Notepad. :-)

If you have no patience for this sort of thing, then I apologize. You can stop by tomorrow and I'll try to catch your interest....

UNICODE SCRIPTS USED BELOW FOR EACH NUMBER:

See how easy it is to get a program to recognize different digits? :-)

٠۰०০੦૦୦౦೦൦๐໐⁰₀⓪０0  (0660 06f0 0966 09e6 0a66 0ae6 0b66 0c66 0ce6 0d66 0e50 0ed0 2070 2080 24ea ff10 0030)
١۱१১੧૧୧౧೧൧๑໑¹₁①１௧⑴⒈❶➀➊1  (0661 06f1 0967 09e7 0a67 0ae7 0b67 0c67 0ce7 0d67 0e51 0ed1 00b9 2081 2460 ff11 0be7 2474 2488 2776 2780 278a 0031)
٢۲२২੨૨୨౨೨൨๒໒²₂②２௨⑵⒉❷➁➋2  (0662 06f2 0968 09e8 0a68 0ae8 0b68 0c68 0ce8 0d68 0e52 0ed2 00b2 2082 2461 ff12 0be8 2475 2489 2777 2781 278b 0032)
٣۳३৩੩૩୩౩೩൩๓໓³₃③３௩⑶⒊❸➂➌3  (0663 06f3 0969 09e9 0a69 0ae9 0b69 0c69 0ce9 0d69 0e53 0ed3 00b3 2083 2462 ff13 0be9 2476 248a 2778 2782 278c 0033)
٤۴४৪੪૪୪౪೪൪๔໔⁴₄④４௪⑷⒋❹➃➍4  (0664 06f4 096a 09ea 0a6a 0aea 0b6a 0c6a 0cea 0d6a 0e54 0ed4 2074 2084 2463 ff14 0bea 2477 248b 2779 2783 278d 0034)
٥۵५৫੫૫୫౫೫൫๕໕⁵₅⑤５௫⑸⒌❺➄➎5  (0665 06f5 096b 09eb 0a6b 0aeb 0b6b 0c6b 0ceb 0d6b 0e55 0ed5 2075 2085 2464 ff15 0beb 2478 248c 277a 2784 278e 0035)
٦۶६৬੬૬୬౬೬൬๖໖⁶₆⑥６௬⑹⒍❻➅➏6  (0666 06f6 096c 09ec 0a6c 0aec 0b6c 0c6c 0cec 0d6c 0e56 0ed6 2076 2086 2465 ff16 0bec 2479 248d 277b 2785 278f 0036)
٧۷७৭੭૭୭౭೭൭๗໗⁷₇⑦７௭⑺⒎❼➆➐7  (0667 06f7 096d 09ed 0a6d 0aed 0b6d 0c6d 0ced 0d6d 0e57 0ed7 2077 2087 2466 ff17 0bed 247a 248e 277c 2786 2790 0037)
٨۸८৮੮૮୮౮೮൮๘໘⁸₈⑧８௮⑻⒏❽➇➑8  (0668 06f8 096e 09ee 0a6e 0aee 0b6e 0c6e 0cee 0d6e 0e58 0ed8 2078 2088 2467 ff18 0bee 247b 248f 277d 2787 2791 0038)
٩۹९৯੯૯୯౯೯൯๙໙⁹₉⑨９௯⑼⒐❾➈➒9  (0669 06f9 096f 09ef 0a6f 0aef 0b6f 0c6f 0cef 0d6f 0e59 0ed9 2079 2089 2468 ff19 0bef 247c 2490 277e 2788 2792 0039)


 

This post is sponsored by all of the numbers listed above (although those in the second row did point out that they are #1 quite a bit before agreeing!)


# Per Jönsson on 27 Jan 2005 1:50 AM:

Oddly enough, for some of the digits (in particular the ones to the right in every row) Notepad just says "Line number out of range", and when I click OK, the line number field changes to the value 175 (which is the last line of the file).

Am I missing something?

# Michael Kaplan on 27 Jan 2005 2:09 AM:

You mean the numbers in parentheses that have the code points in them? Well, those numbers would require the file to be big enough....

Or do you mean the digit values like the 9s in the last row?

# Jochen Kalmbach on 27 Jan 2005 3:17 AM:

Great post! Thanx!
Is there a counterpart in the .NET Framework or do I have to use this function via P/Invoke?

# Michael Kaplan on 27 Jan 2005 3:21 AM:

There is not currently a managed way to get at this, sorry. :-(

# Simon Montagu on 27 Jan 2005 3:54 AM:

It seems as if this works for characters with the "decimal digit" property in Unicode but not for those with only the "digit" property (e.g. all the various circled and parenthesized forms).

It also doesn't seem to work with characters not in Unicode 3.0 (at least in the version of XP without SP2 I have here)
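The distinction between the two properties can be inspected directly. The sketch below uses Python's unicodedata module, which reflects whatever Unicode version ships with the interpreter rather than the data behind FoldString, so it illustrates the property difference only and not what Notepad will accept. The helper name digit_properties is made up for this example.

```python
import unicodedata


def digit_properties(ch: str):
    """Return (general category, decimal digit value, digit value) for ch.

    Missing values are reported as None.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
    )


# Arabic-Indic four, superscript four, circled four, dingbat negative circled four
for ch in "\u0664\u2074\u2463\u2779":
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), digit_properties(ch))
```

Only the first sample carries a decimal digit value; the others have a digit value alone.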

# Barry Kelly on 27 Jan 2005 5:27 AM:

When Notepad is able to deal with Unix line endings I'll consider it usable. That's a far bigger turnoff; this feature doesn't compensate.

# Michael Kaplan on 27 Jan 2005 6:57 AM:

Simon --

Well, at this point I will admit that the list is not 100% the product of Unicode properties, except by accident....

And the list is shorter than it ought to be -- but that is something to address in the next version (adding all of the data up to and including Unicode 4.1).

# Michael Kaplan on 27 Jan 2005 7:02 AM:

Barry, I hear you. I even plan on covering this issue in a future post (cf: http://blogs.msdn.com/michkap/articles/343583.aspx#360863 ).

But I was not trying to sell Notepad, since (a) I am not in marketing, (b) you cannot buy Notepad so there is no point in trying to sell it anyway, and (c) as I said in the post, I would not expect this feature to be a way to sell copies of Windows.

I'm just a guy who points out things that interest me, on the theory that it may interest some of you. :-)

# Michael Kaplan on 27 Jan 2005 7:04 AM:

Jochen --

A little more info on the managed side -- in Whidbey there is a new CharUnicodeInfo class which CAN get all of this info -- you can check for the Nd (Number, Decimal Digit) category, and you can get the numeric value.

So while you cannot map to the ASCII digits, you can get the numeric value behind digits that are numbers.

# Alex Fung on 27 Jan 2005 5:31 PM:

I wonder why it cannot accept the Chinese
壹貳參肆伍陸柒捌玖零 (1..9 and 0)
or
一二三四五六七八九零 (again 1..9 and 0)

For those who are interested, Chinese happens to have two sets of digits. We use the 2nd set most of the time. Most people do not remember how to write the strokes of the first set at all, though they can read them without trouble.

I remember on one occasion I was in a bank preparing a cheque. I asked the teller "how does one write the digit 'one'?" and she stared at me in wonder...

# Michael Kaplan on 27 Jan 2005 8:14 PM:

The general category of all 20 ideographs is Lo (Letter, Other) -- so they are not treated as digits.
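A quick way to see this for yourself is to ask a Unicode database for the category. The sketch below uses Python's unicodedata module (an assumption of convenience, not what Windows uses), and the helper name describe is made up for this example. The ideographs do carry a numeric value in that data, but no decimal digit value, which is why digit folding has nothing to work with.

```python
import unicodedata


def describe(ch: str):
    """Return (general category, decimal digit value, numeric value) for ch.

    Missing values are reported as None.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.numeric(ch, None),
    )


# The ideographs for one, two and three from the second set above
for ch in "\u4e00\u4e8c\u4e09":
    print(ch, describe(ch))
```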

# Per Jönsson on 28 Jan 2005 5:17 PM:

Michael,

No, I don't mean the hex numbers. Take the fourth row, for example. The only digits that work are those to the left of the superscript four, and the two ordinary Latin fours to the right of it.

# Michael Kaplan on 30 Jan 2005 1:42 AM:

Ah, I see it now.... weird. They do fold down, it must be some sort of additional check Notepad does. Cool. :-)

referenced by

2007/02/19 Ever wonder why no one bothers to fold digits?

2005/01/28 A little bit about the new CharUnicodeInfo class
