by Michael S. Kaplan, published on 2010/07/15 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2010/07/15/10038526.aspx
This one could have been called Every character has a story #32 since we are talking about either a new character in Unicode, or less disruptively a glyph change!
My first thoughts, being a geek who blogs, were to previous blogs of mine like The elusive G-sign said to exist in South America may not be in Windows, says a customer who has hunted for it about the Paraguay Guaraní currency symbol (₲, aka U+20b2, aka GUARANI SIGN)....and It is with a tenge of sorrow that I say this about the Kazakhstani currency symbol (₸, aka U+20b8, aka TENGE SIGN).
But India now has an official symbol for the Rupee.
Now the contest has been covered quite a bit, and Michael Everson did point people at the "false alarm" about how the symbol was going to be decided back on June 24th, pointing to Indian Rupee Symbol to be Decided Today that had the five finalists shown, and Rupee’s New Avatar, which showed 32 of the losing entries along with the five finalists.
His editorial comment about the five finalists probably left little doubt as to his opinion:
I don't know if this is a joke or not, but none of those five is any good.
Evidently there is a desire to merge Latin R and Devanagari RA and then add stripes.
Well okay, then.
Anyway, they didn't decide that day. This didn't stop the thread on the Unicode List going on for another 35 messages.
And now, they have decided.
Ironically, most of the news articles describe the winner without providing a picture.
It looks something like this:
As the news articles like this one state "The new symbol for the Indian currency with the Hindi alphabet R with two lines is an true representative of the Indian ethos and culture" while Western news sites like this one state it a little differently with "The symbol is a fusion of both the ancient Devnagiri script, as well as the Latin script for the letter 'R', with one horizontal line running along the top and another through the middle.".
I guess everyone has their own viewpoint! :-)
From a technical perspective, the article I liked best was How will you type the new Rupee symbol, which takes a very practical view of the whole issue of getting this symbol into Unicode, etc.
Though they did describe Unicode and ISO/IEC 10646 and properly described the Unicode Technical Committee and such, they misidentified the date of the next UTC meeting, saying it was in October when the next UTC (UTC # 124 / L2 # 221) meeting will be on August 9-13, in Redmond, WA, at Microsoft.
But no one gets the news 100% correct....
Anyway, let's hear it for र (aka U+0930, aka DEVANAGARI LETTER RA) and R (aka U+0052, aka LATIN CAPITAL LETTER R), who are clearly the sponsors of this new symbol's upcoming inclusion in Unicode, even being a part of the final symbol themselves.
And unlike the previous two symbols I mentioned, it seems unlikely that this one won't make it everywhere in time for the next version of Windows (and maybe for prior versions in the fonts and/or Regional Options, though the latter is much harder than the former and both take a bunch of effort).
To be honest, almost all the news stories are overlooking the fact that there already IS a symbol in Unicode: ₨ (aka U+20a8, aka RUPEE SIGN), and if Unicode were to treat this as a glyph change to the existing U+20a8 then everyone could be updated with the new font and get the new support on a much wider variety of platforms. It would still mean a Regional Options change in a bunch of locales since they mostly don't use this symbol (not even English - India, which uses three separate characters for Rs.), but this way the simple font update could be done much sooner and everyone could get it right away!
Now some may object to this, but this seems like the best idea to me. The symbol is in most of the Indic fonts already, plus other core fonts. This means very wide coverage. :-)
And whatever anyone may think about the symbol, it is Indian Institute of Technology postgraduate D Udaya Kumar who will be getting the prize money (Rs 2,50,000)!
{Removed an unclear bit in the prior sentence that Mr. Everson believes was putting words in his mouth. It wasn't, but removing it is easier.}
If I were the winner, I would want to delay getting paid until new bills with the new symbol were available, and then get paid in cash so I could point to one of the bills and say -- just like those Windows 7 commercials -- "I love the Rupee and it was my idea!"
Michael S. Kaplan on 15 Jul 2010 3:24 AM:
The other interesting part is the script-specific Rupee signs like ৳ U+09f3 for Bengali and ௹ U+0bf9 for Tamil and so on. One has to wonder how many Rupee signs will we get! :-)
j on 15 Jul 2010 3:43 AM:
U+20A8, RUPEE SIGN can't get a (major) glyph change, since that would change the identity of the character. Indeed, it even has a compatibility decomposition to "Rs".
However, there are some Windows fonts that should change the glyph for U+20A8 from erroneously looking like "Rp" to looking like "Rs" (no full stop after those letters).
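The compatibility decomposition mentioned in this comment can be checked with a minimal sketch using Python's standard `unicodedata` module. It only reads the character database bundled with the interpreter; the variable names are illustrative and not part of the original discussion.

```python
import unicodedata

RUPEE_SIGN = "\u20a8"  # the existing RUPEE SIGN under discussion

# Official character name from the Unicode Character Database.
name = unicodedata.name(RUPEE_SIGN)

# Raw decomposition field: a formatting tag followed by hex code points.
decomposition = unicodedata.decomposition(RUPEE_SIGN)

# Compatibility normalization (NFKC) replaces the sign with its decomposition.
normalized = unicodedata.normalize("NFKC", RUPEE_SIGN)

print(name)
print(decomposition)
print(normalized)
```

Because NFKC folds the character into the two Latin letters, any text pipeline that normalizes this way would lose a redesigned glyph placed at the same code point.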
Joe on 15 Jul 2010 3:46 AM:
No need for a Unicode proposal, since it is already encodable.
As various press are reporting, it is a mix of the Devanagari 'Ra' and the Latin 'R'.
Sounds like a ligature to me! --> ZWNJ.
Just need an appropriate font.
Andrew West on 15 Jul 2010 3:53 AM:
As was pointed out on the Unicode list, the existing Rupee sign ₨ (U+20A8) is widely used outside of India (Sri Lanka, Nepal, Pakistan, Mauritius, Seychelles, Indonesia, Maldives all use the rupee according to Wikipedia), whereas the new symbol is (at present at least) intended for use with the Indian currency only, so it would not be appropriate to change the glyph of U+20A8. Furthermore, in my opinion, a complete change in glyph design (for any character) should result in a new character as otherwise it would invalidate a lot of existing data (archived material and electronic texts would suddenly have an anachronistic symbol in place of the "Rs" symbol that was intended). In any case, U+20A8 has a compatibility decomposition to "Rs", and you would not really want the new symbol to be burdened with that, would you?
John Cowan on 15 Jul 2010 4:06 AM:
Arrgh. I smell Hindi cultural imperialism here.
Michael S. Kaplan on 15 Jul 2010 4:08 AM:
We call that "Hindification" here, John. :-)
Pavanaja U B on 15 Jul 2010 5:03 AM:
So for the time being, can I put the new glyph at U+20A8, create a new font with this new glyph (or add the glyph to any font at this codepoint), and then use MSKLC to add the keystroke? Of course other things like locale support, DB, .NET, are still pending.
Random832 on 15 Jul 2010 5:08 AM:
The precedent of U+20a0 (EURO-CURRENCY SIGN) vs U+20ac (need I say?) would seem to lean in the direction of encoding it separately.
Michael S. Kaplan on 15 Jul 2010 5:11 AM:
See the comments from j and Andrew West -- not recommended, no.
I never have the patience to read those long-winded threads on the Unicode List; I let people like Andrew digest and summarize useful comments!
Michael S. Kaplan on 15 Jul 2010 5:20 AM:
Random832 -- almost anything you do want to do or not do in Unicode, there is some precedent supporting it. That is why you always need more than that to prove the point. :-)
verdy.p on 15 Jul 2010 5:51 AM:
Currency signs are not letters, they are symbols. They are definitely not good candidates for encoding as a ligature, and certainly not in the Devanagari script, which already has lots of complexity in managing just the ligatures between Devanagari letters (including those created by use of the VIRAMA diacritic for controlling isolated vowels or half conjuncts)!
In all cases, this symbol MUST NOT form additional mandatory conjuncts with surrounding letters (notably Devanagari letters): this would occur easily if it were inserted in texts as a strange <Latin letter R, ZWJ, Devanagari Letter Ra> (followed by VIRAMA? to make it use its half form already seen in conjuncts?)
On the other hand, the "Rs" symbol is really just an abbreviation and is not effectively tied to the INDIAN Rupee (this abbreviation being used for various scripts or for other non-Indian languages or non-Indian currencies).
So the current encoding of the "Rs" symbol effectively has a "compatibility" decomposition into its component Latin letters (but this does not make it a true ligature in the typographic sense; it just indicates that it will collate within the same group as the two Latin letters "Rs", and does not really indicate its true identity).
Remember that there was a legacy ligature for the European Currency Unit (ECU), and this was abandoned in favor of a distinct character for the Euro currency symbol. There's absolutely no reason not to apply the same policy for the new Rupee currency sign (even if there's a symbol already "named" « Rupee sign »). Note that the Unicode/ISO 10646 name does not really give the character its identity; it's just a reference for naming it (in Latin-written discussions, or in technical standards or applications such as regular expressions) in an easier way than by its assigned code point: in other words, this is definitely NOT a name, but a unique program identifier to be treated as equivalent to a reference to its assigned code point (and also to help track the encoding process in technical committees and ballots, before the character reaches agreement and is assigned its final code point).
For almost all users and applications, this assigned name (which is assigned quite early during the discussions for adoption, and will be removed only if the proposal is finally rejected) means nothing and should not have to mean anything.
So let's hope that this character will be mapped along with other currency symbols within the U+20A0..U+20CF block of currency signs (probably at U+20B9 because this is the first free position within the remaining 23 positions in this block), just like U+20B8 which was also the last added for the Tenge sign (used now in Kazakhstan), or like other unique symbols that were also added after the EURO SIGN (U+20AC):
- U+20AD KIP SIGN (used in Laos) : a Latin uppercase letter K barred with a single horizontal stroke
- U+20AE TUGRIK SIGN (used in Mongolia) : a Latin uppercase letter T barred with a single small oblique stroke
- U+20B2 GUARANI SIGN (used in Paraguay) : a Latin uppercase letter G barred with a single vertical stroke
- U+20B3 AUSTRAL SIGN (formerly used in Argentina) : a Latin uppercase letter A barred with a double horizontal stroke
- U+20B4 HRYVNIA SIGN (used in Ukraine) : a Cyrillic uppercase letter H using the letterform of the small italic letter, but rectified vertically and made bigger to look like a mirrored Latin uppercase letter S, barred with two horizontal strokes.
- U+20B5 CEDI SIGN (used in Ghana) : a Latin uppercase letter C, barred with a single vertical stroke
Other historical currency signs (no longer in use) have been added in the same block, so there should be absolutely no problem adding this symbol, which will appear first on Indian notes and coins.
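As a rough cross-check of the code points in the list above, the sketch below looks up each one in the Unicode Character Database via Python's standard `unicodedata` module. The helper name `describe` and the `"<unassigned>"` fallback are illustrative choices, not anything from the proposal or the standard.

```python
import unicodedata

# Code points taken from the list above.
CODE_POINTS = [0x20AD, 0x20AE, 0x20B2, 0x20B3, 0x20B4, 0x20B5]

def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point, or a fallback if it has no name."""
    name = unicodedata.name(chr(code_point), "<unassigned>")
    return f"U+{code_point:04X} {name}"

lines = [describe(cp) for cp in CODE_POINTS]
for line in lines:
    print(line)
```

The same helper can be pointed at any other position in the U+20A0..U+20CF block; what it reports depends on the Unicode version that ships with the Python interpreter in use.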
The main issue will not be the encoding, but the time needed to have this symbol added in updated fonts. I suppose that Microsoft will first update its main Devanagari font (Mangal for Windows).
The symbol by itself does not cause any design problem; it fits very well with the design of other similar symbols (notably the Euro, with which it shares a lot of typographic details in the strokes of its official shape), its design should fit really well with Devanagari letters, and its dimensions and alignment on the baseline should fit very well with Devanagari or Arabo-European digits.
And thanks to the Indian committee, they chose not to adopt any one of the strange 5 candidates, which were too strongly linked to the Latin R, or used incoherent strokes. The winner is simply perfect in its meaning and we clearly see the top stroke as the distinctive sign that this shares the Indian Brahmic cultures and notably the Devanagari script. In addition it is much simpler to write by hand and remember.
And before computers get the appropriate font, there's still a way to reproduce it mostly correctly (in HTML at least) using the Devanagari letter RA (U+0930), with a CSS text-decoration for striking it
(just like it occurred with the Euro with the first displays as letter C and an overstriking equal sign, and as it still appears for the Tenge sign, the Cedi sign...)
There will be about a billion people seeing and using this symbol very soon in everyday life in India (and many people elsewhere in the world will see it). There's absolutely no reason to reject it as the demand for it will be extremely strong.
Even if there's no system font for it, browsers can provide their own builtin font (or embedded SVG in browsers that support it in their renderer) for drawing it correctly (browsers now include such default fonts for commonly used signs, symbols or letters).
Michael Everson on 15 Jul 2010 6:00 AM:
I have drafted a proposal to encode INDIAN RUPEE SIGN at U+02B9.
The proposal also recommends changing the glyph for the DRACHMA SIGN to something that has actually been used.
Despite your suggestion above that I give a damn about the 2.5 lakh rupees, Michael, I congratulate D. Udaya Kumar for his winning design. It is much better than any of the five glyphs which were said to be shortlisted the other week.
Michael S. Kaplan on 15 Jul 2010 6:16 AM:
There was neither suggestion nor imputation that you "gave a damn" or not. I was just commenting that even if you did not care for designs people submitted (i.e. your "I don't know if this is a joke or not, but none of those five is any good.") that someone was going to be able to walk away with the Rupees for his Rupee symbol.
The rest is just insecurity about being criticized for being so critical, I guess?
Michael S. Kaplan on 15 Jul 2010 6:20 AM:
{Another comment from Philippe Verdy was removed for excessive length -- I and the people I quote are the only ones who get to talk excessively on this blog. Sorry.... }
Michael Everson on 15 Jul 2010 6:27 AM:
I don't really suffer from insecurity, Michael.
Sometimes when you say something it can cause offence even if you didn't intend it. In this case, it has.
Oh, to correct a typo:
I have drafted a proposal to encode INDIAN RUPEE SIGN at U+20B9.