Every character has a story #5 (U+262b FARSI SYMBOL)

by Michael S. Kaplan, published on 2005/01/29 18:03 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/01/29/363208.aspx


This character has an interesting history. As noted by Roozbeh Pournader:

Neither Farsi, nor a symbol. In real life, it is the official emblem of the government of the Islamic Republic of Iran.

Technically that would make it a logo and thus not a suitable candidate for encoding. But Roozbeh also noted:

Exactly. The funny fact is that it has been in Unicode since 1.0...

Truer words have ne'er been spoken. Luckily Ken Whistler stepped in to help explain the inconsistency:

And in Unicode 1.0 it was called "SYMBOL OF IRAN", which was closer to your description of its use. It was WG2 that insisted on renaming it "FARSI SYMBOL" to get "IRAN" out of the name...

P.S. I can feel another "Every Character Has a Story" story coming on...

Of course this does seem to violate the stability rules, which claim that once a character is encoded, its name will not be changed. Luckily Ken once again stepped up to explain:

Ancient history. Hundreds -- maybe thousands -- of Unicode 1.0 character names were changed in 1993 for Unicode 1.1 as part of the merger between the repertoires of Unicode and ISO/IEC 10646-1:1993. (The Great Compromise) The gory details of all the changes can be found in UTR #4, The Unicode Standard, Version 1.1. It was *after* that point (which was *very* painful for some people) that we put in place the never change a character name rule.

The whole reason for having a Unicode 1.0 Name field in the UnicodeData.txt file was to track that name change.
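For anyone who wants to look at that field directly, here is a minimal sketch in Python. It assumes the semicolon-separated layout of UnicodeData.txt, where field 1 holds the current name and field 10 holds the Unicode 1.0 name. The sample line is written out by hand in that layout for illustration, and the helper names `parse_names` and `has_old_name` are made up for this example.

```python
# Illustrative line in UnicodeData.txt's semicolon-separated layout
# (typed by hand for this example, not read from the real file).
SAMPLE = "262B;FARSI SYMBOL;So;0;ON;;;;;N;SYMBOL OF IRAN;;;;"


def parse_names(line):
    """Return (code point, current name, Unicode 1.0 name) for one line."""
    fields = line.strip().split(";")
    # Field 0 is the code point in hex, field 1 the current name,
    # field 10 the Unicode 1.0 name (empty when no old name is recorded).
    return int(fields[0], 16), fields[1], fields[10]


def has_old_name(line):
    """True when the line records a Unicode 1.0 name that differs from the current one."""
    _, name, old_name = parse_names(line)
    return bool(old_name) and old_name != name


code_point, name, old_name = parse_names(SAMPLE)
print(f"U+{code_point:04X}: {name} (Unicode 1.0: {old_name})")
# prints "U+262B: FARSI SYMBOL (Unicode 1.0: SYMBOL OF IRAN)"
```

Running `has_old_name` over every line of a local copy of the file would give a rough count of renamed characters, though the exact number depends on the version of the file and on how entries such as control characters are treated.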

Now of course UTR #4 has been superseded and is not available online, though one would probably not learn much of interest since most of the fun/interesting parts about "The Great Compromise" are in the history and stories from those who were there, and that is not really captured. Think of it as being like the book of Acts in the New Testament -- many of the stories that would (in my very humble opinion) be really interesting about that particular period of time were not recorded, because the processes of change and compromise always tend to record information that speaks much more kindly about the experience than those who were there would themselves recall if you sat them down and bought them a beer....

Anyway, back to U+262b. Roozbeh gave some more information in a different thread:

...U+262B, the so-called FARSI SYMBOL, which is nothing but the official symbol of the (government of) Islamic Republic of Iran, with no known usage but this. It was specifically designed in 1979 or 1980 for this purpose, and also appears in the flag of the Islamic Republic of Iran adopted at the same time.

One insteresting [sic] point is that it is not Farsi (Persian) in any way! It is a logo form of the Arabic word "Allah", also encoded at U+FDF2 ARABIC LIGATURE ALLAH ISOLATED FORM. Another interesting point is no one remembers exactly how it has got into Unicode! It has been there since the Unicode 1.0 days, so the source is definitely not an Iranian representative in SC2.

Another interesting point is that when the very final session for approving a very recent Iranian national standard, defining a minimum subset of Unicode for Persian information interchange, was being held, the committee experts voted for removing this character from the optional characters list (characters which need not be supported but their use should be according to the text if they are), telling that it's really not a character, but a logo: "It's not used in text, but just in letterheads".

Is anyone collecting notes to write that "Every Character Has a Story" book some time? It's a good case for such a research! ;)

When the idea of that "Every Character Has a Story" book was being floated around, I remember suggesting a subtitle of "The Dark Underbelly of Unicode". Amazing how easy it is to get there when you look into the history of some characters....

And to date, no one (as far as I know) has come forward purporting to know how the "SYMBOL OF IRAN" was added to Unicode 1.0 (who proposed it, or why). Its source remains a mystery to this day....

 

This post brought to you by who else but "☫" (U+262b, a.k.a. FARSI SYMBOL, a.k.a. SYMBOL OF IRAN in Unicode 1.0)


# Michael Kaplan on 29 Jan 2005 4:11 PM:

For the record, there are 1,968 characters with different names in the Unicode 1.0 name field of UnicodeData.txt.

# Larry Osterman on 29 Jan 2005 6:05 PM:

Man, I LOVE these stories :)

Fascinating.

# Ken Smith on 29 Jan 2005 9:41 PM:

Who would you buy a beer? The Unicode guys or the Apostles?

# Michael Kaplan on 29 Jan 2005 10:56 PM:

I meant the Unicode people (had I meant the Apostles, I probably would have said something about wine).

And of course I am not quite a couple millennia too late for that particular kegger....

:-)
