Every character has a story #2: U+234a U+234e U+2351 U+2355 U+2361

by Michael S. Kaplan, published on 2005/01/11 02:07 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2005/01/11/350460.aspx


The five characters in question are APL Functional Symbols (if you have either Arial Unicode MS or Code2000 on your machine, you will see the characters; if not then you can look for them in the Miscellaneous Technical block on the Unicode site):

   U+234a   APL FUNCTIONAL SYMBOL DOWN TACK UNDERBAR (has annotation of " =up tack underbar")

   U+234e   APL FUNCTIONAL SYMBOL DOWN TACK JOT (has annotation of " =up tack jot")

   U+2351   APL FUNCTIONAL SYMBOL UP TACK OVERBAR (has annotation of " =down tack overbar")

   U+2355   APL FUNCTIONAL SYMBOL UP TACK JOT (has annotation of " =down tack jot")

   U+2361   APL FUNCTIONAL SYMBOL UP TACK DIAERESIS (has annotation of " =down tack diaeresis")

All five characters are misnamed by most intuitive judgments (which caused the "clarifying" annotations to be added).

Their naming also does not match that of four other characters that are in the Unicode standard and that get used by APL (if you have either Arial Unicode MS or Code2000 on your machine, you will see the characters; if not then you can look for them in the Mathematical Operators block on the Unicode site):

   U+22a2   RIGHT TACK

   U+22a3   LEFT TACK

   U+22a4   DOWN TACK

   U+22a5   UP TACK
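
If you would rather check the names than take my word for it, here is a minimal sketch in Python using the standard library's unicodedata module. The helper name describe and the two lists are just made up for this illustration, and the names reported come from whatever version of the Unicode data ships with your Python, so treat it as a quick lookup rather than anything authoritative. It only reports names; it cannot tell you which way the tack in the glyph actually points, which is the whole problem.

```python
import unicodedata

# The four plain tacks from Mathematical Operators and the five
# APL functional symbols from Miscellaneous Technical discussed here.
PLAIN_TACKS = [0x22A2, 0x22A3, 0x22A4, 0x22A5]
APL_SYMBOLS = [0x234A, 0x234E, 0x2351, 0x2355, 0x2361]


def describe(code_point: int) -> str:
    """Return 'U+XXXX  NAME' (hypothetical helper for this illustration).

    The name is looked up in the Unicode data bundled with Python.
    """
    return f"U+{code_point:04X}  {unicodedata.name(chr(code_point))}"


if __name__ == "__main__":
    # Print each character's code point and formal name, one per line.
    for cp in PLAIN_TACKS + APL_SYMBOLS:
        print(describe(cp))
```

Comparing the output with the glyphs in a font that has them shows that the APL names and the plain tack names use opposite senses of "up" and "down".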

More information on the competing naming conventions (sometimes known as the Bosworth and London conventions) can be seen here. However, many within the Unicode Technical Committee dispute that web site's claim about the causes of the inconsistency within Unicode. Here are some more knowledgeable sources who were actually at the meetings:

From Mike Ksar (convenor of WG2, the body that runs ISO-10646):

Here is a cut and paste copy of resolution M27.7 in document N1204.txt with respect to this issue:

“RESOLUTION    M27.7 (defect report disposition):      Unanimous
WG 2 instructs its editor to prepare editorial corrigenda on the following items:
 
    a. With reference to concerns expressed in N1044: FFE3 FULL WIDTH MACRON*
       with a corresponding entry in the Annex on Additional Information on Characters.
 
    b. Follow the instructions to the editor contained in N1193.
 
    c. With reference to concerns expressed in N1162, add the parenthetical annotations as below:
         01A2 LATIN CAPITAL LETTER OI (gha)
         01A3 LATIN SMALL LETTER OI (gha)
 
    d. With reference to N1129 on APL characters:
 
         234A APL FUNCTION SYMBOL DOWN TACK UNDERBAR*
         234E APL FUNCTION SYMBOL DOWN TACK JOT*
         2351 APL FUNCTION SYMBOL UP TACK OVERBAR*
         2355 APL FUNCTION SYMBOL UP TACK JOT*
         2361 APL FUNCTION SYMBOL UP TACK DIAERESIS*
 
       The parenthetical notations such as (name inconsistent with 22A5 UP TACK) proposed for 234A in N1129 are to be replaced with equivalent entries in the Annex on Additional Information on Characters for each of the above.”

Per Michel Suignard (editor of ISO-10646):

The case is hopeless. The characters were created from two different sources with two name conventions. The only thing we could do (after lengthy discussion in WG2) was to recognize that the tacks use two conventions and leave these things as they are. In those cases it was recognized that it was more important to leave the shapes alone, and fixing the names would even introduce more confusion (and you know how people resist in general to name fixing even if they are wrong).

I would have to dig up in my WG2 archive to find more details, but looking at the WG2 10646 errata document (I think N1529 is the most recent) they don't show, so this means that the case was closed as it is. Maybe worth a note in the Unicode next edition.

And then again from Michel:

And this has been on Annex P of 10646 since then for these characters as well as U+2361.
<<
234A APL FUNCTIONAL SYMBOL DOWN TACK UNDERBAR
The relation between the name of this character and the orientation of the “tack” element in its graphical symbol is inconsistent
with that of other characters in this International Standard, such as: 22A4 DOWN TACK and 22A5 UP TACK
234E APL FUNCTIONAL SYMBOL DOWN TACK JOT
Information for the character at 234A applies.
2351 APL FUNCTIONAL SYMBOL UP TACK OVERBAR
Information for the character at 234A applies.
2355 APL FUNCTIONAL SYMBOL UP TACK JOT
Information for the character at 234A applies.
2361 APL FUNCTIONAL SYMBOL UP TACK DIAERESIS
Information for the character at 234A applies.
>> 
And as guessed before, the name issue results from conflicting sources and was captured too late before we could do anything about it.

 

One of the challenges of Unicode is that it has so many characters that it is at times too hard to make everything completely consistent, especially when characters are proposed by different sources, or for different purposes, or by parties who disagree about the best way to represent something. Usually these inconsistencies are unintentional, but there are times when they seem to be intentional attempts to forward other agendas (I am being charitable in my description here of the seemingly Machiavellian attempts of some to achieve their goals).

This post brought to you by U+234a, U+234e, U+2351, U+2355, and U+2361 -- as you might have guessed.


# Luc Cluitmans on 11 Jan 2005 2:53 AM:

Just a side note for those of you who, like me, do not have all fonts installed to view those weird citizens of Unicode land discussed in this blog. I just stumbled across a web page that allows you to see what those symbols actually are in case you only see the infamous 'open rectangle': fill in the appropriate unicode code in the url http://www.fileformat.info/info/unicode/char/2355/index.htm
(here prepared for '2355') and follow that link.

Disclaimer - I have no relationship to that web site, just a happy user.

# Michael Kaplan on 11 Jan 2005 10:40 AM:

In the interests of full information, both Michel Suignard and Mike Ksar work for Microsoft, in the same group (GIFT) as I do. :-)

# Michael Kaplan on 11 Jan 2005 10:48 AM:

What a great idea! I will add this link to the site. Thanks, Luc!
