When Unicode's "PDF" character isn't supported, you really *can* say that the app's Bidi support doesn't POP!

by Michael S. Kaplan, published on 2008/06/10 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/06/10/8588265.aspx


The problem that came in was an interesting one:

When displaying the folder path, it does not show the text from left to right order.

Indeed.

Let's dig into this one a bit, shall we?

If you are like me and you have spidey senses about this kind of thing, they are probably tingling right now. You may recall earlier posts here about related problems, but although the issues seem familiar, none of them are quite the same.

Let's look at UAX#9 to get the definition of these controls:

2.1 Explicit Directional Embedding

The following codes signal that a piece of text is to be treated as embedded. For example, an English quotation in the middle of an Arabic sentence could be marked as being embedded left-to-right text. If there were a Hebrew phrase in the middle of the English quotation, that phrase could be marked as being embedded right-to-left. These codes allow for nested embeddings.

| Abbr. | Code | Chart | Name | Description |
|---|---|---|---|---|
| LRE | U+202A | http://www.unicode.org/cgi-bin/refglyph?24-202A | LEFT-TO-RIGHT EMBEDDING | Treat the following text as embedded left-to-right. |
| RLE | U+202B | http://www.unicode.org/cgi-bin/refglyph?24-202B | RIGHT-TO-LEFT EMBEDDING | Treat the following text as embedded right-to-left. |

The effect of right-left line direction, for example, can be accomplished by embedding the text with RLE...PDF.
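As a quick sanity check on the controls named above, here is a small Python sketch (not part of the original investigation) that looks up the name and Bidi class the Unicode Character Database assigns to each one, using the standard `unicodedata` module. The names `EMBEDDING_CONTROLS` and `describe` are made up for this illustration.

```python
import unicodedata

# The three controls discussed in this post (the tuple name is illustrative).
EMBEDDING_CONTROLS = ("\u202a", "\u202b", "\u202c")

def describe(ch):
    """Return (code point, character name, Bidi class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))

if __name__ == "__main__":
    for ch in EMBEDDING_CONTROLS:
        print(*describe(ch))
```

The Bidi class reported for each control is simply its abbreviation (LRE, RLE, PDF), which is a handy way to confirm which invisible character actually ended up in a string.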

Okay, so much for homework. Let's get practical now!

On the whole I think we should try things out and make sure we can reproduce the issue.

We will take a nice string that meets the particular criteria, such as:

C:\שיקול דעת מוטעה.txt

Now any time someone talks about RichEdit, I like to start in Notepad, then a RichEdit control, then Word. So let's try without the U+202a (LEFT-TO-RIGHT EMBEDDING) and U+202c (POP DIRECTIONAL FORMATTING) first -- in both LTR and RTL contexts.

In Notepad:


and then a RichEdit control (I use WordPad here but you could choose any old RichEdit control):


and then in Word (I am using Word 2003 here):


Hmmm.... I see the problem here. If you put any kind of RIGHT-TO-LEFT-OSITY on top of the string, despair follows.
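One way to see why the unmarked path is so fragile is to look at the Bidi class of each character in it. The sketch below uses Python's standard `unicodedata` module; `SAMPLE_PATH`, `bidi_classes`, and `has_strong_rtl` are names made up for this illustration, and the code only reports character classes rather than running the full bidirectional algorithm.

```python
import unicodedata

# The sample path from this post; the variable name is illustrative.
SAMPLE_PATH = "C:\\שיקול דעת מוטעה.txt"

def bidi_classes(text):
    """Pair each character with its Unicode Bidi class."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

def has_strong_rtl(text):
    """True if any character is strongly right-to-left (class R or AL)."""
    return any(cls in ("R", "AL") for _, cls in bidi_classes(text))

if __name__ == "__main__":
    for ch, cls in bidi_classes(SAMPLE_PATH):
        print(f"U+{ord(ch):04X} {cls}")
```

For the sample path, the drive letter is strong left-to-right (L), the colon and the full stop are common separators (CS), the backslash is an other neutral (ON), and the Hebrew letters are strong right-to-left (R). The separators have no direction of their own, which is roughly why the surrounding context gets to rearrange the pieces.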

This does seem like a good time for those embedding characters!

So we'll use the same string but put U+202a (LEFT-TO-RIGHT EMBEDDING) prefixing the string and U+202c (POP DIRECTIONAL FORMATTING) suffixing it.
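If you would rather build the test string in code than paste invisible characters by hand, a minimal Python sketch follows. The helper name `embed_ltr` is hypothetical; the only facts it relies on are the two code points named above.

```python
LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

def embed_ltr(text):
    """Wrap text in an LRE ... PDF pair so it is treated as embedded left-to-right."""
    return f"{LRE}{text}{PDF}"

if __name__ == "__main__":
    path = "C:\\שיקול דעת מוטעה.txt"
    wrapped = embed_ltr(path)
    # Show the code points, since the controls themselves are invisible.
    print(" ".join(f"U+{ord(ch):04X}" for ch in wrapped))
```

Whether the wrapped string actually displays correctly still depends on the control that renders it.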

Again, first in Notepad:


and then WordPad to look at RichEdit:


and then finally in Word:


Damn.

It works fine in our EDIT control, but not in our RICHEDIT control or in Word.

What's up with that?

RichEdit expert Murray Sargent explained what is going on:

No RichEdit version supports LRE, RLE, and PDF. They’re on the wish list[...]

Ah, I guess that says it all. I took the liberty of asking him to add an entry to the wishlist on behalf of those of my readers who care about such things. :-)

Though it really is at times like this that I find myself thinking that all of the work in RichEdit and Word and other parts of Office to support math[1] might have been more ideally preceded by finishing this important bit about support of bidirectional text[2].

Luckily IE does the right thing here, both with the Unicode controls and with the HTML dir attribute, which can be set on elements such as a paragraph, a span, or a div. And these non-IE components for rich text might have support before we actually need complex mathematical operations to express how long we've been waiting for the support....
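For the HTML route, a rough Python sketch of generating the markup is below. The helper name `ltr_span` is hypothetical, and the choice of a span is only one of the options; `dir="ltr"` is the standard HTML attribute value for left-to-right direction, and `html.escape` comes from the Python standard library.

```python
from html import escape

def ltr_span(text):
    """Wrap text in a span whose dir attribute requests left-to-right layout."""
    # Escape the text so characters such as & and < are safe inside HTML.
    return f'<span dir="ltr">{escape(text)}</span>'

if __name__ == "__main__":
    print(ltr_span("C:\\שיקול דעת מוטעה.txt"))
```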

 

1 - First added to Unicode in UTR#25 at the end of August 2003.
2 - First added to Unicode in UAX #9 in the middle of August 1999.

 

This blog brought to you by U+202a, U+202b, and U+202c (aka LEFT-TO-RIGHT EMBEDDING, RIGHT-TO-LEFT EMBEDDING, and POP DIRECTIONAL FORMATTING).


# James on 10 Jun 2008 2:22 PM:

Hi Michael,

Thank you for the post. I am stuck on that. :-)

James

# David Weiss on 10 Jun 2008 4:56 PM:

Do your readers know what "שיקול דעת מוטעה" means? That is a very non-subtle hint about how you feel about RichEdit here.

# Michael S. Kaplan on 10 Jun 2008 5:22 PM:

It seemed mildly subtle for most readers, and both obvious and funny for the rest. :-)

