by Michael S. Kaplan, published on 2008/06/10 03:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2008/06/10/8588265.aspx
The problem that came in was an interesting one:
- Win32 application running on Windows XP (Hebrew language);
- Rich edit control in the application;
- Try to display the folder path left-to-right (i.e. in the form of C:\<Hebrew folder name>.txt, thus it is meant to be an "LTR" chunk, independent of the rest of the surrounding information);
- Current UI language settings should be used;
- Application uses BIDI control characters LRE/PDF to specify text as embedded left-to-right;
When the folder path is displayed, the text does not appear in left-to-right order.
Indeed.
Let's dig into this one a bit, shall we?
If you are like me and you have spidey senses about this kind of thing, they are probably tingling right now. You may recall posts like these ones:
but although there seem to be familiar issues, none of them are quite the same.
Let's look at UAX#9 to get the definition of these controls:
2.1 Explicit Directional Embedding
The following codes signal that a piece of text is to be treated as embedded. For example, an English quotation in the middle of an Arabic sentence could be marked as being embedded left-to-right text. If there were a Hebrew phrase in the middle of the English quotation, that phrase could be marked as being embedded right-to-left. These codes allow for nested embeddings.
Abbr.  Code    Name                     Description
LRE    U+202A  LEFT-TO-RIGHT EMBEDDING  Treat the following text as embedded left-to-right.
RLE    U+202B  RIGHT-TO-LEFT EMBEDDING  Treat the following text as embedded right-to-left.
The effect of right-left line direction, for example, can be accomplished by embedding the text with RLE...PDF.
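To make the "nested embeddings" idea a bit more concrete, here is a rough sketch in C++ that counts how deeply LRE/RLE embeddings are nested in a UTF-16 string. The function name maxEmbeddingDepth is made up for illustration, and the logic is a deliberate simplification: it ignores the override controls that PDF can also close, and it does not model the depth limit that UAX #9 imposes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not from UAX #9 or any library): returns the deepest
// LRE/RLE nesting level found in `text`.
// Assumptions: an unmatched PDF is simply ignored, and overrides (LRO/RLO)
// are not tracked at all.
std::size_t maxEmbeddingDepth(const std::u16string& text) {
    std::size_t depth = 0;
    std::size_t deepest = 0;
    for (char16_t ch : text) {
        if (ch == u'\u202A' || ch == u'\u202B') {
            // LRE or RLE opens one more embedding level.
            ++depth;
            deepest = std::max(deepest, depth);
        } else if (ch == u'\u202C' && depth > 0) {
            // PDF closes the innermost open embedding.
            --depth;
        }
    }
    return deepest;
}
```

A string with an RLE...PDF run inside an LRE...PDF run would report a depth of 2, which is the English-quotation-containing-a-Hebrew-phrase case the UAX #9 text describes.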
Okay, so much for homework. Let's get practical now!
On the whole I think we should try things out and make sure we can reproduce the issue.
We will take a nice string that meets the particular criteria, such as:
C:\שיקול דעת מוטעה.txt
Now any time someone talks about RichEdit, I like to start in Notepad, then a RichEdit control, then Word. So let's try without the U+202a (LEFT-TO-RIGHT EMBEDDING) and U+202c (POP DIRECTIONAL FORMATTING) first -- in both LTR and RTL contexts.
In Notepad:
and then a RichEdit control (I use WordPad here but you could choose any old RichEdit control):
and then in Word (I am using Word 2003 here):
Hmmm.... I see the problem here. If you put any kind of RIGHT-TO-LEFT-OSITY on top of the string, despair follows.
This does seem like a good time for those embedding characters!
So we'll use the same string but put U+202a (LEFT-TO-RIGHT EMBEDDING) prefixing the string and U+202c (POP DIRECTIONAL FORMATTING) suffixing it.
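In code, that wrapping step might look something like the sketch below. It only builds the string; it makes no claim about what any particular control will do with it. The helper name wrapLtrEmbedding and the constant names are invented for this example.

```cpp
#include <string>

// The two bidi control characters used in this test.
constexpr char16_t kLRE = u'\u202A';  // LEFT-TO-RIGHT EMBEDDING
constexpr char16_t kPDF = u'\u202C';  // POP DIRECTIONAL FORMATTING

// Hypothetical helper: returns `text` with LRE prefixed and PDF suffixed,
// so that a renderer supporting these controls treats it as an embedded
// left-to-right run.
std::u16string wrapLtrEmbedding(const std::u16string& text) {
    std::u16string out;
    out.reserve(text.size() + 2);
    out.push_back(kLRE);
    out.append(text);
    out.push_back(kPDF);
    return out;
}
```

Passing the path string from above through wrapLtrEmbedding gives the same characters with U+202a in front and U+202c behind, which is what gets pasted into each control in the tests that follow.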
Again, first in Notepad:
and then WordPad to look at RichEdit:
and then finally in Word:
Damn.
It works fine in our EDIT control, but not in our RICHEDIT control or in Word.
What's up with that?
RichEdit expert Murray Sargent explained what is going on:
No RichEdit version supports LRE, RLE, and PDF. They’re on the wish list[...]
Ah, I guess that says it all. I took the liberty of asking him to add an entry to the wishlist on behalf of those of my readers who care about such things. :-)
Though it really is times like this that I find myself thinking that all of the work in RichEdit and Word and other parts of Office to support math[1] might have been more ideally preceded by finishing this important bit about support of bidirectional text[2].
Luckily IE does the right thing here, both with the Unicode control characters and with HTML's dir attribute, which can be set on elements such as a paragraph, a span, or a div. And perhaps these non-IE rich text components will gain support before we actually need complex mathematical operations to express how long we've been waiting for it....
1 - First added to Unicode in UTR#25 at the end of August 2003.
2 - First added to Unicode in UAX #9 in the middle of August 1999.
This blog brought to you by U+202a, U+202b, and U+202c (aka LEFT-TO-RIGHT EMBEDDING, RIGHT-TO-LEFT EMBEDDING, and POP DIRECTIONAL FORMATTING).
# James on 10 Jun 2008 2:22 PM:
Hi Michael,
Thank you for the post. I am stuck on that. :-)
James
# David Weiss on 10 Jun 2008 4:56 PM:
Do your readers know what "שיקול דעת מוטעה" means? That is a very non-subtle hint about how you feel about RichEdit here.
# Michael S. Kaplan on 10 Jun 2008 5:22 PM:
It seemed mildly subtle for most readers, and both obvious and funny for the rest. :-)