by Michael S. Kaplan, published on 2007/03/15 08:01 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/03/15/1886714.aspx
uberDave asked:
I'm making my first forays into the world of XAML/WPF, and I'm finding its handling of bidirectional text to be rather confusing. For example, consider this HTML fragment:
Blah <span>קֵינוֹ</span> <span>123 ABC</span>
Because the Hebrew text here is "stronger" than the '123', the numbers will get rendered to the left of the Hebrew text. However, if I want the numbers to be rendered on the other side of the Hebrew text, I can do this easily enough by adding a dir attribute to the Hebrew span:
Blah <span dir='ltr'>קֵינוֹ</span> <span>123 ABC</span>
Unfortunately, the equivalent XAML (inside a FlowDocument/Paragraph, anyway) seems to deviate from this. Specifically,
Blah <Span>קֵינוֹ</Span> <Span>123 ABC</Span>
renders identically to
Blah <Span FlowDirection="LeftToRight">קֵינוֹ</Span> <Span>123 ABC</Span>
Incidentally, if you copy the text from any of these examples from IE/XAMLPad and paste them into Word 2003, the numbers always get rendered to the right of the Hebrew text.
So I've got two questions:
1) Is the XAML behavior a feature or a bug?
2) Is there a workaround available so that I can put the numbers where I want them and still have logical text selection? I was able to fix the rendering issue by putting the Hebrew text in a TextBlock element, but this prevents text selection at the character level, which I would like to have available.
Well, it is impossible to get everyone to look at the behavior as a bug, since all you have to do is add the notion of a higher-level protocol giving some kind of directionality to the mix, and then you can consider the behavior to be conformant.
There is at least one regular reader who hates this notion, and I can't claim to be thrilled to death about it, either. The whole notion of trying to shove something as complex as the Unicode Bidi algorithm into any kind of markup environment where arbitrary sections of text are given a directionality that makes no sense (e.g. that LTR Hebrew span) is just weird. The WPF/Avalon behavior is both by design and a bug, in my opinion. Though it is no worse than pushing and popping strong directional formatting via the various Bidi Cf characters.
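For anyone who has not met those Bidi Cf characters, here is a minimal Python sketch that lists the explicit embedding and override controls using the standard unicodedata module. The helper names describe and embed_ltr are made up for illustration, and wrapping text in LRE ... PDF is only a rough plain-text analogue of an LTR span, not a claim about what any browser or WPF does internally.

```python
import unicodedata

# The explicit directional formatting characters: two embeddings,
# the "pop", and two overrides. All are format (Cf) characters.
BIDI_CONTROLS = ["\u202a", "\u202b", "\u202c", "\u202d", "\u202e"]


def describe(ch):
    """Return (code point, name, general category, bidi class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


def embed_ltr(text):
    """Hypothetical helper: push an LTR embedding (LRE), then pop it (PDF)."""
    return "\u202a" + text + "\u202c"


if __name__ == "__main__":
    for ch in BIDI_CONTROLS:
        print(*describe(ch))
```

Note that embed_ltr changes the stored string by adding two invisible characters, which is exactly the kind of pushing and popping that markup-level directionality replaces.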
The behavior in every case can be affected by this notion, since deciding what each case looks like can be influenced by the notion of the overall directionality going into the text.
Let's look to see what happens to the first two examples here in this browser, an arguably left-to-right kind of an environment here in the blog:
Blah קֵינוֹ 123 ABC
Blah קֵינוֹ 123 ABC
In each case, the logical order is actually unchanged. If one uses selection to look at that logical order:
Notice how the logical order never really changed no matter what kind of games were being played with directionality? The Hebrew, which is there first logically, always comes before the numbers. That behavior kind of makes sense to me (though the whole LTR Hebrew thing does not), which might tend to make the ones that behave differently seem like bugs. Though the text seems wrong to me anyway, so what is proven here? What is the strength of preserving a logical order that has nothing to do with the visual one? Maybe it helps searching while allowing bogus displays of text, but that hardly seems like a virtue.
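The logical order can also be inspected without any renderer at all. The sketch below walks the sample string in memory order and groups it into runs by bidi class, again with Python's unicodedata module. The function name bidi_runs is made up for illustration, and folding each nonspacing mark into the class of the preceding character is a simplification modeled on the Bidi algorithm's rule for such marks, not an implementation of the algorithm.

```python
import unicodedata

# The sample text from the question, stored in logical (typing) order.
SAMPLE = "Blah קֵינוֹ 123 ABC"


def bidi_runs(text):
    """Group text, in logical order, into (bidi_class, substring) runs.

    Nonspacing marks (class NSM, e.g. the Hebrew points) are assigned
    the class of the character before them.
    """
    runs = []
    prev = None
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "NSM" and prev is not None:
            cls = prev
        if runs and runs[-1][0] == cls:
            runs[-1][1] += ch
        else:
            runs.append([cls, ch])
        prev = cls
    return [(cls, chunk) for cls, chunk in runs]


RUNS = bidi_runs(SAMPLE)

if __name__ == "__main__":
    for cls, chunk in RUNS:
        print(cls, [f"U+{ord(c):04X}" for c in chunk])
```

The strong right-to-left run comes before the run of European numbers in this list, whatever any display engine later decides to do with them visually.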
In the end, what I hate most about the Bidi algorithm is its intricate rules that are designed around weird edge cases that may or may not be valid, with very little in the way of genuine strings to prove the need to define all of the strange cases. Perhaps I am alone in feeling this way, but if one were forced to come up with actual text and a need to display it in a particular way, the whole thing would not only be simpler and easier to implement, but it would also end up seeming a lot more intuitive.
So the real notion of an unchanged logical order is obfuscated by all of these higher-level protocol rules of spans and their directionality, which allow complex display to happen that still does not change the logical order -- something that is ultimately not very intuitive.
That's just my opinion; I am sure there are people who will think me wrong in this regard.... :-)
This post brought to you by ק (U+05e7, a.k.a. HEBREW LETTER QOF)