The set of HTML <ruby> elements allows us to add pronunciation above text. For example:
"When you visit the zoo, be sure to see the panda - 熊猫."
This is written as:
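A minimal sketch of that markup, assuming the pinyin reading "xióngmāo" as the annotation text:

```html
<!-- Assumption: the pinyin "xióngmāo" is used as the pronunciation -->
<p>When you visit the zoo, be sure to see the panda -
  <ruby>熊猫<rp>(</rp><rt>xióngmāo</rt><rp>)</rp></ruby>.
</p>
```

In a browser without ruby support, this should fall back to showing the annotation inline, as 熊猫(xióngmāo).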
That is, the word or character which needs text above it is wrapped in <ruby>. The pronunciation is wrapped in <rt>. The <rp> element indicates the presence of a parenthesis, which isn't usually displayed but will be shown if the browser doesn't support <ruby>.
That's fairly easy for scripts written left-to-right. But how does it work for scripts like Arabic where the text is written right-to-left, but the user may want the pronunciations left-to-right?
Let's take the phrase "Hello World" in Arabic: مرحبا بالعالم. Google Translate tells me this is pronounced "marhaban bialealami".
For a single word, the directionality can be ignored. The browser should be smart enough to place the pronunciation above the word:
<p>Hello is: <ruby>مرحبا<rp>(</rp><rt>marhaban</rt><rp>)</rp></ruby>. What a useful word!</p>
Hello is: مرحبا. What a useful word!
What about if we have a few words - or a whole sentence - which is entirely RTL?
<p dir="rtl">مرحبا بالعالم</p>
Is displayed aligned to the right side of the screen:
There are a few ways to add pronunciation.
The first is to write each word separately. For example, <ruby>1st word</ruby> <ruby>2nd word</ruby>. Obviously, this isn't normally how you'd write an RTL language! But it does work:
<p dir="rtl"><ruby>مرحبا<rp>(</rp><rt>marhaban</rt><rp>)</rp></ruby> <ruby>بالعالم<rp>(</rp><rt>bialealami</rt><rp>)</rp></ruby></p>
Which displays as:
It helps to think of the way the characters of the script are stored in memory.
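A right-to-left word that displays as ABC is stored as CBA: the characters are kept in reading order, and the browser reverses them for display.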
So the above is written "correctly" - even though it looks odd in the source-code view.
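One way to see the stored order for yourself is to force the browser to lay the characters out left-to-right. A minimal sketch, assuming a <bdo dir="ltr"> override applied to the same Arabic word used earlier:

```html
<!-- Normal rendering: the browser lays the word out right-to-left -->
<p>مرحبا</p>

<!-- Override: characters are laid out left-to-right in stored order,
     so the first stored letter (م) appears at the far left -->
<p><bdo dir="ltr">مرحبا</bdo></p>
```

The second paragraph will look wrong to anyone who reads Arabic, and the letter shapes may vary between browsers, but it shows the order in which the characters sit in the source.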
But there is an alternative if you want the source text to look natural - i.e. [2nd word] [1st word]. It's a bit messy, but you can write the LTR pronunciations in reverse word order:
<p dir="rtl"><ruby>مرحبا بالعالم<rt>bialealami marhaban</rt></ruby></p>
But, again, that doesn't seem very satisfying! It also divorces each pronunciation from its original word, which is unfortunate for screen readers.
The Ruby layout algorithm is usually clever enough to group words separated by spaces:
Although, if the pronunciations differ significantly in length, it can get a bit messy:
In which case, you probably need to go for the first technique and wrap each word in its own <ruby> element.
It's tempting to think that simply using the <bdo> element can help us here. It can't!
The bidirectional override reverses the order of individual characters, not whole words.
<p dir="rtl"><ruby>مرحبا بالعالم<rt><bdo dir="rtl">marhaban bialealami</bdo></rt></ruby></p>
I guess you could spell each word backwards. Which would be extremely annoying for everyone and a complete nightmare for screen readers!
Instead, it can be fixed if each word is then given an explicit LTR direction:
<p dir="rtl"><ruby>مرحبا بالعالم<rt> <bdo dir="rtl"> <span dir="ltr">marhaban</span> <span dir="ltr">bialealami</span> </bdo></rt></ruby></p>
So, I think those are the only ways to mix bidirectional text with its pronunciation. But I'd welcome any corrections and suggestions!