Unicode Roman Numerals and Screen Readers

a11y accessibility Latin romans tts unicode · 13 comments · 800 words · Viewed ~3,327 times

How would you read this sentence out aloud?

"In Hamlet, Act Ⅳ, Scene Ⅸ..."

Most people with a grasp of the interplay between English and Latin would say "In Hamlet, Act four, scene nine". And they'd be right! But screen-readers - computer programs which convert text into speech - often get this wrong.

Why? Well, because I didn't just type "Uppercase Letter i, Uppercase Letter v". Instead, I used the Unicode symbol for the Roman numeral 4 - Ⅳ. And, it turns out, lots of screen-readers have a problem with those characters.

Don't Know Much About History

Unicode contains the range of Roman numbers from 1 - 10, plus a couple of compound numbers, 50, 100, 500, and 1000 - in a variety of forms.

Screenshot of a Table of Roman numerals in Unicode.

Why does Unicode contain these number which, to most people, are just squashed together Latin letter? As ever with Unicode, it is a mix of legacy and practicality.

The Unicode standard says:

Roman Numerals. For most purposes, it is preferable to compose the Roman numerals from sequences of the appropriate Latin letters. However, the uppercase and lowercase variants of the Roman numerals through 12, plus L, C, D, and M, have been encoded for compatibility with East Asian standards. Unlike sequences of Latin letters, these symbols remain upright in vertical layout. Additionally, in certain locales, compact date formats use Roman numerals for the month, but may expect the use of a single character.

Far be it for me to disagree with the learned authors of the spec, but I think they may have erred slightly on this one. While it may be preferable to re-use Latin letters, it leads to ambiguity which can be confusing for a screen-reader.

Practical Examples

Let's write out the numbers using regular letters. Suppose you were talking about "Romeo and Juliet, Act III, Scene I". Most screen readers will see the "III" and correctly speak aloud "Roman three" or similar. But when they get to the "I" it becomes ambiguous. Most will read out "Eye".

Screen-readers rarely look at the whole sentence for context. Which means they get confused. It's fairly obvious that XIV should be "fourteen" as there's no English word "xiv"⁰. But what about "MIX" - is that 1009 or the word "mix"?

Anyone who has watched the BBC knows about their fondness for displaying in Latin the year a programme was made. MCMXCVI is particularly challenging for a screen-reader!

Testing It

I took the following sample sentence - using both letters and Roman numerals.

Text. In Hamlet, Act I, Scene XI the year is MCMXCVI and they are watching Rocky V.

Roman. In Hamlet, Act Ⅰ, Scene Ⅺ the year is ⅯⅭⅯⅩⅭⅥ and they are watching Rocky Ⅴ.

Here's how various services coped:

Amazon Polly

First, the good news. Amazon's Polly read the Roman numerals perfectly. It even pronounced ⅯⅭⅯⅩⅭⅥ as "nineteen ninety six".

🔊

But it gets rather confused with the ambiguous English text.

Microsoft Edge Read Aloud

I tried with Microsoft Edge's Read Aloud TTS.

🔊

It and makes a bit of a hash of the English and just skips the Roman numerals.

Google Text To Speech

The same was also true with Google's TTS products.

🔊

Espeak NG

The venerable Linux utility came out with this.

🔊

It gets the "Capital i" incorrect, and reads the Roman numerals as their Unicode code points.

Jaws

My good friend Léonie Watson who writes extensively about accessibility was kind enough to record some other samples for me.

Here are Jaws' "Expressive":

🔊

And Jaws' "Eloquence:

🔊

I'm sure there's some obscure Scrabble word, but we're talking everyday use here. ↩︎

13 thoughts on “Unicode Roman Numerals and Screen Readers”

Robin Whittleton

@Edent the Z39.98 Daisy spec has semantics for roman characters: epub:type="z3998:roman". No idea if anything supports that though. https://daisy.org/activities/standards/daisy/z39-98-2012/z39-98-2012-structural-semantics-vocabulary/#roman Z39.98-2012 Structural Semantics Vocabulary - The DAISY Consortium

Reply | Reply to original comment on mastodon.social 2023-03-15 12:46
HackerNewsTop10

Unicode Roman Numerals and Screen ReadersLink: shkspr.mobi/blog/2023/03/u…Comments: news.ycombinator.com/item?id=351674…

Reply | Reply to original comment on twitter.com 2023-03-15 13:17
Lobsters

Unicode Roman Numerals and Screen Readerslobste.rs/s/sbvmkv #a11y #web shkspr.mobi/blog/2023/03/u…

Reply | Reply to original comment on twitter.com 2023-03-15 14:00
Šime Vidas

@Edent In Firefox’s reader view, some voices do read roman numerals, but as individual characters.

Reply | Reply to original comment on mastodon.social 2023-03-15 17:38
c.im

This Article was mentioned on c.im

Reply | Reply to original comment on c.im 2023-03-15 17:55
Chris Adams

@Edent I don't have a recording handy (construction work here is noisy) but using VoiceOver in Safari it reads them individually as “Roman numeral x” which works okay for everything but the date.

Reply | Reply to original comment on code4lib.social 2023-03-15 18:14
1. Gaelan
  
  Interesting, I just did right click > speech > start speaking in Safari and got different output: it got the latin "I" right, pronounced latin "XI" and "MCMXCVI" as words, and skipped over the roman characters. I guess VoiceOver uses a different code path than "standard" TTS?
  
  Reply 2023-03-15 23:39
James Powell

With minimal prompting, this is what ChatGPT gives…

How should a screen reader pronounce the following?

The musical score for Rocky IV was composed by Vince DiCola, who would later compose the music for The Transformers: The Movie. Rocky IV is the only film in the series not to feature original music by Bill Conti, who was replaced by DiCola; however, it does feature arrangements of themes composed by Conti from previous films in the series, such as "The Final Bell".

Chat GPT> A possible pronunciation of the text for a screen reader is:

"The musical score for Rocky IV was composed by Vince DiCola, who would later compose the music for The Transformers: The Movie. Rocky IV is the only film in the series not to feature original music by Bill Conti, who was replaced by DiCola; however, it does feature arrangements of themes composed by Conti from previous films in the series, such as 'The Final Bell'."

No, that is not correct. Please give guidance on how to pronounce each word aloud.

ChatGPT> I apologize for the confusion earlier. Here's a more detailed pronunciation guide for each word in the text:

The: pronounced "thuh" musical: pronounced "myoo-zi-kuhl" score: pronounced "skawr" for: pronounced "fawr" Rocky: pronounced "rah-kee" IV: pronounced "eye-vee"

…

No, that's not correct.

Correct the pronunciation of Roman numerals and use IPA.

ChatGPT> I apologize for my previous mistakes. Here's the corrected pronunciation guide with the Roman numerals represented in IPA symbols:

ðə ˈmjuzɪkəl skɔr fɔr ˈrɑki fɔr wəz kəmˈpoʊzd baɪ vɪns dɪˈkoʊlə hu wʊd ˈleɪtər kəmˈpoʊz ðə ˈmjuzɪk fɔr ði ˌtrænsfɔrmərz ðə ˈmuvi. ˈrɑki fɔr ɪz ði ˈoʊnli fɪlm ɪn ðə ˈsɪriz nɑt tuː ˈfiʧər əˈrɪdʒənəl ˈmjuzɪk baɪ bɪl ˈkɑnti hu wəz rɪˈpleɪst baɪ dɪˈkoʊlə; ˌhaʊˈɛvər, ɪt dəz ˈfiʧər əˈreɪnʤmənts əv θimz kəmˈpoʊzd baɪ ˈkɑnti frəm ˈpriviəs fɪlmz ɪn ðə ˈsɪriz, sʌʧ əz "ðə ˈfaɪnl bɛl".

I hope this corrected pronunciation guide helps.

Repeat the same for "In Hamlet, Act Ⅳ, Scene Ⅸ..."

ChatGPT> Certainly! Here's the pronunciation guide with the Roman numerals represented in IPA symbols for the sentence "In Hamlet, Act Ⅳ, Scene Ⅸ...":

ɪn ˈhæmlət, ækt fɔrθ, sin naɪn

I hope this helps!

Reply 2023-03-16 08:19
1. Joseph Morris
  
  If you ask it to normalize for TTS it works; this is with davinci-003
  
  normalize the following for text to speech: In Hamlet, Act I, Scene XI the year is MCMXCVI and they are watching Rocky V.
  
  In Hamlet, Act One, Scene Eleven, the year is Nineteen Ninety-Six, and they are watching Rocky Five.
  
  Reply 2023-03-21 18:01
Mike Gifford

ChromeVox - Didn't get the text MCMXCVI, but did manage to properly prounounce the others in the first line. It totally skipped up the Roman text.

I mentioned this article in the NVDA issue queue on the appropriate topic https://github.com/nvaccess/nvda/issues/12052

Reply 2023-03-18 20:55
Joseph Morris

The broader issue here is called "text normalization" and AFAICT there isn't one solid open source tool that does it (which seems odd, given how many open-source implementations of TTS engines there are). If there is, let me know.

NVidia's NeMo text processing ostensibly does this task but doesn't, among other things, do the roman numerals part. https://github.com/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb

There this page that describes how it is done generally but without specific reference to implementations; as far as I can tell a lot of what is going on here is closed-source implementations that are nevertheless being described in academic-style papers. https://devopedia.org/text-normalization

I've also asked this question here (just now), might be worth watching for answers https://github.com/coqui-ai/TTS/discussions/2443

Reply 2023-03-21 18:56
mina86

Debian Forky, Firefox 140.8, Orca 49.5, AT-SPI2 2.59.1 with default settings, reads both lines the same way (audio):

In Hamlet, act /eye/, scene Roman eleven, the year is <undecipherable gibberish> and they are watching Rocky /vee/.

What I dislike about Unicode Roman numerals (and in fact many other Unicode characters) is that they have to be explicitly supported by the font. IMO font render engines should understand that if the font doesn’t have a glyph for ‘Ⅺ’ it should render it as ‘XI’ rather than a nodef character or falling back to a different font.

Reply 2026-03-18 13:50
More comments on Mastodon.