The broader issue here is called "text normalization", and AFAICT there isn't a single solid open-source tool that does it (which seems odd, given how many open-source TTS engines there are). If there is one, let me know.
NVIDIA's NeMo text processing ostensibly does this task but, among other things, doesn't handle Roman numerals. https://github.com/NVIDIA/NeMo-text-processing/blob/main/tutorials/Text_(Inverse)_Normalization.ipynb
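For what it's worth, the Roman numerals part on its own is small enough to hand-roll as a pre-processing step. Below is a rough sketch, not taken from NeMo or any TTS engine: the names `roman_to_int` and `expand_roman_numerals` are my own, and it assumes you only want upper-case numerals turned into digits, leaving the digits-to-words step to whatever normalizer runs afterwards.

```python
import re

# Candidate tokens: runs of two or more upper-case Roman letters.
# Single letters are skipped on purpose so the pronoun "I" is left alone.
CANDIDATE_RE = re.compile(r"\b[MDCLXVI]{2,}\b")

# Strict shape of a well-formed Roman numeral (1 to 3999).
STRICT_RE = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a well-formed Roman numeral to an integer."""
    total = 0
    prev = 0
    # Walk right to left: a symbol smaller than the one after it is subtracted.
    for ch in reversed(numeral):
        value = VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def expand_roman_numerals(text: str) -> str:
    """Replace well-formed upper-case Roman numerals in text with digits."""

    def replace(match: re.Match) -> str:
        token = match.group(0)
        # Leave malformed runs such as "IIII" untouched.
        if STRICT_RE.fullmatch(token) is None:
            return token
        return str(roman_to_int(token))

    return CANDIDATE_RE.sub(replace, text)


print(expand_roman_numerals("I think Henry VIII read Chapter IV"))
```

This is deliberately naive: upper-case words or abbreviations that happen to be valid numerals ("MIX", "CD", "DC") will also get converted, so real use would need some context check on top, which is part of why this is a harder problem than it looks.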
There is this page that describes how it is done in general, but without reference to specific implementations; as far as I can tell, a lot of what is going on here is closed-source implementations that are nevertheless described in academic-style papers. https://devopedia.org/text-normalization
I've also asked this question here (just now); it might be worth watching for answers: https://github.com/coqui-ai/TTS/discussions/2443