Premature Subsetting of Web Fonts


If you thought Web Fonts were pretty nifty, then you're going to think font subsetting is really cool. No, honestly! It is! As I've written about before, you can dramatically reduce the size of your Web Fonts by cutting out characters that you don't need.

For example, suppose you don't need to include the русский алфавит - you can immediately drop 66 letters (upper- and lower-case), a whole load of accents, and a bunch of other Cyrillic stuff. That's up to 400 fewer characters - a huge space saving.
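If you want to sanity-check that figure of 66, here's a quick sketch in Python. It assumes the modern Russian alphabet is the contiguous run а to я in Unicode, plus ё, which sits just outside that run.

```python
# Lowercase Russian letters: U+0430 (а) to U+044F (я) are contiguous.
# ё (U+0451) lives just outside that run, so it is added by hand.
lower = [chr(cp) for cp in range(0x0430, 0x0450)] + ["\u0451"]

# Python's str.upper() knows the Cyrillic case mappings.
upper = [ch.upper() for ch in lower]

russian_letters = set(lower) | set(upper)
print(len(lower), "lower-case letters,", len(russian_letters), "in total")
```

The rest of the "up to 400" comes from the accents and other Cyrillic extras, which vary from font to font, so I haven't tried to count those here.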

Here's how the Font Squirrel Webfont Generator handles subsetting: [Image: Font Subsetting]

Depending on the font you start with and the characters you anticipate you'll need, you can easily optimize your webfont size more than any Brotli compression could hope to manage.

Now, what do all good computer programmers know about Premature Optimization?

So, some clever web designers have realised that they can eke out a few more microseconds of performance if they drop all of the weird characters from their fancy web fonts and stick to good ol' fashioned US-ASCII.

Which has this unfortunate side effect.

[Images: Premature Subsetting 2, Premature Subsetting]

As you can see from these two examples, there's a designer somewhere wailing and gnashing their teeth at the sight of these abominations. All that hard work - and cost - of choosing the right font ruined, because someone thought they'd drop support for European accents.
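You can catch this before your readers do. Here's a minimal sketch, assuming the subset font keeps only printable US-ASCII. The function name and the sample headline are made up for illustration. Anything it reports would typically be drawn in a fallback font.

```python
import unicodedata

def missing_characters(text, supported):
    """Return the characters in `text` that the subset font cannot draw."""
    # Normalise first so "e" + combining accent is treated as a single "é".
    text = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in text if ch not in supported and not ch.isspace()})

# Assumption: the subset keeps only printable US-ASCII (U+0020 to U+007E).
ASCII_SUBSET = {chr(cp) for cp in range(0x20, 0x7F)}

headline = "Crème brûlée at the Café Müller"  # hypothetical headline
print(missing_characters(headline, ASCII_SUBSET))
```

Run something like this over your real content, not your lorem ipsum, and you'll quickly see which "weird" characters you actually publish.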

Is this worth it? Let's take a look at the New Athena Unicode font as a good example.

New Athena Unicode is a freeware multilingual font distributed by the American Philological Association. It follows the latest version of the Unicode standard and includes characters for English and Western European languages, polytonic Greek, Coptic, Old Italic, and Demotic Egyptian (and Arabic) transliteration, as well as metrical symbols and other characters used by classical scholars.

The font contains 1825 glyphs - the TTF is 690KB. Once converted into the heavily compressed WOFF2 format, it's 149KB. Using the Font Squirrel subsetter to get just the common English language characters results in a WOFF2 font of 17KB. Tiny!
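I used Font Squirrel, but if you'd rather script it, fontTools ships a command-line subsetter called pyftsubset. The sketch below only builds the command rather than running it. The file names and the characters in `keep` are hypothetical, and the WOFF2 flavour needs fontTools to have Brotli support installed.

```python
import shlex

def to_ranges(codepoints):
    """Collapse code points into sorted, inclusive (start, end) runs."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))           # start a new run
    return ranges

def format_unicodes(ranges):
    """Format runs the way pyftsubset's --unicodes option accepts them."""
    parts = []
    for start, end in ranges:
        if start == end:
            parts.append(f"U+{start:04X}")
        else:
            parts.append(f"U+{start:04X}-{end:04X}")
    return ",".join(parts)

def pyftsubset_command(src, dst, text):
    """Build (but do not run) a pyftsubset command that keeps `text`."""
    unicodes = format_unicodes(to_ranges(ord(ch) for ch in text))
    args = ["pyftsubset", src, f"--unicodes={unicodes}",
            "--flavor=woff2", f"--output-file={dst}"]
    return " ".join(shlex.quote(arg) for arg in args)

# Printable ASCII plus a few accented letters (illustrative only).
keep = "".join(chr(cp) for cp in range(0x20, 0x7F)) + "éèüñ"
print(pyftsubset_command("NewAthenaUnicode.ttf", "subset.woff2", keep))
```

Expect slightly different file sizes from the ones I quote, since the two tools won't necessarily make the same choices about what else to keep.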

OK, that's a pretty decent chunk saved - but at the expense of most European languages. Adding back in French, German, Italian, Polish, and the Latin set takes the font up to 35KB. Yes, the size has doubled - but I'd argue it's now several times more useful.
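To put those numbers side by side, here's the arithmetic as a small Python sketch. It uses only the sizes quoted above. The labels are mine.

```python
# File sizes in KB, as quoted in this post.
sizes_kb = {
    "original TTF": 690,
    "full WOFF2": 149,
    "English-only WOFF2": 17,
    "European WOFF2": 35,
}

full = sizes_kb["full WOFF2"]
for name in ("English-only WOFF2", "European WOFF2"):
    size = sizes_kb[name]
    print(f"{name}: {size}KB, about {size / full:.0%} of the full WOFF2")

# The price of supporting those extra European languages.
extra_kb = sizes_kb["European WOFF2"] - sizes_kb["English-only WOFF2"]
print(f"Extra cost of the European characters: {extra_kb}KB")
```

Either subset is a small fraction of the full font, so most of the win is already banked before you start dropping accents.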

Be smart. Do you really need extended punctuation like ‰‱′″‴‵‶‷‸‹››※‼‽‾‿⁀⁁⁂⁃⁄⁅⁆⁇⁈⁉⁊⁋⁌⁍⁎⁏⁐⁑⁒⁓⁔⁕⁖⁗⁘⁙⁚⁛⁜?

No, probably not. Strip it out. But take a look at what you do use. There's no "correct" answer for this, by the way. If you're confident that your reporters are never going to interview someone with a Polish name - leave out those letters. Or, have your database regenerate fonts on the fly when it receives a previously unseen character.
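That last idea is less scary than it sounds. Here's a rough sketch, assuming you start from an ASCII subset and have some way of rebuilding the font. The `SubsetTracker` class and its `rebuild` callback are hypothetical. In real life the callback would call your subsetting tool and publish the new file.

```python
class SubsetTracker:
    """Remember which characters the font covers and rebuild on new ones."""

    def __init__(self, initial, rebuild):
        self.covered = set(initial)
        self.rebuild = rebuild  # hypothetical callback that regenerates the font

    def ingest(self, text):
        """Check new content. Trigger one rebuild if it has unseen characters."""
        unseen = {ch for ch in text
                  if ch not in self.covered and not ch.isspace()}
        if unseen:
            self.covered |= unseen
            self.rebuild(frozenset(self.covered))
        return unseen

# Stand-in for a real rebuild: just record what we were asked to cover.
rebuilds = []
tracker = SubsetTracker(
    initial=(chr(cp) for cp in range(0x20, 0x7F)),
    rebuild=rebuilds.append,
)

tracker.ingest("An interview with Lech Wałęsa")  # new letters: one rebuild
tracker.ingest("Another piece quoting Wałęsa")   # already covered: no rebuild
print(len(rebuilds))
```

A real version would also need to think about caching, since every rebuild changes the font file your readers have already downloaded.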

But please don't pretend that these "unusual" characters don't exist! You're publishing on the World Wide Web - act like it.

