A quick look inside the HSTS file


You type example.com into your browser's address bar and it automatically redirects you to the https:// version. How does your browser know that it needed to request the more secure version of the website?

The answer is... a big list. The HTTP Strict Transport Security (HSTS) preload list is a list of domain names whose owners have told Google that they always want their website served over HTTPS. If the user tries to manually request the insecure version, the browser won't let them and uses HTTPS instead. This means that a user's connection to, for example, their bank cannot be quietly downgraded to plain HTTP. A dodgy WiFi network cannot force the user to visit an insecure and fraudulent version of a site.

After about a decade of use, the list is now 14MB in size, with around 130,000 entries in it. You can view the list online or download it.

The format is relatively straightforward:

{
 "name": "example.com",
 "policy": "bulk-1-year",
 "mode": "force-https",
 "include_subdomains": true
},

When the list is updated, Chrome creates a trie with Huffman coding compression - so it doesn't have to parse that monster file each time.
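To get a feel for why a trie suits this kind of lookup, here is a minimal sketch in Python. It is not Chrome's implementation: it skips the Huffman coding entirely, and the `PreloadTrie` class and its method names are made up for illustration. It only assumes the `name` and `include_subdomains` fields shown in the entry above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """One domain label in the trie."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    # None means no list entry ends at this node.
    include_subdomains: Optional[bool] = None


class PreloadTrie:
    """Toy trie keyed on domain labels, read right to left (com -> example)."""

    def __init__(self) -> None:
        self.root = Node()

    def add(self, name: str, include_subdomains: bool) -> None:
        node = self.root
        for label in reversed(name.lower().split(".")):
            node = node.children.setdefault(label, Node())
        node.include_subdomains = include_subdomains

    def must_use_https(self, host: str) -> bool:
        labels = host.lower().rstrip(".").split(".")
        node = self.root
        remaining = len(labels)
        for label in reversed(labels):
            node = node.children.get(label)
            if node is None:
                return False  # no entry on this branch
            remaining -= 1
            if node.include_subdomains is not None:
                # An entry matches the exact host, or any host below it
                # when include_subdomains is set.
                if remaining == 0 or node.include_subdomains:
                    return True
        return False


trie = PreloadTrie()
trie.add("example.com", include_subdomains=True)
print(trie.must_use_https("www.example.com"))
print(trie.must_use_https("example.net"))
```

Each lookup only walks as many nodes as the hostname has labels, however many entries the list holds.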

A rummage inside

The most popular (over 1,000 entries) TLDs / Public Suffixes are:

Rank TLD Entries
1 com 43,236
2 tk 19,022
3 de 5,216
4 org 4,731
5 gov 4,507
6 net 4,410
7 ga 4,326
8 nl 2,671
9 cf 2,458
10 ml 2,271
11 co.uk 2,139
12 fr 1,714
13 ru 1,516
14 eu 1,283
15 com.br 1,226
16 gq 1,225
17 io 1,215
18 com.au 1,202
19 it 1,103
20 cz 1,004
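
A tally like the one above can be sketched in a few lines of Python. This is an illustration rather than the script behind the table: `MULTI_LABEL_SUFFIXES` is a hand-picked, hypothetical stand-in for the full Public Suffix List, and `sample_entries` is made-up data in the same shape as the entries in the file.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical shortlist; a real tally would load the full Public Suffix List.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.br", "com.au"}


def suffix_of(name: str) -> str:
    """Return the two-label suffix if it is on the shortlist, else the TLD."""
    labels = name.lower().rstrip(".").split(".")
    last_two = ".".join(labels[-2:])
    if len(labels) > 2 and last_two in MULTI_LABEL_SUFFIXES:
        return last_two
    return labels[-1]


def count_suffixes(entries: Iterable[Dict]) -> Counter:
    """Count list entries per TLD / public suffix."""
    return Counter(suffix_of(entry["name"]) for entry in entries)


# Made-up entries, in the same shape as the real file.
sample_entries = [
    {"name": "example.com", "mode": "force-https", "include_subdomains": True},
    {"name": "www.example.com", "mode": "force-https", "include_subdomains": False},
    {"name": "example.co.uk", "mode": "force-https", "include_subdomains": True},
    {"name": "example.de", "mode": "force-https", "include_subdomains": True},
]

ranked = count_suffixes(sample_entries).most_common()
for rank, (suffix, total) in enumerate(ranked, start=1):
    print(rank, suffix, total)
```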

After .com, the free .tk domain names absolutely dominate. I wonder how many of them are fraudulent?

There are 2,676 .uk domain names - only 537 of which aren't on .co.uk.

Going a bit further, there are 418 IDNs (which start with xn--).

And about 187 have "porn" in the domain.
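
Counts like these are simple string checks. Here is a rough Python sketch, with hypothetical helper names and made-up sample names. It treats a name as an IDN if any of its labels starts with xn--, which may differ slightly from how the figure above was counted.

```python
from typing import Iterable


def is_idn(name: str) -> bool:
    """True if any label of the name is Punycode-encoded (starts with xn--)."""
    return any(label.startswith("xn--") for label in name.lower().split("."))


def count_idns(names: Iterable[str]) -> int:
    return sum(1 for name in names if is_idn(name))


def count_containing(names: Iterable[str], needle: str) -> int:
    """Count names that contain the given substring, ignoring case."""
    needle = needle.lower()
    return sum(1 for name in names if needle in name.lower())


# Made-up names for illustration only.
sample_names = [
    "example.com",
    "xn--bcher-kva.example",
    "www.xn--bcher-kva.example",
    "mybank.example",
]

print(count_idns(sample_names))
print(count_containing(sample_names, "bank"))
```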

You can't really extrapolate much from this as a data set. Lots of the domains seem to have expired or otherwise no longer work. https://hstspreload.org notes that, because this list is hard-coded into Chrome, it can take months before a site is added. Similarly, removal can take a long time as well.

I can't help feeling that there should be a better way to manage all this though.



3 thoughts on “A quick look inside the HSTS file”

  1. Lee Maguire says:

    With regard to there being a better way of managing it - being an out-of-channel direction to try https instead of http - this is now mostly covered by RFC 9460. The use of an HTTPS RR for a domain conveys the same logic as HSTS. (Pending support being rolled out by DNS providers.)

    (It's not treated by browsers as fully equivalent to preload, since a failure to connect may produce the option to try http. But it should cover the risk scenarios for the vast majority of sites in the current preload list.)
