A quick look inside the HSTS file


You type example.com into your browser's address bar and it automatically redirects you to the https:// version. How does your browser know that it needs to request the more secure version of the website?

The answer is... A big list. The HTTP Strict Transport Security (HSTS) preload list is a list of domain names whose owners have told Google that they always want their website served over https. If the user tries to manually request the insecure version, the browser won't let them. This means that a user's connection to, for example, their bank cannot be downgraded to plain http and hijacked. A dodgy WiFi network cannot force the user to visit an insecure and fraudulent version of a site.

After about a decade of use, the list is now 14MB in size, with around 130,000 entries in it. You can view the list online or download it.

The format is relatively straightforward:

{
 "name": "example.com",
 "policy": "bulk-1-year",
 "mode": "force-https",
 "include_subdomains": true
},
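As a rough illustration of working with that format, here is a minimal Python sketch. It assumes the entries sit in a top-level "entries" array, and SAMPLE, load_entries and forces_https are names made up for this example. The downloaded file may also contain // comment lines, which json.loads rejects, so those would need stripping first.

```python
import json

# Hypothetical sample in the shape shown above; the real file is far larger.
SAMPLE = """
{
  "entries": [
    {"name": "example.com", "policy": "bulk-1-year",
     "mode": "force-https", "include_subdomains": true},
    {"name": "example.org", "policy": "bulk-1-year",
     "mode": "force-https"}
  ]
}
"""


def load_entries(text):
    """Return the list of entry dicts from a preload-style JSON document."""
    # Assumption: entries live in a top-level "entries" array.
    return json.loads(text)["entries"]


def forces_https(entry):
    """True when the entry asks for https to be enforced."""
    return entry.get("mode") == "force-https"


entries = load_entries(SAMPLE)
names = [e["name"] for e in entries if forces_https(e)]
print(names)
```

Treating a missing "include_subdomains" key as false (via entry.get) is a choice made for this sketch, not something the list's documentation is being quoted on.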

When the list is updated, Chrome creates a trie with Huffman coding compression - so it doesn't have to parse that monster file each time.
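To give a feel for why a trie suits this job, here is a toy Python version. It is not Chrome's implementation: it keys on whole domain labels from right to left, skips the Huffman coding step entirely, and the names Node, PreloadTrie and must_use_https are invented for the sketch.

```python
class Node:
    """One label in the trie, e.g. "com" or "example"."""

    def __init__(self):
        self.children = {}
        self.listed = False              # a list entry ends exactly here
        self.include_subdomains = False  # entry also covers names below it


class PreloadTrie:
    """Toy trie keyed on domain labels, right to left (com -> example -> www)."""

    def __init__(self):
        self.root = Node()

    def add(self, name, include_subdomains=False):
        node = self.root
        for label in reversed(name.lower().split(".")):
            node = node.children.setdefault(label, Node())
        node.listed = True
        node.include_subdomains = include_subdomains

    def must_use_https(self, host):
        labels = list(reversed(host.lower().rstrip(".").split(".")))
        node = self.root
        for depth, label in enumerate(labels, start=1):
            node = node.children.get(label)
            if node is None:
                return False
            # Match on the exact host, or on a parent that covers subdomains.
            if node.listed and (depth == len(labels) or node.include_subdomains):
                return True
        return False


trie = PreloadTrie()
trie.add("example.com", include_subdomains=True)
trie.add("example.org")

print(trie.must_use_https("www.example.com"))
print(trie.must_use_https("www.example.org"))
```

A lookup only walks as many nodes as the hostname has labels, however many entries are in the list, which is the property that makes this kind of structure attractive.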

A rummage inside

The most popular (over 1,000 entries) TLDs / Public Suffixes are:

Rank  TLD      Entries
1     com      43,236
2     tk       19,022
3     de        5,216
4     org       4,731
5     gov       4,507
6     net       4,410
7     ga        4,326
8     nl        2,671
9     cf        2,458
10    ml        2,271
11    co.uk     2,139
12    fr        1,714
13    ru        1,516
14    eu        1,283
15    com.br    1,226
16    gq        1,225
17    io        1,215
18    com.au    1,202
19    it        1,103
20    cz        1,004

After .com, the free .tk domain names absolutely dominate. I wonder how many of them are fraudulent?

There are 2,676 .uk domain names - only 537 of which aren't on .co.uk.

Going a bit further, there are 418 IDNs (which start with xn--).

And about 187 have "porn" in the domain.
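The last few counts are simple string checks. Here is a hedged Python sketch of how they might be done; the helper names and the sample names are made up for illustration, and checking every label for the xn-- prefix is my assumption about what counts as an IDN here.

```python
def is_idn(name):
    """True if any label is Punycode-encoded (starts with "xn--")."""
    return any(label.startswith("xn--") for label in name.lower().split("."))


def uk_outside_co_uk(names):
    """Names under .uk that are not under .co.uk."""
    return [
        n for n in names
        if n.lower().endswith(".uk") and not n.lower().endswith(".co.uk")
    ]


def containing(names, needle):
    """Names with the given substring anywhere in them."""
    return [n for n in names if needle in n.lower()]


# Invented sample names, not real list entries.
sample = [
    "example.co.uk",
    "example.org.uk",
    "example.uk",
    "xn--bcher-kva.example",
    "example.com",
]

print(sum(is_idn(n) for n in sample))
print(uk_outside_co_uk(sample))
print(containing(sample, "org"))
```

A substring match is a blunt tool: it will also catch names that merely happen to contain the letters, which is one more reason to treat these numbers as rough.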

You can't really extrapolate much from this as a data set. Lots of the domains seem to have expired or otherwise no longer work. According to https://hstspreload.org, because this list is hard-coded into Chrome, it can take months before a site is added. Similarly, removal can take a long time as well.

I can't help feeling that there should be a better way to manage all this though.



3 thoughts on “A quick look inside the HSTS file”

  1. Lee Maguire says:

    With regard to there being a better way of managing it - being an out-of-channel direction to try https instead of http - this is now mostly covered by RFC 9460. The use of an HTTPS RR for a domain conveys the same logic as HSTS. (Pending support being rolled out by DNS providers.)

    (It's not treated by browsers as fully equivalent to preload, since a failure to connect may produce the option to try http. But it should cover the risk scenarios for the vast majority of sites in the current preload list.)

