In theory, you should be able to get the base favicon of any domain by calling /favicon.ico - but the reality is somewhat more complex than that. Plenty of sites use a wide variety of semi-standardised images which are usually only discoverable from the site's HTML.
There are several services which allow you to get favicons based on a domain. But they all have their problems.
- Google
  - Exposes your users to Google's tracking.
  - Relies on redirects.
- DuckDuckGo
  - Not officially supported by DDG.
- Favicon.is
  - No privacy policy whatsoever.
- Icons.horse
  - Paid service.
  - Only small size icons.
- Favicone
  - No privacy policy.
  - Only small size icons.
I want to show favicons next to specific links, but I don't want to expose my visitors to unnecessary tracking. How can I proxy these images so they are stored and served locally?
There are a few existing services. Some use Cloudflare Workers or other cloud services, and some local-first ones are unmaintained. But I found nothing modern, self-hosted, and as easy to deploy as uploading a single PHP file.
So here's my attempt to make something which will preserve user privacy, be reasonably fast and efficient, and have moderately up-to-date icons.
Getting the domain
Assume the request comes in to https://proxy.example.com/?domain=bbc.co.uk
PHP has a handy FILTER_VALIDATE_DOMAIN filter which will determine if the string is a domain.
PHP
$domain = $_GET["domain"] ?? "";
$is_valid = filter_var( $domain, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME ) !== false;
Dealing with IDNs
Some domains contain non-ASCII characters, for example https://莎士比亚.org/. Not all favicon services support Internationalised Domain Names (IDNs).
Using the idn_to_ascii() function, it is possible to get the Punycode domain.
PHP
$domain = idn_to_ascii("莎士比亚.org");
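Here's a minimal sketch of how the two steps above might fit together. The function name normalise_domain() is made up for illustration. It assumes the intl extension (which provides idn_to_ascii()) may or may not be installed, and it converts to Punycode before validating because FILTER_FLAG_HOSTNAME only accepts letters, digits, and hyphens in each label.

```php
<?php
// Hypothetical helper: returns the ASCII (Punycode) hostname, or null if it is invalid.
function normalise_domain( string $input ): ?string {
    $domain = strtolower( trim( $input ) );
    if ( $domain === "" ) {
        return null;
    }

    // idn_to_ascii() comes from the intl extension; skip the conversion if it is missing.
    if ( function_exists( "idn_to_ascii" ) ) {
        $ascii = idn_to_ascii( $domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46 );
        if ( $ascii === false ) {
            return null;
        }
        $domain = $ascii;
    }

    // Validate the ASCII form as a hostname.
    $valid = filter_var( $domain, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME );
    return $valid === false ? null : $valid;
}
```

Calling normalise_domain( $_GET["domain"] ?? "" ) then gives either a hostname that is safe to pass along, or null, in which case the default image can be served.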
Getting the image
- Check if the icon has previously been downloaded.
- If not, pick one of a few different favicon services at random.
- Download the icon.
- Save it somewhere.
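The steps above can be sketched roughly like this. It is not the final code: get_favicon() and FAVICON_SERVICES are names I've made up, and the service URLs are placeholders rather than real endpoints. The downloader is passed in as a callable so the flow can be tried without touching the network.

```php
<?php
// Placeholder URL templates: swap in the real favicon services you rotate between.
const FAVICON_SERVICES = [
    "https://icons-one.example.com/%s",
    "https://icons-two.example.com/%s",
];

// Hypothetical helper: return the cached bytes, or download, save, and return them.
// $download is any callable that takes a URL and returns the body, or false on failure.
function get_favicon( string $domain, string $cache_file, callable $download ): ?string {
    // 1. Check if the icon has previously been downloaded.
    if ( is_file( $cache_file ) ) {
        $cached = file_get_contents( $cache_file );
        return $cached === false ? null : $cached;
    }

    // 2. Pick one of the services at random.
    $template = FAVICON_SERVICES[ array_rand( FAVICON_SERVICES ) ];
    $url      = sprintf( $template, rawurlencode( $domain ) );

    // 3. Download the icon.
    $icon = $download( $url );
    if ( $icon === false || $icon === "" ) {
        return null;
    }

    // 4. Save it somewhere, creating the directory on first use.
    $dir = dirname( $cache_file );
    if ( !is_dir( $dir ) ) {
        mkdir( $dir, 0755, true );
    }
    file_put_contents( $cache_file, $icon, LOCK_EX );

    return $icon;
}
```

On a real server the callable could be "file_get_contents" (if allow_url_fopen is enabled) or a small cURL wrapper with a timeout.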
Getting the structure right
I know from my work on OpenBenches that storing tens of thousands of files in a single directory can be problematic. So I'll store the retrieved favicon in: /tld/domain/subdomain/
That will make it quick to see if an icon exists. I'll save the file with a filename based on the current timestamp. That will allow me to check if an icon is out of date, and will prevent people downloading the icons directly from me.
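As a rough sketch of that layout: the function names below are invented, and I'm assuming the simplest possible split, which is to reverse the labels of the domain (so bbc.co.uk ends up under uk/co/bbc) rather than consulting a public suffix list. It also assumes the domain has already been validated, so it can't contain slashes or empty labels, and that the filename is just the Unix timestamp with no extension.

```php
<?php
// Assumption: reverse the labels, so "www.bbc.co.uk" becomes "uk/co/bbc/www".
function cache_dir( string $base, string $domain ): string {
    $labels = array_reverse( explode( ".", $domain ) );
    return rtrim( $base, "/" ) . "/" . implode( "/", $labels );
}

// Return the path of the newest timestamp-named file, or null if there is none.
function newest_icon( string $dir ): ?string {
    $newest = null;
    foreach ( glob( $dir . "/*" ) ?: [] as $path ) {
        $name = basename( $path );
        // Only consider files whose whole name is a Unix timestamp.
        if ( ctype_digit( $name ) && ( $newest === null || (int) $name > (int) basename( $newest ) ) ) {
            $newest = $path;
        }
    }
    return $newest;
}

// An icon is stale once its timestamp is more than $max_age seconds old.
function is_stale( string $path, int $max_age, ?int $now = null ): bool {
    $now = $now ?? time();
    return ( $now - (int) basename( $path ) ) > $max_age;
}
```

A new download would then be written to cache_dir( $base, $domain ) . "/" . time(), and anything is_stale() flags can be fetched again.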
Preventing abuse
I don't want anyone but visitors to my site to be able to use this service. So I'll add a (weak) check to see if the request came from my domain.
PHP
$referer = parse_url( $_SERVER["HTTP_REFERER"] ?? "", PHP_URL_HOST );
if ( $referer === "shkspr.mobi" ) { … }
Some browsers don't send referers for privacy reasons, so their users won't see the favicons. They probably wouldn't have seen images loaded from a third-party service either, so I'll serve them a default image.
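A minimal sketch of that check with the fallback, assuming the request headers are passed in as an array shaped like $_SERVER. Both function names and the file paths in the usage note are hypothetical.

```php
<?php
// Hypothetical helper: true only when the Referer header's host matches $allowed_host.
function is_allowed_referer( array $server, string $allowed_host ): bool {
    $referer = $server["HTTP_REFERER"] ?? "";
    if ( $referer === "" ) {
        // No referer sent at all.
        return false;
    }
    // parse_url() returns null or false when there is no usable host.
    $host = parse_url( $referer, PHP_URL_HOST );
    return is_string( $host ) && strcasecmp( $host, $allowed_host ) === 0;
}

// Hypothetical helper: choose which file to serve for this request.
function icon_to_serve( array $server, string $allowed_host, string $icon_path, string $default_path ): string {
    return is_allowed_referer( $server, $allowed_host ) ? $icon_path : $default_path;
}
```

For example, icon_to_serve( $_SERVER, "shkspr.mobi", $cached_icon, "default.png" ) returns the cached icon for my own pages and the default image for everyone else. It is still a weak check, because the Referer header is trivial to forge.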
Putting it all together
You can grab the code from my personal git service.
3 thoughts on “A Self-Hosted Favicon Proxy written in PHP”
@Edent @11ty has one of these too!
11ty.dev/docs/services/indieweb-avatar/
Provided the favicon service is running on the same domain, you can probably use the Sec-Fetch-Site header
Allowing anything except a "cross-site" value is equivalent to your current referrer check (but would work even if referrers aren't sent)
@blog But can it parse the https://decentsoftwa.re/ favicon? 👹