Bluetooth MAC, K-Anonymity, and Population Privacy
I recently went to a university hackathon, where students were trying to invent novel ways to help prevent pandemics. This was purely an academic exercise - they were not developing a fully-fledged app, nor were they creating official policies.
I spent some time with one group discussing the privacy implications of what they had built.
Thesis
By monitoring nearby Bluetooth devices, we can tell who has come into contact with an infectious person.
We can warn people that they may have been exposed, and request that they seek treatment.


Background
Every Bluetooth device has a unique identifier called a MAC address. This ID is a 48 bit serial number in EUI-48 format.
For example: AB:CD:EF:12:34:56
The first 3 bytes (6 hex digits) of the MAC tell you the manufacturer of the device. These are public records, so you can easily see that a device near you is made by, for example, Apple or Cisco.
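As a minimal sketch of that split, assuming the address arrives as a colon- or hyphen-separated string: the manufacturer prefix (the OUI) is the first three octets and the rest is the device-specific part. The function name `split_mac` is made up for illustration. Looking the prefix up in the public registry is left out.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def split_mac(mac: str) -> tuple[str, str]:
    """Split an EUI-48 address into (manufacturer prefix, device part)."""
    octets = mac.upper().replace("-", ":").split(":")
    # An EUI-48 address is exactly six two-digit hex octets.
    if len(octets) != 6 or any(
        len(o) != 2 or not set(o) <= HEX_DIGITS for o in octets
    ):
        raise ValueError(f"not an EUI-48 address: {mac!r}")
    return ":".join(octets[:3]), ":".join(octets[3:])


oui, device = split_mac("AB:CD:EF:12:34:56")
print(oui, device)  # AB:CD:EF 12:34:56
```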
Searching
There are loads of apps which will show you every Bluetooth device your phone can see. You can see the signal strength of that device which roughly correlates to distance.
You can also see the name of the device. It might be generic ("Bose Headphones") or it may be specific ("Jo Smith's iPhone 7").
Use
Imagine you had a similar app running continuously on your phone. It would record the Bluetooth MAC of any device which was close to you for more than, say, 15 minutes.
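Here is a rough sketch of that logging rule, not a real Bluetooth scanner. It assumes something else has already produced a list of `(timestamp, mac)` sightings. The `MAX_GAP` value is my own assumption about when an encounter should count as broken. The 15 minutes is the figure from above.

```python
THRESHOLD = 15 * 60  # seconds: the "say, 15 minutes" dwell time
MAX_GAP = 2 * 60     # assumed: a longer silence ends the encounter


def close_contacts(sightings):
    """Return the MACs seen for at least THRESHOLD in one unbroken encounter.

    sightings: iterable of (unix_seconds, mac) pairs, in any order.
    """
    start = {}  # mac -> when the current encounter began
    last = {}   # mac -> most recent sighting
    contacts = set()
    for ts, mac in sorted(sightings):
        if mac not in last or ts - last[mac] > MAX_GAP:
            start[mac] = ts  # first sighting, or the gap was too long
        last[mac] = ts
        if ts - start[mac] >= THRESHOLD:
            contacts.add(mac)
    return contacts


# One device seen every minute for 15 minutes, another seen only briefly.
scans = [(t, "AB:CD:EF:12:34:56") for t in range(0, 16 * 60, 60)]
scans += [(0, "11:22:33:44:55:66"), (60, "11:22:33:44:55:66")]
print(close_contacts(scans))  # {'AB:CD:EF:12:34:56'}
```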
If, later, it was revealed that the owner of a device was infectious - you could be alerted. The hospital could upload the patient's phone's MAC onto a server. Then the server would alert people who had been close to that person while they were infectious.
Privacy
There are a few ways this sort of system could work. Let's ignore (!) privacy for now.
- Upload a list of everyone you've seen to a central database. Or...
- Send everyone the MAC of an infectious person.
Neither of these is great, is it? I don't feel comfortable sharing a list of my contacts with a central agency. That feels like a gross invasion of privacy.
Similarly, we don't want to send everyone in the world an easily-identifiable MAC which could expose the identity of a patient.
K-Anonymity
Enter the magical mathematics of K-Anonymity. Here's a brief and incomplete explanation:
- We identify an infected person and request their MAC address.
- The server splits the MAC into two pieces.
- The server sends everyone the first piece.
- If your phone has seen a device whose MAC starts with the first piece, it sends that device's second piece to the server.
- You may have seen multiple devices with the first piece, so your phone sends all of the relevant second pieces.
- If one of your second pieces matches the one on the server, you may have been exposed and will be alerted.
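The steps above can be sketched roughly as below. This is a toy, in-memory illustration under my own assumptions, not what the students built: the `Server` and `Phone` classes are hypothetical, there is no network, and the MAC is split into two equal halves. Note that with equal halves the first piece is just the manufacturer prefix.

```python
def split(mac):
    """Split a colon-separated MAC into (first half, second half)."""
    octets = mac.upper().split(":")
    return ":".join(octets[:3]), ":".join(octets[3:])


class Server:
    def __init__(self, infected_mac):
        self.prefix, self._suffix = split(infected_mac)

    def broadcast(self):
        # Only the first piece is ever sent out to phones.
        return self.prefix

    def check(self, suffixes):
        # The second piece never leaves the server; it is only compared.
        return self._suffix in suffixes


class Phone:
    def __init__(self, seen):
        self.seen = set(seen)

    def respond(self, prefix):
        # Every second piece belonging to a seen MAC with this first piece.
        return {suf for pre, suf in map(split, self.seen) if pre == prefix}


server = Server("AB:CD:EF:12:34:56")
phone = Phone(["AB:CD:EF:12:34:56", "AB:CD:EF:99:99:99", "11:22:33:44:55:66"])
reply = phone.respond(server.broadcast())
print(sorted(reply))        # ['12:34:56', '99:99:99']
print(server.check(reply))  # True
```

Even in this toy version, the phone's reply tells the server about a device ("99:99:99") that has nothing to do with the patient.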
There are all sorts of other privacy-protecting things we could do...
- Your phone could send random / misleading data back to the server in response to any query. That would prevent a central agency tracking individuals.
- Rather than two equal halves, the server could send the first quarter, and your device could respond with the last quarter.
- The data could be one-way hashed before sending in either direction.
Is this sufficiently private?
There are only 2^48 (roughly 281 trillion) possible addresses. At one byte per address, the whole range fits in 256TB. That's more storage than you have at home - but easily within the range of a well-financed organisation.
In reality, the address space is much smaller because not all addresses have been issued.
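To make those numbers concrete, here is a small sketch of the arithmetic. It also shows why one-way hashing may not help much once the manufacturer prefix is known. The choice of SHA-256 and the helper names are my assumptions. The demo only searches 16 bits of suffix to keep it fast. A full 24-bit search is about 16.7 million hashes.

```python
import hashlib

total = 2 ** 48
print(total)             # 281474976710656
print(total // 2 ** 40)  # 256 (TiB, at one byte per address)


def mac_hash(mac_int: int) -> bytes:
    """Hash a MAC given as a 48-bit integer (SHA-256 is an assumed choice)."""
    return hashlib.sha256(mac_int.to_bytes(6, "big")).digest()


def brute_force(target: bytes, oui: int, suffix_bits: int = 24):
    """Try every device suffix under one manufacturer prefix."""
    base = oui << 24
    for suffix in range(2 ** suffix_bits):
        if mac_hash(base | suffix) == target:
            return base | suffix
    return None


secret = 0xABCDEF001234  # hypothetical device under the prefix AB:CD:EF
found = brute_force(mac_hash(secret), 0xABCDEF, suffix_bits=16)
print(found == secret)   # True
```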
Using this system, how easy would it be to build up a database of everyone you have spent time with?
An open-source app could be audited to make sure that it wasn't recording and transmitting anything it shouldn't. But what's to stop the server going on a fishing expedition and continually pinging your phone with pieces of a MAC?
How do you ensure that people only install the official app and not something which steals personal information?
Is the central database of infected people's MACs a target for hackers? How could it be secured?
Are some MAC addresses so unique that even the first few bits are enough to reliably de-anonymise someone?
Is this politically acceptable?
And here's the kicker. Would you install tracking software on your phone? What if Facebook quietly switched on this feature? What about GDPR? What about...
I politely remind you that this was an idea born out of a 24 hour hackathon, by a small group of students. To my knowledge - this isn't something being developed for use.
But this sort of app could be built easily. And - if not designed correctly - it could be a privacy disaster.