Reverse Geocoding is Hard
My wife and I run OpenBenches - a crowd-sourced database of nearly 40,000 memorial benches. Every bench is geo-tagged with a latitude and longitude. But how do you go from a string of digits to something human readable?
How do I turn -33.755780,150.603769
into "42 Wallaby Way, Sydney, Australia"?
Luckily, that's a (somewhat) solved problem. Services like OpenCage, StadiaMaps, OpenStreetMap, and Geocode.Earth all provide APIs which transform co-ordinates into addresses. Done! Let's go home.
Except… Not everywhere has an address. Some benches are in parks. They typically don't have a street number, but might have an interesting feature nearby to help with location. For example a statue or prominent landmark.
And… Not every address is relevant. Some benches are on streets. But we probably don't want to imply that the bench is inside or belongs to a specific nearby house.
Let's step back a bit. Why do we want to display a human-readable address?
We have two use-cases.
"As a visitor to the site, I want to:"
- Read a (rough) textual representation of where the bench is.
- Click on a component of the address to see all benches within that area.
The first is easy to explain:

The second is harder. Suppose a bench is in Wellington, New Zealand. We want to create a URL like openbenches.org/location/New Zealand/Wellington/. That way, users can click on the word "Wellington" and find all the benches nearby. A user can also manually edit that URL to increase or decrease precision.
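As a rough sketch of that URL idea, here is how the path could be built from a list of components ordered broadest-first. The function name `location_url`, the `https://` scheme, and the choice to percent-encode each component are my assumptions for illustration, not how the site necessarily does it.

```python
from urllib.parse import quote


def location_url(components, base="https://openbenches.org/location/"):
    """Build a location URL from components ordered broadest-first.

    `base` is an assumed prefix; each component is percent-encoded so that
    spaces and slashes inside a place name cannot break the path.
    """
    parts = [quote(part.strip(), safe="") for part in components if part.strip()]
    return base + "".join(part + "/" for part in parts)


def all_precisions(components):
    """Return one URL per level of precision, broadest first."""
    return [location_url(components[: i + 1]) for i in range(len(components))]


if __name__ == "__main__":
    for url in all_precisions(["New Zealand", "Wellington"]):
        print(url)
```

Dropping the last path segment widens the search area, which is the "manually edit the URL" behaviour described above.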
Both of these are problems of precision.
Let's take a look at how one of the reverse geocoding services deals with transforming 51.476845,-0.295296
into an address:
Royal Botanic Gardens, Kew, Sandycombe Road, Kew, London Borough of Richmond upon Thames, London, Greater London, England, TW9 2EN, United Kingdom
That is too much address!
Yes, it is technically accurate. But it contains far too much detail for humans, the postcode is irrelevant, and the weird subdivisions are nothing that a local person would use.
Looking at the full API response, we can see:
JSON
{
    "place_id": 258770727,
    "licence": "Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright",
    "name": "Royal Botanic Gardens, Kew",
    "display_name": "Royal Botanic Gardens, Kew, Elizabeth Cottages, Kew, London Borough of Richmond upon Thames, London, Greater London, England, TW9 3NJ, United Kingdom",
    "address": {
        "leisure": "Royal Botanic Gardens, Kew",
        "road": "Elizabeth Cottages",
        "suburb": "Kew",
        "city_district": "London Borough of Richmond upon Thames",
        "ISO3166-2-lvl8": "GB-RIC",
        "city": "London",
        "state_district": "Greater London",
        "state": "England",
        "ISO3166-2-lvl4": "GB-ENG",
        "postcode": "TW9 3NJ",
        "country": "United Kingdom",
        "country_code": "gb"
    }
}
Aha! Perhaps I can build a better address using just those components!
Except… Not every country has states. And not all states are used when giving addresses. Not every location is in a city. Some places have villages, prefectures, municipalities, and hamlets.
New York, New York is a valid address, but Berlin, Berlin is not!
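To make the problem concrete, here is a naive sketch of building a shorter address from the `address` components. The key list and the function name `short_address` are my own hypothetical choices (the `village` and `town` keys are assumed to appear for smaller places), and the de-duplication rule is deliberately crude.

```python
# Hypothetical preference order, most specific first. Postcodes, ISO codes,
# and district-level subdivisions are simply never asked for.
PREFERRED_KEYS = ("leisure", "suburb", "village", "town", "city", "state", "country")


def short_address(address, keys=PREFERRED_KEYS):
    """Join the chosen components, skipping any value already mentioned."""
    parts = []
    seen = set()
    for key in keys:
        value = address.get(key)
        if not value or value in seen:
            continue
        parts.append(value)
        # Remember the value and its comma-separated pieces, so that
        # "Royal Botanic Gardens, Kew" suppresses a later bare "Kew".
        seen.add(value)
        seen.update(piece.strip() for piece in value.split(","))
    return ", ".join(parts)


if __name__ == "__main__":
    kew = {
        "leisure": "Royal Botanic Gardens, Kew",
        "suburb": "Kew",
        "city": "London",
        "state": "England",
        "postcode": "TW9 3NJ",
        "country": "United Kingdom",
    }
    print(short_address(kew))
    print(short_address({"city": "Berlin", "state": "Berlin", "country": "Germany"}))
```

This collapses "Berlin, Berlin" nicely, but it would also collapse "New York, New York", which is exactly the kind of per-country rule that a simple component picker can't know about.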
There's an address formatter by OpenCage which is pretty sensible about stripping off irrelevant details. But, to go back to my first point, not every map location on OpenBenches is a street address and - even if it is on a street - it probably shouldn't have a house number.
Well, there's kind of a solution to that! Most mapping providers have a POI function - we can find nearby things of interest and use them as a location.
Here's a bench in Cook County, Illinois, USA. The POI address is:
JSON
{ … "name": "Central Park", "coarse_location": "Des Plaines, IL, USA", … }
I assume there's only one Central Park in Des Plaines. Do people know that "IL" is Illinois? Would "Cook County" be useful?
On the subject of localisation, not everywhere speaks English. Do I want to display addresses like "原爆の子の像, 広島, 日本"? How about "原爆の子の像, Hiroshima, Japan"?
We're an international site, but most benches are in Anglophone countries.
Of course, just because something is physically near a POI, that doesn't mean it is logically close to it.
Consider a bench situated at the edge of this park.
The nearest POI is "Gay's Creamery" - across the river. Is that what you'd expect? Is there any way to easily say "if a point is inside an amenity, then use that as the address"?
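If the outline of an amenity were available as a list of vertices, the "is the point inside it?" check could be sketched with the classic ray-casting test. Everything here is an assumption for illustration: the function names, the idea that amenities arrive as `(name, polygon)` pairs, and treating latitude and longitude as flat coordinates (reasonable for a small park, not for shapes crossing the antimeridian or the poles).

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test. `polygon` is a list of (lat, lon) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


def choose_name(lat, lon, amenities, nearest_poi_name):
    """Prefer an amenity that contains the point; else fall back to the POI.

    `amenities` is a hypothetical list of (name, polygon) pairs.
    """
    for name, polygon in amenities:
        if point_in_polygon(lat, lon, polygon):
            return name
    return nearest_poi_name


if __name__ == "__main__":
    park = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
    print(choose_name(0.5, 0.9, [("The Park", park)], "Gay's Creamery"))
    print(choose_name(0.5, 1.1, [("The Park", park)], "Gay's Creamery"))
```

A bench sitting exactly on the boundary may land on either side, so "at the edge of the park" remains a judgement call.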
I don't want the users of our site to have to select from a list of POIs or addresses, this should be as automated as possible.
The Plan
For each bench:
- Use StadiaMaps to get the nearest POI.
- Get the data in English.
- Concatenate the name and coarse location.
- Save the "address".
- Wait for complaints?
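The steps above can be sketched roughly like this. I'm not reproducing any real StadiaMaps call here: `lookup_poi` and `save_address` are hypothetical callables passed in by the caller, and the only thing assumed about the POI result is that it carries the `name` and `coarse_location` fields seen in the response earlier.

```python
def build_address(poi):
    """Concatenate the POI name and its coarse location, skipping blanks."""
    name = (poi.get("name") or "").strip()
    coarse = (poi.get("coarse_location") or "").strip()
    return ", ".join(part for part in (name, coarse) if part)


def tag_benches(benches, lookup_poi, save_address, language="en"):
    """For each bench: find the nearest POI in English, then save the address.

    `lookup_poi(lat, lon, language)` and `save_address(bench_id, address)`
    are hypothetical stand-ins for the geocoder call and the database write.
    """
    for bench in benches:
        poi = lookup_poi(bench["lat"], bench["lon"], language)
        save_address(bench["id"], build_address(poi))


if __name__ == "__main__":
    def fake_lookup(lat, lon, language):
        # Stand-in for the real geocoder; returns the example from above.
        return {"name": "Central Park", "coarse_location": "Des Plaines, IL, USA"}

    saved = {}
    tag_benches(
        [{"id": 1, "lat": 0.0, "lon": 0.0}],  # placeholder coordinates
        fake_lookup,
        saved.__setitem__,
    )
    print(saved)
```

Keeping the lookup and the save as parameters means the provider can be swapped later without touching the address logic.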
Thoughts?
@Edent If you can do the geolocation in realtime when I'm uploading a picture, show the results and ask "Is this location good enough?" with a Yes/No/Edit option, that would work, wouldn't it?
@NAB I could. But I'm wary of making people do too much hard work. And I don't want people to be able to accidentally or maliciously change the address.
Perhaps a drop down is the answer?
@Edent Hmm, wonder if you could compare nearest POI with bench address, only use POI if it has enough levels of address detail in common
Something like: OBN-XXXX.
I'll post here when I have something. Just got the European patent, finally.