This blog post is designed to foster a technical and logistical discussion, in much the same way as the earlier QRpedia language discussion did.

One of the most requested features in QRpedia is to have custom URLs.

For example, the British Museum may want a URL of "bm.qrwp.org". This has two main advantages.

  1. Better analytics. Although the British Museum is the only place likely to have the Rosetta Stone, many museums will have exhibits about "Ancient Egypt" or "Gold". Differentiating between museums makes each museum's statistics easier to view.
  2. Branding opportunities. A user will know that they've scanned a code belonging to a specific museum.

From a technical perspective, this is fairly easy to implement. Assuming that a museum is only generating codes in one language, we simply map $museum.qrwp to $language.qrwp - and record the scan in the logging database as usual.
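
As a rough sketch of that mapping (not the actual QRpedia code), assume a hypothetical lookup table called MUSEUM_LANGUAGE and a plain list standing in for the logging database. Both names are made up for illustration:

```python
# Hypothetical table: custom museum subdomain -> language code.
MUSEUM_LANGUAGE = {"bm": "en"}


def resolve_host(host, log):
    """Map $museum.qrwp.org to $language.qrwp.org and log the museum."""
    subdomain, _, rest = host.partition(".")
    language = MUSEUM_LANGUAGE.get(subdomain)
    if language is None:
        # Not a registered museum: treat it as an ordinary language host.
        return host
    log.append(subdomain)  # stand-in for a write to the logging database
    return f"{language}.{rest}"


scan_log = []
print(resolve_host("bm.qrwp.org", scan_log))
print(resolve_host("fr.qrwp.org", scan_log))
```

Note that this only works while "bm" means the museum and nothing else, which is exactly the problem discussed in the rest of this post.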

However, there are a number of challenges around the naming of museums, which mean considerable thought is needed before we implement this.


QR codes work best when the URL inside them is as short as possible.

This means we don't want a URL like "BritishMuseum.qrwp.org" or even "PrestongrangeIndustrialHeritageMuseum.qrwp.org".

So, we need to choose suitable abbreviations.
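
To give a feel for why length matters, here is a small sketch that estimates the smallest QR code version able to hold a URL. The capacities are the byte-mode limits for versions 1 to 5 at the lowest error-correction level (L); a real encoder may pick a different mode or level. The example URLs and the article path are assumptions for illustration only.

```python
# Byte-mode capacity of QR versions 1-5 at error-correction level L.
BYTE_CAPACITY_L = {1: 17, 2: 32, 3: 53, 4: 78, 5: 106}


def smallest_version(url):
    """Return the smallest QR version (1-5) that can hold url, or None."""
    size = len(url.encode("utf-8"))
    for version, capacity in sorted(BYTE_CAPACITY_L.items()):
        if size <= capacity:
            return version
    return None


short_url = "http://bm.qrwp.org/Rosetta_Stone"
long_url = "http://BritishMuseum.qrwp.org/Rosetta_Stone"
print(len(short_url), smallest_version(short_url))
print(len(long_url), smallest_version(long_url))
```

Under these assumptions the short URL just fits a version 2 code (25x25 modules), while the long one needs version 3 (29x29 modules), which is denser and harder to scan at the same print size.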

Language Clashes

We could create a custom URL for the British Museum of "bm". However, that's also the language code for the Bambara language.

There are several Language Codes in use - covering two and three letter combinations. There are currently 282 different language versions of Wikipedia.

Those mostly use two or three letters to distinguish between languages - but there is the occasional surprise like "bat-smg".
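
A check for this kind of clash could be as simple as the sketch below. The LANGUAGE_CODES set here is a tiny hand-picked sample, not the full list of Wikipedia language codes; a real check would need to load the complete list from somewhere authoritative.

```python
# Small illustrative sample of Wikipedia language codes (NOT the full list).
LANGUAGE_CODES = {"en", "fr", "de", "bm", "bat-smg"}


def clashes_with_language(candidate):
    """Return True if a proposed museum code is already a language code."""
    return candidate.lower() in LANGUAGE_CODES


print(clashes_with_language("bm"))
print(clashes_with_language("brit"))
```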

Abbreviation Clashes

Suppose that the British Museum wanted a custom URL of "brit.qrwp.org" - that may clash with the (fictitious) Brazilian Institute for Technology.
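
To make the problem concrete, here is a minimal sketch of a registry that hands each abbreviation to whoever claims it first. First come, first served is only one possible policy, and arguably not a fair one; the class and method names are invented for this example.

```python
class AbbreviationRegistry:
    """Toy in-memory registry: one owner per abbreviation."""

    def __init__(self):
        self._owners = {}

    def claim(self, code, museum):
        """Return True if museum now owns code, False if someone else does."""
        key = code.lower()  # hostnames are case-insensitive
        owner = self._owners.setdefault(key, museum)
        return owner == museum


registry = AbbreviationRegistry()
print(registry.claim("brit", "British Museum"))
print(registry.claim("brit", "Brazilian Institute for Technology"))
```

The second claim is refused, which is exactly the situation where we need a better rule than "whoever asked first".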

We Need...

We need to meet these aims for custom URLs:

  1. Short
  2. Unique
  3. Recognisable
  4. Fairly distributed

How on Earth do we do that?

On your marks... Get set... Discuss!

  1. Language-code clashes could be avoided by using four-letter codes, for a small trade-off with length.

    Ambiguity could be dealt with on a first-come, first-served basis; that should also provide an incentive for early adoption ;-)

    Are there any services/protocols which already use short UIDs for museums and galleries, I wonder?

  2. Ambiguity and first-past-the-post could be avoided in the UK simply by adopting the old MDA codes, which should already be familiar to most museums (http://collectionslink.org.uk/home/mda-codes). International ambiguity could be avoided by prefixing these with UK. Doesn't necessarily help museums outside the UK but it could go some way to avoiding the ridiculous scramble when .museum appeared.

  3. Some thought: Well the MDA codes for UK seem to be the heir apparent. Although I was amused to see the MLA comment on their site "Please note that an MDA Code is exclusive to the individual organisation, and should not be used for any other purpose." I have no idea what it is meant to mean - unless the Deal museum code of "DEATH" (really!) means that I cannot write this sentence unless it is under the threat of "cannot say this".

    As with all code systems we now have the problem of letting other countries join in (add your own joke about the European Union here). Luckily language can be used to differentiate different museum systems, except for "global" languages like English and French, where it is unlikely that there are coding systems available to differentiate Australian from American museums. What might be worth knowing is if Europeana has a museum naming system which would cover 27 (26?) countries - but I suspect it may be overly complex.

    OR. If the museum code is unique then you may not need the language code, as British Museum=English (for rare cases you could still allow the museum to specify a non-default language where they might have a special exhibition in Welsh). This would mean that the 5 character MLA code would in most cases be a 3 character addition.

    Another unique and short(ish) system would be their primary IP address in hex, which would be 6-8 extra characters but would be international and scalable.

    So if we have a table that provides a look up from a registered IP address to a GLAM name and its default language then we could do this for an extra 6 characters (8 minus the 2/3 character language code). Do 6 extra characters to define every GLAM in the world seem OK?

    Actually in 99.x% of cases we can get the GLAM name and their default language from their IP address...

  4. Oh and their logo (via favicon) and their email address. Actually for their main code we only need the 8 characters and qrwp.org to identify a museum website in c. 17 characters. That's a pretty tight QR code.

  5. Have you considered using the location (longitude/latitude)? You can either
    ask the phone for its location or include it in the URL. Based on that you can
    easily derive where someone was standing. That way you don't have to maintain a
    registry beforehand.

  6. First, sorry for my English. I think QR codes for Wikipedia could use a unique ID for each article instead of the lemma (the article title). The German Wikipedia supports this, so instead of the URL http://de.wikipedia.org/wiki/Wien you can use the URL with the ID
    http://de.wikipedia.org/w/index.php?curid=5632 - If the article is later moved, e.g. to Vienna (Austria), the ID stays the same.

    The second reason to use the ID is the shorter URL: the number is small (currently up to 6 digits) compared with a long article lemma, so the QR code will be much smaller than with the lemma.
    One database in Austria already works with these IDs (you can see it at the Noe Landesmuseum, with a link to Wikipedia), unfortunately only in German.

    This is only an idea on the matter.
    thx

    1. That's a good idea. The only issue is when a museum visitor scans the code, what will they see? A random string of numbers doesn't tell them if the code is for "Vienna", or "Salzburg" - so they won't know what they are getting.
      When they look at their QR scanner history later - they won't be able to tell which is which.

      I think it could be very useful when referring to long URLs like Donaudampfschiffahrtsgesellschaftskapitän!

  7. Not only for long links. Bear in mind that an article can get another lemma when it is moved within the system. I am thinking of names of people: at first there is one article with the lemma (purely hypothetical ;-) John Mayer. You make the code, but half a year later the article has the lemma John Mayer (author), because a second John Mayer is a politician.
    After the move from John Mayer to John Mayer (author) the article has the same ID, while John Mayer is now a disambiguation page: this is not what I was searching for :-(

    So maintenance of the code labels will be much faster and easier, because there is less of it.

    regards Karl
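
The hex IP address idea from comment 3 can be sketched as follows, using Python's standard ipaddress module. It assumes IPv4 only and zero-pads to a fixed 8 characters; the function names are invented for this example, and 192.0.2.1 is a reserved documentation address, not a real museum.

```python
import ipaddress


def ip_to_hex(ip):
    """Encode an IPv4 address as 8 lowercase hex characters."""
    return format(int(ipaddress.IPv4Address(ip)), "08x")


def hex_to_ip(code):
    """Decode an 8-character hex code back to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(code, 16)))


code = ip_to_hex("192.0.2.1")
print(code, hex_to_ip(code))
```

A lookup table from that code to a GLAM name and default language would still be needed, as the comment says; this only shows that the identifier itself stays at 8 characters.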

