Why are QR Codes with capital letters smaller than QR codes with lower-case letters?


Take a look at these two QR codes. Scan them if you like, I promise there's nothing dodgy in them.


QR CODE   QR Code.


Left is upper-case HTTPS://EDENT.TEL/ and right is lower-case https://edent.tel/

You can clearly see that the one on the left is a "smaller" QR as it has fewer bits of data in it. Both go to the same URl, the only difference is the casing.

What's going on?

Your first thought might be that there's a different level of error-correction. QR codes can have increasing levels of redundancy in order to make sure they can be scanned when damaged. But, in this case, they both have Low error correction.

The smaller code is "Type 1" - it is 21px * 21px. The larger is "Type 2" with 25px * 25px.

The official specification describes the versions in more details. The smaller code should be able to hold 25 alphanumeric character. But https://edent.tel/ is only 18 characters long. So why is it bumped into a larger code?

Using a decoder like ZXING it is possible to see the raw bytes of each code.

UPPER

20 93 1a a6 54 63 dd 28   
35 1b 50 e9 3b dc 00 ec
11 ec 11

lower:

41 26 87 47 47 07 33 a2   
f2 f6 56 46 56 e7 42 e7
46 56 c2 f0 ec 11 ec 11  
ec 11 ec 11 ec 11 ec 11
ec 11

You might have noticed that they both end with the same sequence: ec 11 Those are "padding bytes" because the data needs to completely fill the QR code. But - hang on! - not only does the UPPER one safely contain the text, it also has some spare padding?

The answer lies in the first couple of bytes.

Once the raw bytes have been read, a QR scanner needs to know exactly what sort of code it is dealing with. The first four bits tell it the mode. Let's convert the hex to binary and then split after the first four bits:

Type HEX BIN Split
UPPER 20 93 00100000 10010011 0010 000010010011
lower 41 26 01000001 00100110 0100 000100100110

The UPPER code is 0010 which indicates it is Alphanumeric - the standard says the next 9 bits show the length of data.

The lower code is 0100 which indicates it is Byte mode - the standard says the next 8 bits show the length of data.

Type HEX BIN Split
UPPER 20 93 00100000 10010011 0010 0000 10010
lower 41 26 01000001 00100110 0100 000 10010

Look at that! They both have a length of 10010 which, converted to binary, is 18 - the exact length of the text.

Alphanumeric users 11 bits for every two characters, Byte mode uses (you guessed it!) 8 bits per single character.

But why is the lower-case code pushed into Byte mode? Isn't it using letters and number?

Well, yes. But in order to store data efficiently, Alphanumeric mode only has a limited subset of characters available. Upper-case letters, and a handful of punctuation symbols: space $ % * + - . / :

Luckily, that's enough for a protocol, domain, and path. Sadly, no GET parameters.

So, there you have it. If you want the smallest possible physical size for a QR code which contains a URl, make sure the text is all in capital letters.


This blog post was exhibited at QR Show, NYCA print out of the blog post pinned up in an exhibition space.


Share this post on…

  • Mastodon
  • Facebook
  • LinkedIn
  • BlueSky
  • Threads
  • Reddit
  • HackerNews
  • Lobsters
  • WhatsApp
  • Telegram

19 thoughts on “Why are QR Codes with capital letters smaller than QR codes with lower-case letters?”

  1. said on webtoart.com:

    题图:通关了动物井,真他妈是个好游戏,好玩极了! 【影视】左右翼都是我的翅膀!本季最强日剧怎么就拍成了缝合怪? 最近很闲就补了一下去年的大热日剧,上面图里这个 VIVANT,虽然之后口碑崩盘(但是收视率还挺高),但是看了两集之后仍然感到:这破剧本什么破玩意。潘22老师这个吐槽视频比剧还好看,但是建议快进看完前四集再看吐槽视频 😂 【道理】 The More You Own, The More You Maintain If every new feature just meant one more thing to maintain, things might not be that bad. But ten design components don't create ten relationships,
    Reply | Reply to original comment on webtoart.com

Trackbacks and Pingbacks

What are your reckons?

All comments are moderated and may not be published immediately. Your email address will not be published.

Allowed HTML: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <p> <pre> <br> <img src="" alt="" title="" srcset="">