EBCDIC is incompatible with GDPR


Welcome to acronym city!

The Court of Appeal of Brussels has made an interesting ruling. A customer complained that their bank was spelling the customer's name incorrectly. The bank didn't have support for diacritical marks: things like á, è, ô, ü, ç, etc. Those accents are common in many languages, so it was a little surprising that the bank didn't support them.

The bank refused to spell their customer's name correctly, so the customer raised a GDPR complaint under Article 16:

The data subject shall have the right to obtain from the controller without undue delay the rectification of inaccurate personal data concerning him or her.

Cue much legal back and forth. The bank argued that they simply couldn't support diacritics due to their technology stack. Here's their argument (in Dutch - my translation follows):

Bank X also explained that the current customer data management application was launched in 1995 and is still running on a US-manufactured mainframe system. This system only supported EBCDIC ("extended binary-coded decimal interchange code"). This is an 8-bit standard for storing letters and punctuation marks, developed in 1963-1964 by IBM for their mainframes and AS/400 computers. The code comes from the use of punch cards and only contains the following characters…

(Emphasis added.)

EBCDIC is an ancient (and much hated) "standard" which should have been fired into the sun a long time ago. It baffles me that it was still being used in 1995 - let alone today.

Look, I'm not a lawyer (sorry mum!) so I've no idea whether this sort of ruling has any impact outside of this specific case. But a decade after the seminal Falsehoods Programmers Believe About Names essay, we shouldn't tolerate these sorts of flaws.

Unicode - encoded as UTF-8 - just works. Yes, I'm sure there are some edge-cases. But if you can't properly store human names in their native language, you're opening yourself up to a lawsuit.
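
Here's a minimal Python sketch of what I mean. The name is made up and this has nothing to do with the bank's actual system; it only shows that UTF-8 round-trips a name with diacritics, that plain ASCII can't hold it, and one of those edge-cases: the byte length is not the character length.

```python
import unicodedata

# Hypothetical customer name, used purely for illustration.
# NFC normalisation makes sure each accented letter is a single code point.
name = unicodedata.normalize("NFC", "Zoë Müller")

utf8_bytes = name.encode("utf-8")

# UTF-8 round-trips the name without losing the diacritics.
round_tripped = utf8_bytes.decode("utf-8")

# Edge-case: each accented letter here takes two bytes in UTF-8,
# so a field sized in bytes holds fewer characters than you might expect.
char_count = len(name)
byte_count = len(utf8_bytes)


def fits_ascii(text: str) -> bool:
    """Return True if the text can be stored in a 7-bit ASCII field."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False
```

If your storage layer measures fields in bytes, size them for `byte_count`, not `char_count`.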

33 thoughts on “EBCDIC is incompatible with GDPR”

Simon Vans-Colina

I’ve been saying this for years, including at the @bankofengland while they were speccing out RT2.

ASCII (and EBCDIC) is racism.

Just because legacy 🦕banks can’t handle UTF-8 and 24x7 RTGS doesn’t make it right.

Reply | Reply to original comment on twitter.com 2021-10-25 13:37
Andy Mabbett

Now all you need to do is change your name by deed poll [1], to "⏽⏻ r⏻n⏾⏻ E⏼⏻⭘" and you can force everyone [2] to use your favourite Unicode symbols!

[1] Deed poll not always required: https://en.wikipedia.org/wiki/Deed_of_change_of_name

[2] Well, your suppliers.

Reply 2021-10-25 16:13
1. Lee Willy Minifees
  
  As far as I know, most countries have laws that regulate what you can put into your legal name.
  
  Reply 2023-10-25 11:42
Jim Rees

EBCDIC has many code pages, just like DOS, and by selecting the correct one you can encode characters from any European language you want. So the bank's argument is not completely correct.

Reply 2021-10-25 17:53
1. Dror Harari
  
  Correct, but given that there are EBCDIC code pages for every country which are not consistent (even the encoding of simple characters like $ may change from one country to another), this prevents a central application from supporting multiple code pages (sets of characters). You would need to store, along with the name, the code page that is used and then add program code to deal with that, something that is not practical.
  
  Suing for this under GDPR makes zero sense. If your bank is an ancient dinosaur, switch bank.
  
  Reply 2023-10-25 16:06
  1. @edent
    
    "If you're being discriminated against, just take some time, money, and effort to go where you won't be."
    
    How about "Not supporting a diverse range of customers doesn't make sense. If you can't do that, shut down your organisation."?
    
    Reply 2023-10-25 16:29
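
The point in this thread about code pages disagreeing with each other, and about having to store the code page next to the name, can be sketched in Python. This is an illustration only: it uses two EBCDIC codecs that ship with Python (`cp037` and `cp500`), the helper names `encode_name` and `decode_name` are hypothetical, and the character that differs here is `!` rather than the `$` mentioned above.

```python
import unicodedata
from typing import Tuple

# Two EBCDIC code pages available in Python's standard codecs.
US_CANADA = "cp037"
INTERNATIONAL = "cp500"


def encode_name(name: str, code_page: str) -> Tuple[str, bytes]:
    """Encode a name and keep the code page label next to the bytes."""
    normalized = unicodedata.normalize("NFC", name)
    return code_page, normalized.encode(code_page)


def decode_name(record: Tuple[str, bytes]) -> str:
    """Decode a stored name using the code page recorded with it."""
    code_page, raw = record
    return raw.decode(code_page)


# The same character gets a different byte in each code page...
bang_037 = "!".encode(US_CANADA)
bang_500 = "!".encode(INTERNATIONAL)
same_byte_for_bang = bang_037 == bang_500

# ...so reading bytes with the wrong code page yields a different character.
misread = bang_037.decode(INTERNATIONAL)

# With the label stored alongside, an accented name survives the round trip.
record = encode_name("André", US_CANADA)
restored = decode_name(record)
```

Whether carrying that label through every program that touches the field is practical is exactly what the commenters disagree on.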
JohnH

Specifically, EBCDIC Code page 37 has all the Latin-1 characters.

https://en.wikipedia.org/wiki/Code_page_37

(I worked on software on AS400 that supported multiples of these codepages. Eventually, tho, we just went to using Unicode back in 1999.)

Reply 2021-10-25 19:44
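
The claim about code page 37 can be checked with Python's built-in `cp037` codec. This is a small sketch that assumes Python's codec matches the code page being discussed. It only looks at the accented and symbol range of Latin-1 (U+00A0 to U+00FF) and says nothing about how any particular mainframe application is configured.

```python
# Every character in the upper half of Latin-1 (U+00A0..U+00FF),
# which is where the accented Western European letters live.
latin1_upper = [chr(cp) for cp in range(0xA0, 0x100)]

# str.encode raises UnicodeEncodeError if cp037 lacks a character,
# so reaching the next line means all of them are representable.
encoded = [ch.encode("cp037") for ch in latin1_upper]

# Each character should take exactly one byte, with no collisions.
all_single_byte = all(len(raw) == 1 for raw in encoded)
all_distinct = len(set(encoded)) == len(encoded)

# And decoding should give back the original characters.
round_trip_ok = [raw.decode("cp037") for raw in encoded] == latin1_upper
```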
Jan

I’m happy. It feels like revenge served very cold. I tried to open a Barclays account in 2006 and have a German last name with an ö. The lady at the bank said she had to spell the name exactly as on my ID. I said, use an ö. She said I don’t have one on my keyboard. I said then use oe instead. She said she couldn’t, because she had to spell it exactly like it was on my ID. And on and on.

Reply 2021-10-25 20:21
1. JuggleT
  
  If it is a German ID, just show the machine-readable part; there the name is written with ae, oe, ue or ss.
  
  Reply 2021-10-26 10:02
  1. Jan
    
    Didn’t know, thanks! It’s 15 years ago, so I doubt I can still find her…
    
    Reply 2021-10-26 19:55
  2. Erkin Alp Güney
    
    Same in Turkish IDs. Machine readable portion spells my last name as Gueney.
    
    Reply 2023-10-26 18:50
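
The machine-readable spellings described in this thread (ae, oe, ue, ss) amount to a small transliteration table. A rough Python sketch follows, assuming only the four substitutions the commenters mention; the real machine-readable zone rules (ICAO Doc 9303) cover many more characters, and the function name `mrz_style` is hypothetical.

```python
import unicodedata

# Only the substitutions mentioned in the thread; not a complete table.
SUBSTITUTIONS = {
    "\u00e4": "AE",  # ä
    "\u00f6": "OE",  # ö
    "\u00fc": "UE",  # ü
    "\u00df": "SS",  # ß
}


def mrz_style(name: str) -> str:
    """Return an upper-case, diacritic-free spelling of the name."""
    out = []
    for ch in unicodedata.normalize("NFC", name):
        replacement = SUBSTITUTIONS.get(ch.lower())
        out.append(replacement if replacement is not None else ch.upper())
    return "".join(out)


example = mrz_style("G\u00fcney")  # "GUENEY"
```

Note that this mapping is lossy: "GUENEY" could have started out as "Güney" or "Gueney", which is one reason a transliterated spelling is not the same thing as the correct name.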
mauvedeity

Wow. This is mad bonkers, and I shall be raising this with several places that can’t get my name right forthwith!

Reply 2021-10-25 21:35
Jan (2)

"Unicode - encoded as UTF-8 - just works. Yes, I'm sure there are some edge-cases. But if you can't properly store human names in their native language, you're opening yourself up to a lawsuit."

Those edge cases are, for a large part, in human names. There are rare Chinese characters that are not in Unicode; they are rare because they are only used in a few names. And one can question whether a language like Chinese, with a long tail of very rare characters, is not effectively an open-ended set. Someone invented those characters in the past, so why won't that process continue?

All of that is not really relevant to the legal question as judges tend to take into account what is reasonable in the current day and age, which according to this court is to support at least accents.

Reply 2021-10-25 22:23
1. Erkin Alp Güney
  
  In Chinese, you could at least use a combining backspace to split characters into two existing ones and thus overtype two characters together.
  
  Reply 2023-10-26 18:53
Christopher Lord

This is not a technical limitation: come up with an encoding just like UTF-8. Encode where possible in EBCDIC, but choose a bit to indicate higher chunks are available. Migrate legacy data to the new encoding, keeping an eye out for corner cases. Tricky bit is that these old bank systems tend to have fixed-width fields, which can mess with multi-byte encodings. I did something like this back when I worked on compilers for IBM as a work-around for our test suites sometimes having UTF-8 filenames. Fairly easy to make an idempotent transformation. I should have gone full into consulting! sheesh.

Reply 2021-10-26 00:16
Karl Williamson

UTF-EBCDIC allows encoding all Unicode code points, similarly to UTF-8. https://www.unicode.org/reports/tr16/tr16-8.html There are modern Perl 5 releases available that support this which I run on z/OS; Python also is claimed to support EBCDIC, but I don't have experience with it regarding Unicode.

Both EBCDIC 1047 and 037 code pages are isomorphic to Latin1. Almost all European languages should be directly encodable via these.

Reply 2021-10-26 02:04
Ryan

wait which guitar tuning is EBCDIC

Reply | Reply to original comment on twitter.com 2021-10-26 03:50
feeder of cats ''' neurodiverse they/them

This is interesting not only for the tech implications but also: Can people whose gender is neither male nor female leverage this to get gender markers, honorifics, etc. corrected? 🤔 shkspr.mobi/blog/2021/10/e…

Reply | Reply to original comment on twitter.com 2021-10-26 06:41
Fefes Blog

This Article was mentioned on blog.fefe.de

Reply | Reply to original comment on blog.fefe.de 2021-10-26 06:53
reddit programming

EBCDIC is incompatible with GDPR shkspr.mobi/blog/2021/10/e… /post reddit.com/r/programming/…

Reply | Reply to original comment on twitter.com 2021-10-26 07:14
PUNii 💉💉

EBCDIC🥴

shkspr.mobi/blog/2021/10/e…

Reply | Reply to original comment on twitter.com 2021-10-26 07:37
Dave Cridland

The bank could just use punycode in EBCDIC of course. Just try saying that out loud without throwing up a bit.

Reply 2021-10-26 08:15
Petru Ratiu

As someone with diacritics in my name, haha, yes.

shkspr.mobi/blog/2021/10/e…

Reply | Reply to original comment on twitter.com 2021-10-26 08:16
Blair Wyman

A point worth mentioning, IMHO, is that this banking application was apparently designed and written in the 1990's, and has been serving its intended purpose for almost 30 years.

If the Y2K or Euro character events did not break it -- and I have no reason to suspect that -- this application may theoretically be unchanged since the day it was written.

Is that a Good Thing? ...or a Bad Thing? I dunno. I just know it is a Thing.

Reply 2021-10-26 16:11
Timothy

Jim Rees and others are correct, and the headline is incorrect. EBCDIC isn't the culprit. EBCDIC has had codepages for eons, and that'd be one classic way the bank could solve this problem -- or should have solved this problem decades ago. It's a well solved problem. Another way, probably better nowadays, is to use Unicode (UTF-8 probably). Whether it's IBM Z or IBM i, these systems definitely support Unicode and have since the 1990s. The implementation could be in hybrid-quick-hacky fashion. For example, put some "trigger/escape code" in the existing name field (with the current not great EBCDIC codepage choice) that then points to a UTF-8 encoded name stored alongside. It'd require an application code change, sure, but it's not rocket science actually.

Here's the real headline: "Bank that won't change anything is incompatible with the GDPR."

Reply 2021-10-28 10:11
José Ramírez

JCS wrote: No one—no one—is going to be confused when they see “Jose Ramirez” instead of “José Ramírez” but the lawyers among us still think it’s a critical issue. This is the very apotheosis of a first world problem.

Maybe you don't care, but I do!

Reply 2023-10-25 12:45
Charlie Stross

@blog Update: it turns out that EBCDIC supports code pages INCLUDING one with all the diacritical marks the bank claimed it was impossible to support! It's been available since an update to the standard in the mid-1980s! (EBCDIC code page 435.)

There's also UTF-EBCDIC, which allows EBCDIC to encode all valid Unicode character code points—over a million of them.

Verdict: the bank in question has an incompetent IT department.

Reply | Reply to original comment on wandering.shop 2024-04-28 15:22
news.ycombinator.com

EBCDIC Is Incompatible with GDPR | Hacker News

Reply | Reply to original comment on news.ycombinator.com 2025-06-11 10:15
news.ycombinator.com

EBCDIC is incompatible with GDPR | Hacker News

Reply | Reply to original comment on news.ycombinator.com 2025-06-11 10:16

Trackbacks and Pingbacks

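EU bank that cannot register names with accented letters violates the GDPR – Akimoto @ Cybozu Labs Programmer Blog

There was a lawsuit over the problem that people whose names contain letters with accent marks, such as the acute (á) or the umlaut (ü), could not register their real names at a bank in the EU. A ruling that being unable to register accent marks violates the GDPR (General Data Protection Regulation) was reportedly handed down in 2019. According to Terence Eden's blog, after the bank refused to register the name with its accent marks, the customer sued on the basis of Article 16 of the GDPR, the "right to rectification". Article 16 reads as follows.

The data subject shall have the right to obtain from the controller without undue delay the rectification of inaccurate personal data concerning him or her. Taking into account the purposes of the processing, the data subject shall have the right to have incomplete personal data completed, including by means of providing a supplementary statement.

In response, the bank argued that it was using an application developed in 1995 on a US-made mainframe, and that because this system used the EBCDIC code table established in 1963-1964, supporting these characters was technically impossible. In fact, EBCDIC also has character sets that support accent marks, such as Code page 37, so this looks like a problem of system design that did not use them. Then again, in an era when movement of people within the EU was less active than it is now, banks in countries that do not use accent marks probably did not give much thought to their small number of customers with foreign names.

In the sense of not being able to register a name in its original, correct characters, Japanese people who use kanji and kana would surely never be able to register at a European bank, but since Japan is not an EU member state, that is unlikely to be required. Not supporting the characters used in EU member states, however, is not acceptable. Anyone building a system today would base it on Unicode, so a service offered in the EU that cannot accept accent marks in names should be rare. But if you carelessly filter input characters with something like A-Za-z, complaints like this could well arrive. The ruling is being talked about now, but it was actually handed down in 2019. I wonder whether the bank managed to fix its 25-year-old system? via Twitter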

2021-10-29 05:49
jcs

Read more.

2021-10-29 18:10
Pixels of the Week – October 31, 2021 by Stéphanie Walter - UX Researcher & Designer.

2021-10-31 12:25
