Some thoughts on "Hacking the Cis-tem"


I recently read a wonderful paper by Mar Hicks called "Hacking the Cis-tem" which is about database design in the 1960s and the nascent digital state's approach to transgender individuals.

It's a short and readable paper with some jaw-dropping anecdotes. Like the man who immediately got a pay rise after his transition, despite working in exactly the same job as before; women were on a lower pay scale...

At a basic level you can see why, when computer memory was measured in tens of kilobytes, it made sense to say male==0 and female==1. Why waste precious bits on something which could only ever be binary? Why create an option to change a data field which is immutable? Why design a schema which would allow a woman to be married to another woman?

And yet, even with those constraints, people were able to change their "official" gender within the database. Oh, sure, there were all sorts of kludges (both technical and political) - but it was possible.

The paper sparked four main thoughts for me.

There's No Such Thing As Immutable

For all the talk of Blockchain solving the world's issues (🤣) sometimes it is necessary to "rewrite history". People make mistakes. Assumptions change. Knowledge improves. Lots of facts, it turns out, are matters of perspective.

A really good example of this is time. I don't mean pesky things like timezones and leap seconds. I mean that, due to general relativity, one second on the moon is not equal to one second on Earth. How does your time-ordered database cope with that?

You might very well live in a culture where divorce is impossible, or where sexual consent cannot ever be revoked, or where a person can only be married to one other person at a time. But these are all societal conventions which are liable - and indeed likely - to change.

I'm almost tempted to say that the boolean type shouldn't exist in modern databases!
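
To make that a bit more concrete, here's a minimal sketch of the alternative: a fact stored as a dated history of free-text values rather than a single fixed flag. The `Attribute` class and its method names are my own invention for illustration, not anything from the paper or from a real database engine.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Attribute:
    """A fact about a person, kept as dated entries instead of one fixed value."""

    history: list[tuple[date, str]] = field(default_factory=list)

    def record(self, effective: date, value: str) -> None:
        # Append rather than overwrite, then keep the entries in date order.
        self.history.append((effective, value))
        self.history.sort(key=lambda entry: entry[0])

    def as_of(self, when: date) -> str | None:
        # Return the latest value that was in effect on or before `when`.
        current = None
        for effective, value in self.history:
            if effective <= when:
                current = value
        return current


marital_status = Attribute()
marital_status.record(date(1960, 1, 1), "single")
marital_status.record(date(1965, 6, 1), "married")
marital_status.record(date(1972, 3, 1), "divorced")
print(marital_status.as_of(date(1970, 1, 1)))  # married
```

Nothing is ever deleted, so the old answer is still there if someone asks what the record said in 1962. It costs more than one bit, but it copes with people changing.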

Diverse Teams Build Better Products

I don't know how many computer programmers in the 1960s were part of the LGBTQ+ community. And I don't know how accepting their colleagues would have been of them.

Perhaps you have read and memorised every single one of the Falsehoods Programmers Believe About... lessons. But surely it is more efficient to build a team who are empowered enough to confidently correct their colleagues' incorrect assumptions about how the world is arranged?

We bake rigid assumptions into our designs not out of malign intent (usually) but because we're ignorant. That's only shameful if we refuse to listen to other people's experiences.

Computers Serve Humans - not the other way around

Most of us have been forced to lie to a computer at one time or another. Perhaps it is a system which insists that you must have a US-style ZIP code. Or that your name must be longer than three characters. Or that you don't have an apostrophe in your email address. Or that your wife is Mrs, not Ms.

I know for sure that you've filled in a paper form where the boxes were too small and you've had to decide how to truncate your data.

Why? Because people have designed a schema which doesn't account for the variety in the world.
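
As a rough sketch of the opposite approach, here's a form check which tries to assume as little as possible. The function name `problems_with` and its three fields are hypothetical, and the email test is deliberately crude: it only looks for something on either side of an @.

```python
def problems_with(name: str, email: str, postcode: str = "") -> list[str]:
    """Return a list of problems; an empty list means the form is accepted."""
    problems = []

    # Any non-blank name is fine: no minimum length, no restricted alphabet.
    if not name.strip():
        problems.append("name is empty")

    # Only check the rough shape of the address. Apostrophes, plus signs,
    # and so on in the local part are left alone.
    local, at, domain = email.rpartition("@")
    if not (at and local and domain):
        problems.append("email needs text on both sides of an @")

    # The postcode is optional free text, so there is nothing to reject.
    return problems


print(problems_with("Al", "o'brien@example.com"))  # []
```

A real service would probably want to confirm the address by sending a message to it, rather than by guessing which characters are allowed.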

Today's constraints aren't tomorrow's

As I said at the start, it's understandable that designers designed around the constraints they faced. But these days, we have an awareness of the likely progress of technology.

It's said that the Apollo Moon landings were only possible because the designers skated to where the puck was going to be. They made reasonable assumptions about what technology was going to be developed in the future.

Yes, we should try and build things which perform well on existing and historic hardware. But we can't ignore the fact that tomorrow's computers will be smaller, faster, cheaper, and more efficient.

Does it make sense to store a human's name as:

  name VARCHAR(32) CHARACTER SET latin1

Probably not. Disk space is cheap and getting cheaper. Perhaps people of the future will have names consisting of 500 emoji? Or perhaps people whose names contain "exotic" Unicode characters will want to use our services.

Oh, I'm sure there will be a performance hit if every column is essentially unlimited. But that's an argument to design better database engines - not to limit human expression.
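
For what it's worth, here's a small sketch of the unlimited version, using Python's built-in `sqlite3` module rather than the MySQL-flavoured snippet above. The `person` table and the `round_trip` helper are made up for this example. In MySQL, the rough equivalent would need the `utf8mb4` character set before it could hold emoji.

```python
import sqlite3


def round_trip(names: list[str]) -> list[str]:
    """Store names in an unbounded TEXT column and read them back in order."""
    conn = sqlite3.connect(":memory:")
    # TEXT has no declared length limit and is not tied to Latin-1.
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO person (name) VALUES (?)", [(n,) for n in names])
    stored = [row[0] for row in conn.execute("SELECT name FROM person ORDER BY id")]
    conn.close()
    return stored


names = ["Zoë", "李小龍", "😀" * 500]
print(round_trip(names) == names)  # True
```

This says nothing about how it performs with millions of rows, which is the fair objection.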

Read More

You can read "Hacking the Cis-tem" in the IEEE or, if that's not available to you, read the pre-print.


5 thoughts on “Some thoughts on "Hacking the Cis-tem"”

  1. @Edent Great post, but less sure about the last part. How much of a current performance hit should we take now - and should have been taken by database designers in the 1960s - to anticipate 500 emoji names?

    Future proofing is a great idea, so long as we proof against the right future. Investing for future benefit is great, investing for no benefit, rather less so. And by the time it's clear which of those it is, it's already a bit late.

    Some semi-related thoughts here
    The phoenix and the constitution

  2. @Edent in C, a bool is usually just a wrapper around an integer. A 0 is false, any other value is true. So you can in fact use it to represent all possible genders, you just need to adjust your code slightly to test the integer value of your bool.

    Is that deranged enough to qualify?

  3. @Pubstrat True. But as a counterpoint - the sewers of London.

    They were already committed to the hard & expensive process of digging them. So overspeccing for future proofing was seen as a sensible minor cost.

  4. When we were updating vCard to support gender, the initial proposal was a SEX field with values 0 and 1, based on some 1970s spec.
    We pushed back on this and brought in the free text gender it has now.

