Aggressively Defensive Programming

How much checking do we do to confirm that our code is running as intended?

I found a curious bug this weekend, which made me think about some of the assumptions that we use when programming.

Imagine sorting an array using JavaScript.

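```
var arr = [10, 5, 66, 8, 1, 3];
// Sort numerically; with no comparator, sort() compares elements as strings
arr.sort(function (a, b) { return a - b; });
```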

So far, so normal. Create an array of numbers, then sort that array. The result should always be [1, 3, 5, 8, 10, 66].
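One detail is easy to miss here. Called with no comparator, `sort()` converts the elements to strings and compares them lexicographically, so a numeric comparator is needed to get numeric order. A minimal sketch of the difference (the variable names are only illustrative):

```javascript
var numbers = [10, 5, 66, 8, 1, 3];

// No comparator: elements are converted to strings and compared
// lexicographically, so "10" sorts before "3".
var defaultSorted = numbers.slice().sort();

// Numeric comparator: returns a negative number, zero, or a positive number.
var numericSorted = numbers.slice().sort(function (a, b) {
  return a - b;
});

console.log(defaultSorted); // [ 1, 10, 3, 5, 66, 8 ]
console.log(numericSorted); // [ 1, 3, 5, 8, 10, 66 ]
```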

Would we ever need to do this?

```
if (arr[0] < arr[5]) {
  // Do something
} else {
  // THIS SHOULD NEVER HAPPEN!
  laws_of_physics_violated(arr);
}
```
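Comparing only the first and last elements is a fairly weak check. If we really wanted to distrust the sort, a fuller post-condition would walk every adjacent pair. This is only a sketch; `isSortedAscending` is a hypothetical helper, not something from the app:

```javascript
// Hypothetical helper: true when every element is <= the one after it.
function isSortedAscending(values) {
  for (var i = 1; i < values.length; i++) {
    if (values[i - 1] > values[i]) {
      return false;
    }
  }
  return true;
}

var arr = [10, 5, 66, 8, 1, 3];
arr.sort(function (a, b) { return a - b; });

if (!isSortedAscending(arr)) {
  // THIS SHOULD NEVER HAPPEN!
  throw new Error("sort() returned an unsorted array");
}
```

The check costs one extra pass over the array, which is cheap next to the sort itself.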

Sorting an array in this way should never trigger an existential crisis. But suppose it did.

A Brief Word on Android Fragmentation

Android fragmentation has never really bothered me. It's usually a case of sensibly designing a UI to flow correctly and checking the capabilities of a device. No more complex than designing for the web.

This weekend, my father and I built our first joint Android app - First Bid Bridge. It's a simple enough app that we decided to use a WebView and write the game logic in JavaScript.

All was going well until I tested the game on my Dad's phone. It didn't work. All the answers it produced were completely at odds with the answers on my phone.

We tried this on every old phone I could find. Android 1.6, 2.3, 4.2 - they all worked. iPhone, Maemo, BlackBerry - hell, even my Internet TV worked flawlessly.

Painstakingly crawling through the logic, I found the problem. A "sorted" array was reversed on his phone! (Galaxy Ace running 2.3.6 for those who want to know.)

Diagnosing

Our app looks at a hand of cards. We split the hand into suits (Spades, Hearts, Diamonds, Clubs). We record how many cards are in each suit.

We then sort the hand by length - if two of the suits have the same length, they are sorted in the order Spades > Hearts > Diamonds > Clubs.

JavaScript allows us to define our own sorting function. Here's ours (in simplified form).

First, we sort by length:

```
suitsArray.sort(function(x, y)
{
  var o1 = x["length"];
  var o2 = y["length"];
  return o2 - o1;
});
```

Then, we sort by suit order (if the lengths are the same):

```
suitsArray.sort(function(x, y)
{
  var length1 = x["length"];
  var length2 = y["length"];

  if (length1 == length2)
  {
    var suit1 = x["suit"];
    var suit2 = y["suit"];

    if(suit1 == "spades")
    {return false;}

    if(suit1 == "hearts")
    {return false;}

    if(suit1 == "diamonds")
    {return false;}

    return true;
  }
});
```

On every device I tried, this sort worked perfectly. Except on my dad's $@*&ing phone. ARGH! Even worse, we were an hour's journey away from each other, so I couldn't debug on the device.

So, that's where I came up with the idea of Aggressively Defensive Programming. It may not be an original idea - but this is how it works.

Trust, but verify

"Доверяй, но проверяй" ("Trust, but verify"), as the Russians would say.

Assume that your computer is crazy, possessed, or being zapped by cosmic radiation and check its compliance every step of the way.

Essentially, after every operation, you should verify that the operation has worked. In this case, we did this (simplified):

```
// The above sort

if ((arr[0]["length"] == arr[1]["length"]) && (arr[1]["suit"] == "spades")) {
  // Dammit!
  // Sort the array manually
} else {
  // Carry on as normal
}
```
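For the "sort the array manually" branch, one option is a small hand-written sort that doesn't rely on the engine's `sort()` at all. The sketch below is an assumption about what such a fallback could look like, not the code we shipped. `insertionSort` is a hypothetical helper, and the sample hand is invented for illustration:

```javascript
// Hypothetical fallback: a hand-written, stable insertion sort.
// compare(a, b) must return a negative number, zero, or a positive number.
function insertionSort(items, compare) {
  for (var i = 1; i < items.length; i++) {
    var current = items[i];
    var j = i - 1;
    // Shift strictly larger elements right; equal elements keep their order.
    while (j >= 0 && compare(items[j], current) > 0) {
      items[j + 1] = items[j];
      j--;
    }
    items[j + 1] = current;
  }
  return items;
}

// Longest suit first, matching the first comparator in the post.
var hand = [
  { suit: "clubs", length: 2 },
  { suit: "spades", length: 4 },
  { suit: "hearts", length: 4 },
  { suit: "diamonds", length: 3 }
];
insertionSort(hand, function (x, y) { return y.length - x.length; });
// hand is now ordered spades, hearts, diamonds, clubs
```

Insertion sort is quadratic in general, but for an array of four suits that hardly matters, and its behaviour is easy to reason about on any device.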

Where Should This Be Used?

There are four main scenarios where it makes sense to program defensively.

1. Your code has a real impact on human health (aeroplane, nuclear reactor, medical device).
2. The computer you're running on is in a harsh environment and therefore liable to unpredictable behaviour (Mars Rover).
3. A multithreaded environment where your code cannot lock resources for itself.
4. You are running on an untrusted computer i.e. one other than your own.

The first three are unlikely to trouble most of us. But the final one is interesting. Almost every piece of code we write will be run on a 3rd party's computer. One over which we have very little control.

But that's what we do when we release apps - or run websites. The code we have lovingly crafted is being run on machines which may have all manner of quirks.

How aggressive should we be with our defences? Do we assume that built in functions work - and only check ones we've written ourselves?

In Principia Mathematica, Whitehead and Russell famously went as far as proving "1+1=2".

Is that what we need? Mathematically verify every computer our code runs on? That may be overkill. But I'm starting to come round to the idea of verifying every operation - even if just to throw an exception rather than proceeding in an error-prone state.
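As a rough sketch of what "verify, then throw" could look like in practice, the wrapper below runs an operation and refuses to continue if a post-condition fails. `checked` and its parameters are hypothetical names made up for this example:

```javascript
// Hypothetical helper: run an operation, then verify its result.
function checked(description, operation, postcondition) {
  var result = operation();
  if (!postcondition(result)) {
    throw new Error("Post-condition failed: " + description);
  }
  return result;
}

var sorted = checked(
  "first element is no larger than the last",
  function () {
    return [10, 5, 66, 8, 1, 3].sort(function (a, b) { return a - b; });
  },
  function (result) {
    return result[0] <= result[result.length - 1];
  }
);
```

Failing loudly at the point of the bad result would have pointed at the sort straight away, rather than leaving wrong answers to surface somewhere later in the game logic.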

I leave you with an instructive extract from Douglas Adams's Mostly Harmless:

On board the ship, everything was as it had been for millennia, deeply dark and silent.

Click, hum.

At least, almost everything.

Click, click, hum.

Click, hum, click, hum, click, hum.

Click, click, click, click, click, hum.

Hmmm.

A low level supervising program woke up a slightly higher level supervising program deep in the ship's semi-somnolent cyberbrain and reported to it that whenever it went click all it got was a hum.

The higher level supervising program asked it what it was supposed to get, and the low level supervising program said that it couldn't remember exactly, but thought it was probably more of a sort of distant satisfied sigh, wasn't it? It didn't know what this hum was. Click, hum, click, hum. That was all it was getting.

The higher level supervising program considered this and didn't like it. It asked the low level supervising program what exactly it was supervising and the low level supervising program said it couldn't remember that either, just that it was something that was meant to go click, sigh every ten years or so, which usually happened without fail. It had tried to consult its error look-up table but couldn't find it, which was why it had alerted the higher level supervising program to the problem.

The higher level supervising program went to consult one of its own look-up tables to find out what the low level supervising program was meant to be supervising.

It couldn't find the look-up table.

Odd.

It looked again. All it got was an error message. It tried to look up the error message in its error message look-up table and couldn't find that either. It allowed a couple of nanoseconds to go by while it went through all this again. Then it woke up its sector function supervisor.

The sector function supervisor hit immediate problems. It called its supervising agent which hit problems too. Within a few millionths of a second virtual circuits that had lain dormant, some for years, some for centuries, were flaring into life throughout the ship. Something, somewhere, had gone terribly wrong, but none of the supervising programs could tell what it was. At every level, vital instructions were missing, and the instructions about what to do in the event of discovering that vital instructions were missing, were also missing.

Small modules of software - agents - surged through the logical pathways, grouping, consulting, re-grouping. They quickly established that the ship's memory, all the way back to its central mission module, was in tatters. No amount of interrogation could determine what it was that had happened. Even the central mission module itself seemed to be damaged.

This made the whole problem very simple to deal with. Replace the central mission module. There was another one, a backup, an exact duplicate of the original. It had to be physically replaced because, for safety reasons, there was no link whatsoever between the original and its backup. Once the central mission module was replaced it could itself supervise the reconstruction of the rest of the system in every detail, and all would be well.

Robots were instructed to bring the backup central mission module from the shielded strong room, where they guarded it, to the ship's logic chamber for installation.

This involved the lengthy exchange of emergency codes and protocols as the robots interrogated the agents as to the authenticity of the instructions. At last the robots were satisfied that all procedures were correct. They unpacked the backup central mission module from its storage housing, carried it out of the storage chamber, fell out of the ship and went spinning off into the void.

This provided the first major clue as to what it was that was wrong.

Further investigation quickly established what it was that had happened. A meteorite had knocked a large hole in the ship. The ship had not previously detected this because the meteorite had neatly knocked out that part of the ship's processing equipment which was supposed to detect if the ship had been hit by a meteorite.

3 thoughts on “Aggressively Defensive Programming”

1. Jon R says:

The problem here appears to be that your sort comparator function is completely broken. It doesn't even *look* like a comparator function. The interesting question isn't "why doesn't it work on your dad's phone?" but "why does it ever work?".

1. Hey Jon,

I'm using the following form (cribbed from Mozilla):

```
function compare(a, b) {
  if (a is less than b by some ordering criterion)
    return -1;
  if (a is greater than b by the ordering criterion)
    return 1;
  // a must be equal to b
  return 0;
}
```

(The function failed when using -1,1 or false,true)

Or am I missing something blindingly obvious here?

1. Jon R says:

Your first comparator looks like that, yes, but your second "sort by suit order" comparator looks nothing like that, and doesn't even return an integer. Remember that you must return 0 if the items are equal. Also, sorting twice doesn't work unless the sort is stable, and JavaScript's Array.sort() is not stable.
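Following those points, both rules can be folded into a single comparator that always returns a number and returns 0 only for equal items, so the result no longer depends on whether the engine's sort is stable. (ECMAScript 2019 and later do require a stable `sort()`, but older engines made no such promise.) This is a sketch under assumptions: `SUIT_RANK` and `compareSuits` are hypothetical names, and the object shape follows the `suit` and `length` properties used in the post.

```javascript
// Rank of each suit; a lower rank sorts earlier when lengths are equal.
var SUIT_RANK = { spades: 0, hearts: 1, diamonds: 2, clubs: 3 };

// Longest suit first; ties broken by Spades > Hearts > Diamonds > Clubs.
function compareSuits(x, y) {
  if (x.length !== y.length) {
    return y.length - x.length;
  }
  return SUIT_RANK[x.suit] - SUIT_RANK[y.suit];
}

var suitsArray = [
  { suit: "clubs", length: 2 },
  { suit: "hearts", length: 4 },
  { suit: "diamonds", length: 3 },
  { suit: "spades", length: 4 }
];
suitsArray.sort(compareSuits);
// Order is now spades (4), hearts (4), diamonds (3), clubs (2)
```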
