<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>database &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/database/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Sat, 04 Apr 2026 08:53:17 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>database &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[Doctrine - difference between bindValue() and setParameter() on prepared statements]]></title>
		<link>https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/</link>
					<comments>https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 11 May 2023 11:34:34 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[Symfony]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=45734</guid>

					<description><![CDATA[This pissed me off and I couldn&#039;t figure out what I was doing wrong. So I&#039;m blogging about my ignorance.  Imagine you&#039;re using Symfony and Doctrine to access a database. You are using prepared statements to prevent any SQL injection problems.  There are two main ways of doing this - and they disagree about how positional variables should be specified.  Data Retrieval And Manipulation  Here&#039;s a…]]></description>
										<content:encoded><![CDATA[<p>This pissed me off and I couldn't figure out what I was doing wrong. So I'm blogging about my ignorance.</p>

<p>Imagine you're using Symfony and Doctrine to access a database. You are using prepared statements to prevent any SQL injection problems.</p>

<p>There are two main ways of doing this - and they disagree about how positional variables should be specified.</p>

<h2 id="data-retrieval-and-manipulation"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#data-retrieval-and-manipulation">Data Retrieval And Manipulation</a></h2>

<p>Here's a fairly trivial SQL statement with a couple of variables:</p>

<pre><code class="language-php">$sql = "SELECT `userID` FROM `users` WHERE `firstname` LIKE ? AND `surname` LIKE ?";
$stmt = $conn-&gt;prepare($sql);
$stmt-&gt;bindValue(1, $user_input_1st_name);
$stmt-&gt;bindValue(2, $user_input_2nd_name);
$results = $stmt-&gt;executeQuery();
</code></pre>

<p>Pretty easy, right? Write your SQL as normal, but place <code>?</code> where you want user-supplied variables to be. This uses <code>bindValue()</code> to set the variables in the query.</p>

<blockquote><p>The approach using question marks is called positional, because the values are bound in order from left to right to any question mark found in the previously prepared SQL query. That is why you specify the position of the variable to bind into the bindValue() method
<a href="https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/data-retrieval-and-manipulation.html#dynamic-parameters-and-prepared-statements">Doctrine: Data Retrieval And Manipulation</a></p></blockquote>

<h2 id="query-builder"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#query-builder">Query Builder</a></h2>

<p>Doctrine also offers an <a href="https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/query-builder.html">SQL Query Builder</a> - it looks like this:</p>

<pre><code class="language-php">$queryBuilder = $conn-&gt;createQueryBuilder();
$queryBuilder
    -&gt;select("userID")
    -&gt;from("users")
    -&gt;where("firstname LIKE ? AND surname LIKE ?")
    -&gt;setParameter(0, $user_input_1st_name)
    -&gt;setParameter(1, $user_input_2nd_name);
$results = $queryBuilder-&gt;executeQuery();
</code></pre>

<p>Notice the difference? Yes! <code>setParameter()</code> is <em>zero</em> based!</p>

<blockquote><p>The numerical parameters in the QueryBuilder API start with the needle 0.
<a href="https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/query-builder.html#security-safely-preventing-sql-injection">SQL Query Builder</a></p></blockquote>
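
<p>If you want to poke at the one-based half of this without a Symfony project, plain PDO behaves the same way. This is only a sketch: it uses PHP's built-in PDO rather than Doctrine, it assumes the <code>pdo_sqlite</code> extension is available, and the table and sample row are made up for illustration.</p>

<pre><code class="language-php">&lt;?php
// Plain PDO with an in-memory SQLite database (an assumption for this sketch; not Doctrine).
$pdo = new PDO('sqlite::memory:');
$pdo-&gt;setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo-&gt;exec('CREATE TABLE users (userID INTEGER PRIMARY KEY, firstname TEXT, surname TEXT)');
$pdo-&gt;exec("INSERT INTO users (firstname, surname) VALUES ('Ada', 'Lovelace')");

$sql = 'SELECT userID FROM users WHERE firstname LIKE ? AND surname LIKE ?';

// bindValue() counts positional placeholders from 1.
$stmt = $pdo-&gt;prepare($sql);
$stmt-&gt;bindValue(1, 'Ada');
$stmt-&gt;bindValue(2, 'Lovelace');
$stmt-&gt;execute();
$bound = $stmt-&gt;fetchAll(PDO::FETCH_COLUMN);

// Passing a list to execute() uses an ordinary zero-indexed PHP array.
$stmt = $pdo-&gt;prepare($sql);
$stmt-&gt;execute(['Ada', 'Lovelace']);
$listed = $stmt-&gt;fetchAll(PDO::FETCH_COLUMN);

// Both routes find the same row.
var_dump($bound === $listed); // bool(true)
</code></pre>

<p>So the first <code>?</code> is "1" when you bind it explicitly and element "0" when you hand over a list. Same query, two ways of counting.</p>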

<h2 id="why-the-difference"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#why-the-difference">Why the difference?</a></h2>

<p>Because the universe hates you, I guess?</p>

<h2 id="solving-the-issue"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#solving-the-issue">Solving the issue</a></h2>

<p>I've sent a pull request to make the documentation clearer.  In the meantime, both methods accept <em>named</em> parameters.</p>

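<pre><code class="language-php">// Prepared statement with a named placeholder
$sql = "SELECT `userID` FROM `users` WHERE `firstname` LIKE :first";
$stmt = $conn-&gt;prepare($sql);
$stmt-&gt;bindValue("first", $user_input_1st_name);
$results = $stmt-&gt;executeQuery();

// Query Builder with the same named placeholder
$queryBuilder = $conn-&gt;createQueryBuilder();
$queryBuilder
    -&gt;select("userID")
    -&gt;from("users")
    -&gt;where("firstname LIKE :first")
    -&gt;setParameter("first", $user_input_1st_name);
$results = $queryBuilder-&gt;executeQuery();
</code></pre>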

<p>I hope that helps remove some confusion for future users. Even if it's only me!</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=45734&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Some thoughts on "Hacking the Cis-tem"]]></title>
		<link>https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/</link>
					<comments>https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Tue, 04 Apr 2023 11:34:49 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[gender]]></category>
		<category><![CDATA[trans]]></category>
		<category><![CDATA[🏳️‍⚧️]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=45240</guid>

					<description><![CDATA[I recently read a wonderful paper by Mar Hicks called &#34;Hacking the Cis-tem&#34; which is about database design in the 1960s and the nascent digital state&#039;s approach to transgender individuals.  It&#039;s a short and readable paper with some jaw-dropping anecdotes. Like the man who immediately got a pay rise after his transition, despite working in exactly the same job as before; women were on a lower pay…]]></description>
										<content:encoded><![CDATA[<p>I recently read a wonderful paper by <a href="https://marhicks.com/profile.html">Mar Hicks</a> called "Hacking the Cis-tem" which is about database design in the 1960s and the nascent digital state's approach to transgender individuals.</p>

<p>It's a short and readable paper with some jaw-dropping anecdotes. Like the man who immediately got a pay rise after his transition, despite working in exactly the same job as before; women were on a lower pay scale...</p>

<p>At a basic level you can see why, when computer memory was measured in tens of kilobytes, it made sense to say <code>male==0</code> and <code>female==1</code>. Why waste precious bits on something which could only ever be binary? Why create an option to change a data field which is immutable? Why design a schema which would allow a woman to be married to another woman?</p>

<p>And yet, even with those constraints, people were able to change their "official" gender within the database. Oh, sure, there were all sorts of kludges (both technical and political) - but it <em>was</em> possible.</p>

<p>The paper sparked four main thoughts for me.</p>

<h2 id="theres-no-such-thing-as-immutable"><a href="https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/#theres-no-such-thing-as-immutable">There's No Such Thing As Immutable</a></h2>

<p>For all the talk of Blockchain solving the world's issues (🤣) sometimes it is necessary to "rewrite history". People make mistakes. Assumptions change. Knowledge improves. Lots of facts, it turns out, are matters of perspective.</p>

<p>A really good example of this is time.  I don't mean pesky things like timezones and leap seconds. I mean that, due to general relativity, <a href="https://www.bbc.co.uk/newsround/64801599">one second on the moon is not equal to one second on Earth</a>.  How does your time-ordered database cope with that?</p>

<p>You might very well live in a culture where divorce is impossible, or where <a href="https://www.gov.uk/government/publications/criminal-law-rape-within-marriage">sexual consent cannot ever be revoked</a>, or where a person can only be married to one other person at a time. But these are all societal conventions which are liable - and indeed likely - to change.</p>

<p>I'm <em>almost</em> tempted to say that the <code>boolean</code> type shouldn't exist in modern databases!</p>

<h2 id="diverse-teams-build-better-products"><a href="https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/#diverse-teams-build-better-products">Diverse Teams Build Better Products</a></h2>

<p>I don't know how many computer programmers in the 1960s were part of the LGBTQ+ community. And I don't know how accepting their colleagues would have been of them.</p>

<p>Perhaps you have read and memorised every single one of the <a href="https://github.com/kdeldycke/awesome-falsehood">Falsehoods Programmers Believe About...</a> lessons. But surely it is more efficient to build a team who are empowered enough to confidently correct their colleagues' incorrect assumptions about how the world is arranged?</p>

<p>We bake rigid assumptions into our designs not out of malign intent (usually) but because we're ignorant.  That's only shameful if we refuse to listen to other people's experiences.</p>

<h2 id="computers-serve-humans-not-the-other-way-around"><a href="https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/#computers-serve-humans-not-the-other-way-around">Computers Serve Humans - not the other way around</a></h2>

<p>Most of us have been forced to lie to a computer at one time or another. Perhaps it is a system which insists that you <em>must</em> have a US-style ZIP code. Or that your name <em>must</em> be longer than three characters. Or that you don't have an apostrophe in your email address. Or that your wife is Mrs, not Ms.</p>

<p>I know for sure that you've filled in a paper form where the boxes were too small and you've had to decide how to truncate your data.</p>

<p>Why? Because people have designed a schema which doesn't account for the variety in the world.</p>

<h2 id="todays-constraints-arent-tomorrows"><a href="https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/#todays-constraints-arent-tomorrows">Today's constraints aren't tomorrow's</a></h2>

<p>As I said at the start, it's understandable that designers designed around the constraints they faced. But these days, we have an awareness of the likely progress of technology.</p>

<p>It's said that the Apollo Moon landings were only possible because the designers <a href="https://web.archive.org/web/20230326154258/https://archive.canadianbusiness.com/blogs-and-comment/stop-using-gretzky-where-the-puck-is-quote/">skated to where the puck was <em>going</em> to be</a>. They made reasonable assumptions about what technology was going to be developed in the future.</p>

<p>Yes, we should try and build things which perform well on existing and historic hardware. But we can't ignore the fact that tomorrow's computers will be smaller, faster, cheaper, and more efficient.</p>

<p>Does it make sense to store a human's name as:</p>

<pre><code class="language-sql">CREATE TABLE people (
  name VARCHAR(32) CHARACTER SET latin1
);
</code></pre>

<p>Probably not. Disk space is cheap and getting cheaper. Perhaps people of the future will have names consisting of 500 emoji? Or perhaps people with "exotic" Unicode characters will want to use our services.</p>
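
<p>Here's a rough way to see which names that column would mangle. It's a sketch under a few assumptions: <code>fitsLegacyColumn()</code> is a name I've made up, it needs PHP's <code>mbstring</code> extension, and it uses ISO-8859-1 as a stand-in for MySQL's <code>latin1</code>, which is close but not identical.</p>

<pre><code class="language-php">&lt;?php
// Hypothetical helper: would this name survive a VARCHAR(32) latin1 column unchanged?
// ISO-8859-1 stands in for MySQL's latin1 here, which is an approximation.
function fitsLegacyColumn(string $name): bool
{
    // Characters with no ISO-8859-1 equivalent are substituted during conversion,
    // so converting there and back only returns the original if nothing was lost.
    $legacy = mb_convert_encoding($name, 'ISO-8859-1', 'UTF-8');
    $roundTrip = mb_convert_encoding($legacy, 'UTF-8', 'ISO-8859-1');

    return $roundTrip === $name &amp;&amp; mb_strlen($name, 'UTF-8') &lt;= 32;
}

var_dump(fitsLegacyColumn('Ada Lovelace'));
var_dump(fitsLegacyColumn('李小龍'));
var_dump(fitsLegacyColumn(str_repeat('🙃', 500)));
</code></pre>

<p>The first name should pass. The other two should fail, which is rather the point: the schema, not the person, is what's wrong.</p>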

<p>Oh, I'm sure there will be a performance hit if every column is essentially unlimited. But that's an argument to design better database engines - not to limit human expression.</p>

<h2 id="read-more"><a href="https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/#read-more">Read More</a></h2>

<p>You can <a href="https://doi.org/10.1109/MAHC.2019.2897667">read "Hacking the Cis-tem" in the IEEE</a> or, if that's not available to you, <a href="https://marhicks.com/writing/hicks-hackingthecistempreprint.pdf">read the pre-print</a>.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=45240&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2023/04/some-thoughts-on-hacking-the-cis-tem/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[How do you store numbers with leading zeros?]]></title>
		<link>https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/</link>
					<comments>https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Tue, 21 Sep 2021 11:34:30 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[database]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=40355</guid>

					<description><![CDATA[I am very interested in your opinion on this.  Imagine that you work at a company which sells widgets. Each widget has a unique serial number. The number is a fixed length, and can contain leading zeros.  That is, the following are all valid identifiers:   00001 01010 12345   What data type would you use to store these data in a database?  This is one of those strong opinions, weakly held.  I&#039;m…]]></description>
										<content:encoded><![CDATA[<p>I am very interested in your opinion on this.</p>

<p>Imagine that you work at a company which sells widgets. Each widget has a unique serial number. The number is a fixed length, and can contain leading zeros.  That is, the following are all valid identifiers:</p>

<ul>
<li><code>00001</code></li>
<li><code>01010</code></li>
<li><code>12345</code></li>
</ul>

<p>What data type would you use to store these data in a database?</p>

<p>This is one of those <a href="https://webarchive.nationalarchives.gov.uk/ukgwa/20220201211123/https://www.nationalarchives.gov.uk/education/greatwar/usefulnotes/g1cs1s4u.htm">strong opinions, weakly held</a>.  I'm not sure there's a right answer to it.  A quick survey on Twitter<sup id="fnref:1"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#fn:1" class="footnote-ref" title="Which has obvious limitations!" role="doc-noteref">0</a></sup> was inconclusive:</p>

<blockquote class="social-embed" id="social-embed-1431172311974092801" lang="en" itemscope="" itemtype="https://schema.org/SocialMediaPosting"><header class="social-embed-header" itemprop="author" itemscope="" itemtype="https://schema.org/Person"><a href="https://twitter.com/edent" class="social-embed-user" itemprop="url"><img class="social-embed-avatar social-embed-avatar-circle" src="data:image/webp;base64,UklGRkgBAABXRUJQVlA4IDwBAACQCACdASowADAAPrVQn0ynJCKiJyto4BaJaQAIIsx4Au9dhDqVA1i1RoRTO7nbdyy03nM5FhvV62goUj37tuxqpfpPeTBZvrJ78w0qAAD+/hVyFHvYXIrMCjny0z7wqsB9/QE08xls/AQdXJFX0adG9lISsm6kV96J5FINBFXzHwfzMCr4N6r3z5/Aa/wfEoVGX3H976she3jyS8RqJv7Jw7bOxoTSPlu4gNbfXYZ9TnbdQ0MNnMObyaRQLIu556jIj03zfJrVgqRM8GPwRoWb1M9AfzFe6Mtg13uEIqrTHmiuBpH+bTVB5EEQ3uby0C//XOAPJOFv4QV8RZDPQd517Khyba8Jlr97j2kIBJD9K3mbOHSHiQDasj6Y3forATbIg4QZHxWnCeqqMkVYfUAivuL0L/68mMnagAAA" alt="" itemprop="image"><div class="social-embed-user-names"><p class="social-embed-user-names-name" itemprop="name">Terence Eden is on Mastodon</p>@edent</div></a><img class="social-embed-logo" alt="Twitter" src="data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%0Aaria-label%3D%22Twitter%22%20role%3D%22img%22%0AviewBox%3D%220%200%20512%20512%22%3E%3Cpath%0Ad%3D%22m0%200H512V512H0%22%0Afill%3D%22%23fff%22%2F%3E%3Cpath%20fill%3D%22%231d9bf0%22%20d%3D%22m458%20140q-23%2010-45%2012%2025-15%2034-43-24%2014-50%2019a79%2079%200%2000-135%2072q-101-7-163-83a80%2080%200%200024%20106q-17%200-36-10s-3%2062%2064%2079q-19%205-36%201s15%2053%2074%2055q-50%2040-117%2033a224%20224%200%2000346-200q23-16%2040-41%22%2F%3E%3C%2Fsvg%3E"></header><section class="social-embed-text" itemprop="articleBody">You want to record a fixed-length number in your database (i.e. always 5 digits long).<br>The number can have leading zeros (e.g. 
00123).<br>What data type do you use?<hr class="social-embed-hr"><label for="poll_1_count">Numeric with ZEROFILL: (61)</label><br><meter class="social-embed-meter" id="poll_1_count" min="0" max="100" low="33" high="66" value="36.5">61</meter><br><label for="poll_2_count">Varchar or Text: (94)</label><br><meter class="social-embed-meter" id="poll_2_count" min="0" max="100" low="33" high="66" value="56.3">94</meter><br><label for="poll_3_count">Something else (what?): (12)</label><br><meter class="social-embed-meter" id="poll_3_count" min="0" max="100" low="33" high="66" value="7.2">12</meter><br></section><hr class="social-embed-hr"><footer class="social-embed-footer"><a href="https://twitter.com/edent/status/1431172311974092801"><span aria-label="1 likes" class="social-embed-meta">❤️ 1</span><span aria-label="11 replies" class="social-embed-meta">💬 11</span><span aria-label="0 reposts" class="social-embed-meta">🔁 0</span><time datetime="2021-08-27T08:30:43.000Z" itemprop="datePublished">08:30 - Fri 27 August 2021</time></a></footer></blockquote>

<p>Here, I attempt to provide some of the opinions on both sides of the argument. Feel free to supply your own arguments.</p>

<h2 id="it-isnt-a-number"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#it-isnt-a-number">It isn't a number</a></h2>

<p>Can you do maths on it? If not, it isn't a number. It's a text string that happens to only contain numbers.</p>

<p>For example, asking "what is the average of these three serial numbers?" is a meaningless question. "What is this serial number divided by two?" again - that has an answer with no semantic content.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>What if the serial has some semantic content in it? For example, even numbers were made in one factory, odds in another. Or prime numbers are demonstration units.
It <em>might</em> be easier to grab information from a number type. But it's probably better to pull that data into separate fields.</p>

<h2 id="it-isnt-guaranteed-to-always-be-a-number"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#it-isnt-guaranteed-to-always-be-a-number">It isn't <em>guaranteed</em> to always be a number</a></h2>

<p>If this field was for the item's price, it is <em>always</em> going to be a number. Sure, it might be a <code>float</code> or an <code>int</code> - but it'll be a number. Same if this was the item's height, width, or weight. Those values can <em>only</em> be numbers.</p>

<p>But there's nothing inherently "numbery" about a serial number. At any point, the boss could recommend adding Greek letters to it.</p>

<p>Given we can't do maths on it, and the "number" doesn't have any semantic content, there's no need to artificially restrict our database to a "number" type.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>The same is broadly true of <em>any</em> constraint. The maximum length could change. The validation rules might be updated.  Modern databases need to be able to cope with changes in business requirements.</p>

<h2 id="technical-constraints"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#technical-constraints">Technical Constraints</a></h2>

<p>I don't know of any number type which stores leading zeros. Please enlighten me if I'm wrong.  That means any data retrieved from the database has to be formatted before displaying it to the user.</p>

<p>That also means that searching the database requires data to be pre-formatted.</p>
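
<p>To make that concrete, here's a minimal sketch of the round trip if the serial lived in an integer column. The function names are hypothetical, and the width of five comes from the example serials above.</p>

<pre><code class="language-php">&lt;?php
// Hypothetical helpers for a serial stored in an integer column.
function serialToDisplay(int $stored): string
{
    // Left-pad with zeros to the fixed width of five digits.
    return str_pad((string) $stored, 5, '0', STR_PAD_LEFT);
}

function serialToStored(string $typed): int
{
    // Casting drops the leading zeros, so "00123" is looked up as 123.
    return (int) $typed;
}

echo serialToDisplay(123), "\n";    // 00123
echo serialToStored('00123'), "\n"; // 123
</code></pre>

<p>Every bit of code which reads or searches the column has to remember to do this, and to agree on the width.</p>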

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>All data should go through validation and sanitisation before being displayed to the end user. Numbers aren't special in that regard.
Similarly, data should be carefully checked before being searched for.</p>

<h2 id="numbers-get-misformatted"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#numbers-get-misformatted">Numbers get misformatted</a></h2>

<p>We all know how Microsoft Excel will look at any number and try to interpret it as a date. It also has a tendency to strip leading zeros. And some formatters will automatically add thousands separators to any number they see.  Keeping the data as text reduces the risk of this happening.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>Excel will mangle anything that looks even vaguely like a number. Storing as text does not guarantee that it will be interpreted as text.</p>

<p>OK - I think that's the majority of the argument for not treating this as a number. What are the arguments on the other side?</p>

<h2 id="incorrect-data"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#incorrect-data">Incorrect Data</a></h2>

<p>If we allow text in this field, what happens if someone types a letter <code>O</code> rather than a number <code>0</code>? Having this be an <code>int</code> prevents these errors creeping in. Yes, have checks at the front end, but this provides defence in depth.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>As above. Data should be checked by the application before submitting to the database. The database should be checking the data before storing it.</p>

<h2 id="searching-and-sorting"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#searching-and-sorting">Searching and Sorting</a></h2>

<p>Suppose we want to get all widgets with a serial number <code>&gt;=</code> a specific ID. A number type is much better than string for those sorts of operations.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>This again assumes that the IDs have some semantic content. For example, <code>00012</code> was manufactured after <code>00008</code>. This may or may not be the case. Such operations are best performed on a field like "Manufactured Date".</p>

<h2 id="its-more-efficient"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#its-more-efficient">It's more efficient</a></h2>

<p>It is quicker and computationally cheaper to store integers rather than text. Searching is faster, disk space requirements are lower.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>It isn't the 1970s. We're not paying per bit. Unless we're storing billions of rows, or working on constrained hardware, this isn't a practical concern for most users.</p>

<h2 id="human-usability"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#human-usability">Human Usability</a></h2>

<p>It <em>looks</em> like a number. A human, on encountering one of these IDs, is going to assume it is a number. Anything which makes it harder for humans to understand is going to cause problems.</p>

<h3 id="counterpoint"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#counterpoint">Counterpoint</a></h3>

<p>Things like phone numbers <em>look</em> like integers, but they aren't. Momentary human confusion is preferable to mangled or imprecise data.</p>

<h2 id="so-now-what"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#so-now-what">So now what?</a></h2>

<p>I lean to the side that says that this is a string with specific constraints. Namely <code>/^[0-9]{5}$/</code> (please supply your own <a href="https://www.google.com/search?q=regex+meme">regex meme</a>).</p>

<p>When a user enters a new serial number, it should be checked that it meets the constraints. If it doesn't, refuse to submit it until the user has formatted it correctly. When submitted, the database should check against the constraints, and refuse to accept non-matching strings.</p>
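
<p>On the application side, that check could look something like this. It's a sketch, and <code>isValidSerial()</code> is a name I've invented. I've swapped <code>^</code> and <code>$</code> for <code>\A</code> and <code>\z</code>, because in PHP's regex flavour a plain <code>$</code> will also match just before a trailing newline.</p>

<pre><code class="language-php">&lt;?php
// Hypothetical validator for the five-digit serial described above.
function isValidSerial(string $candidate): bool
{
    // \A and \z anchor to the very start and end of the string.
    // [0-9] is used rather than \d to stay with ASCII digits only.
    return preg_match('/\A[0-9]{5}\z/', $candidate) === 1;
}

var_dump(isValidSerial('00001')); // bool(true)
var_dump(isValidSerial('0O001')); // bool(false): letter O, not zero
var_dump(isValidSerial('1234'));  // bool(false): too short
</code></pre>

<p>The database would still need its own version of the same rule. This only covers the first line of defence.</p>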

<p>The speed of searching and sorting is not meaningfully degraded by storing as a string.</p>
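
<p>As for ordering: when every serial is the same length and zero-padded, comparing them as strings gives the same order as comparing them as numbers. A quick sketch using the example serials from earlier (this shows ordering only, it says nothing about speed):</p>

<pre><code class="language-php">&lt;?php
$serials = ['12345', '00012', '01010', '00008'];

// SORT_STRING forces a string comparison; by default PHP would compare
// numeric-looking strings as numbers, which hides the thing being tested.
sort($serials, SORT_STRING);

print_r($serials); // 00008, 00012, 01010, 12345

// strcmp() is negative when the first argument sorts before the second.
var_dump(strcmp('00008', '00012') &lt; 0); // bool(true)
</code></pre>

<p>That only holds while the length stays fixed. Mix <code>0012</code> in with <code>00008</code> and string order stops matching numeric order.</p>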

<p>But, I'm very aware that I could be wrong. This is my strong opinion, but if you can supply a better argument, I will drop my weak grip on it.</p>

<h2 id="chaotic-good"><a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#chaotic-good">Chaotic Good</a></h2>

<blockquote class="social-embed" id="social-embed-1431256352060497921" lang="en" itemscope="" itemtype="https://schema.org/SocialMediaPosting"><header class="social-embed-header" itemprop="author" itemscope="" itemtype="https://schema.org/Person"><a href="https://twitter.com/danielknell" class="social-embed-user" itemprop="url"><img class="social-embed-avatar social-embed-avatar-circle" src="data:image/webp;base64,UklGRiYCAABXRUJQVlA4IBoCAAAwCwCdASowADAAPqlKnEmmJKMhMdZt+MAVCUAXZgHNEEabaMCPjIC18vi48E4h2FDg1PkwotmcBB4SaVgLuv3XeyFGHFqF/cYZNdvVuue7C5cHvYTwg10oKtt93uey1oX2AAD+++ITMhkUzdnQU9DDGI976ipdfy3cMav3wwqYsFWF3k8nBZQFvC3Z2kbaXBtNcb/WCqjVWMaGYn9/EONzqsgDDdpOx3gc/HzKxyxwHgEPKyYLorevf5llqizlJ5jvus94/WMGMoFRGEnROmpaXd/TNTNzmyyU192tFh9HKLxDYtjAy1DtSeMm4GGdlVFNm7gXOQPA4YHso38Kw8GoLCQJ4KyGUUwwSz6hfw96N/gi3LLqBUE5PpFDBlPBUR0FMgDjRUQuK8lH8DLUtV6OytNcxoP+hLHH6CPXbBgQ9Xr9aUYVmhmw7/y19kMkcPA2PXsnrOSDZSO/faSUymO1Z3scC4HBpV628e7U+FXr6B8Eu0Wy4ESdtDPDFt1LG2Af/D+ENykGR6NrvlUAXgp3Sllt75yiW2Xr8vKnhbwhaX1bJySeJ3QkyZuBuHz83zdL1JN8tz5637veyfknu0P3FMDMBKVI0aOO8WwBv7DZBQIx6Zq8U2U7Ch+SYus31W6B1VtVdKrvj3YjYGA8MBmqxAQne4RYRYM2LVJ2N0AUmJCd5r6lVr24RBp2IDynaMkCNc5AP00ZaAAA" alt="" itemprop="image"><div class="social-embed-user-names"><p class="social-embed-user-names-name" itemprop="name">Daniel Knell</p>@danielknell</div></a><img class="social-embed-logo" alt="Twitter" src="data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%0Aaria-label%3D%22Twitter%22%20role%3D%22img%22%0AviewBox%3D%220%200%20512%20512%22%3E%3Cpath%0Ad%3D%22m0%200H512V512H0%22%0Afill%3D%22%23fff%22%2F%3E%3Cpath%20fill%3D%22%231d9bf0%22%20d%3D%22m458%20140q-23%2010-45%2012%2025-15%2034-43-24%2014-50%2019a79%2079%200%2000-135%2072q-101-7-163-83a80%2080%200%200024%20106q-17%200-36-10s-3%2062%2064%2079q-19%205-36%201s15%2053%2074%2055q-50%2040-117%2033a224%20224%200%2000346-200q23-16%2040-41%22%2F%3E%3C%2Fsvg%3E"></header><section class="social-embed-text" itemprop="articleBody"><small 
class="social-embed-reply"><a href="https://twitter.com/edent/status/1431172311974092801">Replying to @edent</a></small><a href="https://twitter.com/edent">@edent</a> 5x bigint fields, one for each digit.</section><hr class="social-embed-hr"><footer class="social-embed-footer"><a href="https://twitter.com/danielknell/status/1431256352060497921"><span aria-label="4 likes" class="social-embed-meta">❤️ 4</span><span aria-label="1 replies" class="social-embed-meta">💬 1</span><span aria-label="0 reposts" class="social-embed-meta">🔁 0</span><time datetime="2021-08-27T14:04:40.000Z" itemprop="datePublished">14:04 - Fri 27 August 2021</time></a></footer></blockquote>

<div id="footnotes" role="doc-endnotes">
<hr aria-label="Footnotes">
<ol start="0">

<li id="fn:1">
<p>Which has obvious limitations!&nbsp;<a href="https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/#fnref:1" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

</ol>
</div>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=40355&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2021/09/how-do-you-store-numbers-with-leading-zeros/feed/</wfw:commentRss>
			<slash:comments>12</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Civic Hygiene]]></title>
		<link>https://shkspr.mobi/blog/2013/11/civic-hygiene/</link>
					<comments>https://shkspr.mobi/blog/2013/11/civic-hygiene/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Mon, 25 Nov 2013 12:20:29 +0000</pubDate>
				<category><![CDATA[politics]]></category>
		<category><![CDATA[data protection]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[dpa]]></category>
		<category><![CDATA[NaBloPoMo]]></category>
		<category><![CDATA[porn]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=8530</guid>

					<description><![CDATA[Imagine, just for a moment, that the Government wanted to keep a record of everyone&#039;s sexuality.  They need to know this detailed demographic data because it will be highly useful in civic planning.  It will help them work out what provision needs to be made for sexual health services, how many children are likely to be born, how many schools to build, etc.  You trust the Government, you voted…]]></description>
										<content:encoded><![CDATA[<p>Imagine, just for a moment, that the Government wanted to keep a record of everyone's sexuality.  They need to know this detailed demographic data because it will be highly useful in civic planning.  It will help them work out what provision needs to be made for sexual health services, how many children are likely to be born, how many schools to build, etc.</p>

<p>You trust the Government, you voted for them, you and your friends have nothing to hide with regards to your sexuality.</p>

<p>But! Shock horror! After creating the database, the Government loses the election and the <a href="http://www.mirror.co.uk/news/uk-news/ugly-face-ukip-sunday-mirror-1531879">homophobes at UKIP</a> get into power!</p>

<p>Now they have a database of every gay in the village, and can harass them, try to "cure" them, or make their lives a living hell.</p>

<p>Far-fetched? Not really.  With Cameron's inane web filtering plan, the "black boxes" in ISPs which can record every click you make, and the selling of your NHS details to private parties, we're in a situation where a malicious government could cause serious damage to us.</p>

<p>The security expert <a href="https://www.schneier.com/essays/archives/2010/01/us_enables_chinese_h.html">Bruce Schneier wrote a wonderful article for CNN</a> on how the existing surveillance state is leading to disastrous breaches of our private information.  He concludes by saying:</p>

<blockquote><p>It's bad civic hygiene to build technologies that could someday be used to facilitate a police state.
</p><p>-- <a href="https://www.schneier.com/essays/archives/2010/01/us_enables_chinese_h.html">Bruce Schneier on CNN</a></p></blockquote>

<p>We have to be careful that the apparatus we build cannot easily be misused for evil purposes.  Sure, even an innocuous toaster can be weaponised if someone is willing enough, but we should not fall into the trap of making systems which can easily be turned against the people.</p>

<p>It's probably sensible to build a database of which car belongs to which owner - it has an important civil use and would be hard to abuse (<a href="http://www.bbc.co.uk/news/uk-wales-17166286">although not impossible</a>).</p>

<p>Should we have a national database of, say, religious beliefs?  Almost instinctively the answer is no.  The memories of fascist dictators haunt our collective consciousness.  We have seen countless times how race and religious identity become death penalties.  We wouldn't countenance it.</p>

<p>Civic hygiene isn't about saying we distrust our current government - <strong>it's about not trusting the <em>next</em> government</strong>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2013/11/civic-hygiene/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Open Source Shakespeare (in MySQL)]]></title>
		<link>https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/</link>
					<comments>https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 20 Apr 2012 14:15:47 +0000</pubDate>
				<category><![CDATA[Shakespeare]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[shakespeare]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=5583</guid>

					<description><![CDATA[My good friend Richard Brent has often complained that my blog has very little Shakespeare content. Despite the domain name, I don&#039;t think I&#039;ve ever blogged about The Big S.  For shame!  Fear not, my Brentish-Boy, this post is all about Shakespeare. And MySQL....  Ahem...  When I first started shkspr.mobi it was intended to be an easy way to get Shakespeare on your phone.  At that time, there…]]></description>
										<content:encoded><![CDATA[<p>My good friend <a href="http://www.linkedin.com/in/rbrent">Richard Brent</a> has often complained that my blog has very little Shakespeare content. Despite the domain name, I don't think I've <em>ever</em> blogged about The Big S.  For shame!  Fear not, <a href="http://www.jabberwocky.com/carroll/jabber/jabberwocky.html">my Brentish-Boy</a>, this post is all about Shakespeare. And MySQL....</p>

<p>Ahem...</p>

<p>When I first started <a href="https://shkspr.mobi/">shkspr.mobi</a> it was intended to be an easy way to get Shakespeare on your phone.  At that time, there were no mobile-formatted texts of his plays and sonnets, so I had to create them.  Finding Shakespeare's works in a suitable format for conversion wasn't too hard - but it meant lots of crufty code to read text files line-by-line. Yuck.</p>

<p>A few years later, I stumbled across <a href="http://www.opensourceshakespeare.org/">Open Source Shakespeare</a>.  The project grew out of <a href="http://www.opensourceshakespeare.org/info/paper_toc.php">Eric Johnson's MA thesis</a>.  It's a remarkably good idea with only one <em>minor</em> problem.  The database it uses is Microsoft Access.</p>

<p>MS Access, as a database, could best be described as</p>

<blockquote><p>deformed, crooked, old and sere, ill faced, worse bodied, shapeless everywhere, vicious, ungentle, foolish, blunt, unkind, stigmatical in making, worse in mind</p><p>(Comedy of Errors, Act IV, Scene II)</p></blockquote>

<p>There are a few Open Source Shakespeare projects on GitHub, but they don't seem very practical.</p>

<p>So, naturally, I've decided to create my own version of Shakespeare's works - in MySQL :-)</p>

<p>This is what it looks like:
<a href="https://github.com/edent/Open-Source-Shakespeare"><img src="https://shkspr.mobi/blog/wp-content/uploads/2012/04/Shkspr-MySQL.png" alt="Shkspr MySQL" title="Shkspr MySQL" width="600" height="240" class="aligncenter size-full wp-image-5592"></a>
You can <a href="https://github.com/edent/Open-Source-Shakespeare">download it from GitHub</a>.</p>

<p>I've stripped out a lot of the extraneous stuff from the original version - word counts, etc.  So it should be a fairly lean database which is easy to use.  I'm not a database professional, so I would be grateful if you could suggest any improvements, either using this blog's comment form or on <a href="https://github.com/edent/Open-Source-Shakespeare">GitHub</a>.</p>

<p>There are four tables:</p>

<h2 id="paragraphs"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#paragraphs">Paragraphs</a></h2>

<p>This is where the main body of text is.  A typical row will look like this:</p>

<ul>
    <li>WorkID: hamlet</li>
    <li>ParagraphID: 639015</li>
    <li>ParagraphNum: 3427</li>
    <li>CharID: hamlet</li>
    <li>PlainText: Has this fellow no feeling of his business, that he sings at grave-making?</li>
    <li>Act: 5</li>
    <li>Scene: 1</li>
</ul>
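
<p>As a rough sketch of how a row like this could be queried, here is a small Python example. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs without a server. The table name <code>Paragraphs</code>, the column types, and the helper <code>lines_in_scene</code> are assumptions for illustration; only the column names and the sample values come from the row above.</p>

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
# Column types are guesses; only the column names come from the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Paragraphs (
        WorkID       TEXT,
        ParagraphID  INTEGER PRIMARY KEY,
        ParagraphNum INTEGER,
        CharID       TEXT,
        PlainText    TEXT,
        Act          INTEGER,
        Scene        INTEGER
    )
    """
)

# The sample row described above.
conn.execute(
    "INSERT INTO Paragraphs VALUES (?, ?, ?, ?, ?, ?, ?)",
    (
        "hamlet",
        639015,
        3427,
        "hamlet",
        "Has this fellow no feeling of his business, "
        "that he sings at grave-making?",
        5,
        1,
    ),
)


def lines_in_scene(work_id, act, scene):
    """Hypothetical helper: (CharID, PlainText) pairs for one scene, in order."""
    cursor = conn.execute(
        "SELECT CharID, PlainText FROM Paragraphs "
        "WHERE WorkID = ? AND Act = ? AND Scene = ? "
        "ORDER BY ParagraphNum",
        (work_id, act, scene),
    )
    return cursor.fetchall()


scene_lines = lines_in_scene("hamlet", 5, 1)
for char_id, text in scene_lines:
    print(f"{char_id}: {text}")
```

<p>Sorting by ParagraphNum is a guess that the column records reading order within a work; check it against the real data before relying on it.</p>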

<h2 id="works"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#works">Works</a></h2>

<p>This is what translates the "WorkID" into something human-readable - plus some extra metadata:</p>

<ul>
    <li>WorkID: hamlet</li>
    <li>Title: Hamlet</li>
    <li>LongTitle: Tragedy of Hamlet, Prince of Denmark, The</li>
    <li>Date: 1600</li>
    <li>GenreType: Tragedy</li>
</ul>

<h2 id="character"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#character">Character</a></h2>

<p>This is what translates the CharID into a human-readable name and description:</p>

<ul>
    <li>CharID: hamlet</li>
    <li>CharName: Hamlet</li>
    <li>Abbrev: Ham</li>
    <li>Works: Tragedy of Hamlet, Prince of Denmark, The</li>
    <li>Description: son of the former king and nephew to the present king</li>
</ul>
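
<p>To illustrate how those two lookups could work together, here is a hedged Python sketch that joins a paragraph to its work and its speaker. It uses in-memory SQLite rather than MySQL so it runs without a server. The table names <code>Works</code>, <code>Characters</code> and <code>Paragraphs</code>, the column types, and the helper <code>describe_paragraph</code> are all assumptions, not the schema from the repository.</p>

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; table names and column types
# are illustrative. Column names and sample values come from the post.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Works (
        WorkID TEXT PRIMARY KEY, Title TEXT, LongTitle TEXT,
        Date INTEGER, GenreType TEXT
    );
    CREATE TABLE Characters (
        CharID TEXT PRIMARY KEY, CharName TEXT, Abbrev TEXT,
        Works TEXT, Description TEXT
    );
    CREATE TABLE Paragraphs (
        WorkID TEXT, ParagraphID INTEGER PRIMARY KEY, ParagraphNum INTEGER,
        CharID TEXT, PlainText TEXT, Act INTEGER, Scene INTEGER
    );
    """
)

long_title = "Tragedy of Hamlet, Prince of Denmark, The"
conn.execute(
    "INSERT INTO Works VALUES (?, ?, ?, ?, ?)",
    ("hamlet", "Hamlet", long_title, 1600, "Tragedy"),
)
conn.execute(
    "INSERT INTO Characters VALUES (?, ?, ?, ?, ?)",
    ("hamlet", "Hamlet", "Ham", long_title,
     "son of the former king and nephew to the present king"),
)
conn.execute(
    "INSERT INTO Paragraphs VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("hamlet", 639015, 3427, "hamlet",
     "Has this fellow no feeling of his business, "
     "that he sings at grave-making?", 5, 1),
)


def describe_paragraph(paragraph_id):
    """Hypothetical helper: (Title, CharName, PlainText) or None if not found."""
    return conn.execute(
        "SELECT w.Title, c.CharName, p.PlainText "
        "FROM Paragraphs AS p "
        "JOIN Works AS w ON w.WorkID = p.WorkID "
        "JOIN Characters AS c ON c.CharID = p.CharID "
        "WHERE p.ParagraphID = ?",
        (paragraph_id,),
    ).fetchone()


result = describe_paragraph(639015)
if result is not None:
    title, char_name, text = result
    print(f"{char_name} ({title}): {text}")
```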

<h2 id="chapters"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#chapters">Chapters</a></h2>

<p>This gives the setting for each Act and Scene.</p>

<ul>
    <li>WorkID: hamlet</li>
    <li>ChapterID: 18893</li>
    <li>Act: 5</li>
    <li>Scene: 1</li>
    <li>Description: Elsinore. A churchyard.</li>
</ul>
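
<p>As a sketch of how the setting might be looked up for a given line, the Python below joins a paragraph to its chapter. It assumes that WorkID, Act and Scene together identify one row in Chapters, which the post doesn't state explicitly. In-memory SQLite stands in for MySQL, and the table names, column types, and the helper <code>setting_for_paragraph</code> are hypothetical.</p>

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; table names and column types
# are illustrative, and (WorkID, Act, Scene) is assumed to pick one chapter.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Chapters (
        WorkID TEXT, ChapterID INTEGER PRIMARY KEY,
        Act INTEGER, Scene INTEGER, Description TEXT
    );
    CREATE TABLE Paragraphs (
        WorkID TEXT, ParagraphID INTEGER PRIMARY KEY, ParagraphNum INTEGER,
        CharID TEXT, PlainText TEXT, Act INTEGER, Scene INTEGER
    );
    """
)

# Sample rows taken from the examples in the post.
conn.execute(
    "INSERT INTO Chapters VALUES (?, ?, ?, ?, ?)",
    ("hamlet", 18893, 5, 1, "Elsinore. A churchyard."),
)
conn.execute(
    "INSERT INTO Paragraphs VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("hamlet", 639015, 3427, "hamlet",
     "Has this fellow no feeling of his business, "
     "that he sings at grave-making?", 5, 1),
)


def setting_for_paragraph(paragraph_id):
    """Hypothetical helper: the scene's Description, or None if no match."""
    row = conn.execute(
        "SELECT ch.Description "
        "FROM Paragraphs AS p "
        "JOIN Chapters AS ch "
        "  ON ch.WorkID = p.WorkID AND ch.Act = p.Act AND ch.Scene = p.Scene "
        "WHERE p.ParagraphID = ?",
        (paragraph_id,),
    ).fetchone()
    return row[0] if row else None


setting = setting_for_paragraph(639015)
print(setting)
```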

<h2 id="whats-next"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#whats-next">What's Next?</a></h2>

<p>The next steps for the project are fairly obvious:</p>

<ol>
    <li>Write some high level example code to show people how to use the database.</li>
    <li>Make <a href="https://shkspr.mobi/">shkspr.mobi</a> a showcase site which runs off the database.</li>
    <li>Fix any bugs and inconsistencies that people find.</li>
</ol>

<p>You can <a href="https://github.com/edent/Open-Source-Shakespeare">download the Shakespeare MySQL Database from GitHub</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/feed/</wfw:commentRss>
			<slash:comments>11</slash:comments>
		
		
			</item>
	</channel>
</rss>
