<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>mysql &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/mysql/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Tue, 29 Apr 2025 21:07:57 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>mysql &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[Doctrine - difference between bindValue() and setParameter() on prepared statements]]></title>
		<link>https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/</link>
					<comments>https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 11 May 2023 11:34:34 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[Symfony]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=45734</guid>

					<description><![CDATA[This pissed me off and I couldn&#039;t figure out what I was doing wrong. So I&#039;m blogging about my ignorance.  Imagine you&#039;re using Symfony and Doctrine to access a database. You are using prepared statements to prevent any SQL injection problems.  There are two main ways of doing this - and they disagree about how positional variables should be specified.  Data Retrieval And Manipulation  Here&#039;s a…]]></description>
										<content:encoded><![CDATA[<p>This pissed me off and I couldn't figure out what I was doing wrong. So I'm blogging about my ignorance.</p>

<p>Imagine you're using Symfony and Doctrine to access a database. You are using prepared statements to prevent any SQL injection problems.</p>

<p>There are two main ways of doing this - and they disagree about how positional variables should be specified.</p>

<h2 id="data-retrieval-and-manipulation"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#data-retrieval-and-manipulation">Data Retrieval And Manipulation</a></h2>

<p>Here's a fairly trivial SQL statement with a couple of variables:</p>

<pre><code class="language-php">$sql = "SELECT `userID` FROM `users` WHERE `firstname` LIKE ? AND `surname` LIKE ?";
$stmt = $conn-&gt;prepare($sql);
$stmt-&gt;bindValue(1, $user_input_1st_name);
$stmt-&gt;bindValue(2, $user_input_2nd_name);
$results = $stmt-&gt;executeQuery();
</code></pre>

<p>Pretty easy, right? Write your SQL as normal, but place <code>?</code> where you want user supplied variables to be. This uses <code>bindValue()</code> to set the variables in the query.</p>

<blockquote><p>The approach using question marks is called positional, because the values are bound in order from left to right to any question mark found in the previously prepared SQL query. That is why you specify the position of the variable to bind into the bindValue() method
<a href="https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/data-retrieval-and-manipulation.html#dynamic-parameters-and-prepared-statements">Doctrine: Data Retrieval And Manipulation</a></p></blockquote>
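
<p>As a rough illustration of what "positional" means underneath, here is a sketch using MySQL's own server-side prepared statements rather than Doctrine. The statement name <code>find_user</code>, the user variables <code>@first</code> and <code>@last</code>, and the sample patterns are all made up for this example; the table and columns are the ones from the query above.</p>

<pre><code class="language-sql">-- Two placeholders, filled strictly left to right.
PREPARE find_user FROM
    'SELECT `userID` FROM `users` WHERE `firstname` LIKE ? AND `surname` LIKE ?';

-- Hypothetical user input.
SET @first = 'Ter%';
SET @last  = 'Ed%';

-- The first variable goes to the first ?, the second to the second ?.
EXECUTE find_user USING @first, @last;

DEALLOCATE PREPARE find_user;
</code></pre>

<p>There are no index numbers here at all, only order. The 1-based and 0-based numbering both come from the PHP layers on top.</p>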

<h2 id="query-builder"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#query-builder">Query Builder</a></h2>

<p>Doctrine also offers an <a href="https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/query-builder.html">SQL Query Builder</a> - it looks like this:</p>

<pre><code class="language-php">$queryBuilder = $conn-&gt;createQueryBuilder();
$queryBuilder
    -&gt;select("userID")
    -&gt;from("users")
    -&gt;where("firstname LIKE ? AND surname LIKE ?")
    -&gt;setParameter(0, $user_input_1st_name)
    -&gt;setParameter(1, $user_input_2nd_name);
$results = $queryBuilder-&gt;executeQuery();
</code></pre>

<p>Notice the difference? Yes! <code>setParameter()</code> is <em>zero</em> based!</p>

<blockquote><p>The numerical parameters in the QueryBuilder API start with the needle 0.
<a href="https://www.doctrine-project.org/projects/doctrine-dbal/en/current/reference/query-builder.html#security-safely-preventing-sql-injection">SQL Query Builder</a></p></blockquote>

<h2 id="why-the-difference"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#why-the-difference">Why the difference?</a></h2>

<p>Because the universe hates you, I guess?</p>

<h2 id="solving-the-issue"><a href="https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/#solving-the-issue">Solving the issue</a></h2>

<p>I've sent a pull request to make the documentation clearer.  In the meantime, both methods accept <em>named</em> parameters.</p>

<pre><code class="language-php">$sql = "SELECT `userID` FROM `users` WHERE `firstname` LIKE :first";
$stmt = $conn-&gt;prepare($sql);
$stmt-&gt;bindValue("first", $user_input_1st_name);

...

$queryBuilder
    -&gt;select("userID")
    -&gt;from("users")
    -&gt;where("firstname LIKE :first")
    -&gt;setParameter("first", $user_input_1st_name);
</code></pre>

<p>I hope that helps remove some confusion for future users. Even if it's only me!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2023/05/doctrine-difference-between-bindvalue-and-setparameter-on-prepared-statements/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Using Soundex to find Duplicate Database Entries]]></title>
		<link>https://shkspr.mobi/blog/2020/11/using-soundex-to-find-duplicate-database-entries/</link>
					<comments>https://shkspr.mobi/blog/2020/11/using-soundex-to-find-duplicate-database-entries/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Tue, 03 Nov 2020 12:45:25 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[NaBloPoMo]]></category>
		<category><![CDATA[OpenBenches]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=37066</guid>

					<description><![CDATA[Our community website - OpenBenches - has over seventeen thousand crowd-sourced entries.  The nature of user-generated content is that there are bound to be duplicates. Especially around popular walking routes.  Here&#039;s how I culled around 200 duplicates using the awesome power of SOUNDEX!  Soundex is a clever algorithm for reducing a string of characters into a string which roughly represents its …]]></description>
										<content:encoded><![CDATA[<p>Our community website - <a href="https://openbenches.org">OpenBenches</a> - has over seventeen <em>thousand</em> crowd-sourced entries.  The nature of user-generated content is that there are bound to be duplicates. Especially around popular walking routes.  Here's how I culled around 200 duplicates using the awesome power of SOUNDEX!</p>

<p><a href="https://en.wikipedia.org/wiki/Soundex">Soundex</a> is a clever algorithm for reducing a string of characters into a string which <em>roughly</em> represents its pronunciation. The "roughly" is key here. We could just search the database for <em>identical</em> entries, but users often mistype entries. So comparing sounds is a good way to gloss over those mistakes.</p>

<p>This gives us the Soundex of every inscription in the database:</p>

<pre><code class="language-sql">SELECT soundex(`inscription`) FROM `benches` WHERE 1 
</code></pre>

<p>Which results in:</p>

<pre><code class="language-_">|soundex(`inscription`)                               | 
|=====================================================|
|D531323161545613216323                               |
|J51616565324126143216316531414                       |
|D232624132635253263532154                            |
|H6432534215361312345252342525321625342323153423163...| 
|A2316215635                                          |
</code></pre>

<p>Effectively, it's like a fuzzy hash for text.</p>
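
<p>A quick way to see that in action is to compare two strings directly. This is only a sketch with made-up sample names; MySQL's <code>SOUNDS LIKE</code> operator is shorthand for comparing the two Soundex values.</p>

<pre><code class="language-sql">-- Compare the Soundex codes of two similar-sounding (hypothetical) names.
SELECT
    SOUNDEX('Robert') AS first_code,
    SOUNDEX('Rupert') AS second_code,
    'Robert' SOUNDS LIKE 'Rupert' AS is_match;
</code></pre>

<p>If the two codes are the same, <code>is_match</code> comes back as 1.</p>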

<p>So, here's how to get a list of all the Soundexes which have duplicates:</p>

<pre><code class="language-sql">SELECT `inscription`, SOUNDEX(`inscription`), COUNT(SOUNDEX(`inscription`))
    FROM    `benches`
    WHERE `published` = 1
    GROUP BY SOUNDEX(`inscription`)
    HAVING COUNT(SOUNDEX(`inscription`)) &gt; 1
    LIMIT 0 , 1024
</code></pre>

<p>Which I can then use to produce a list of duplicates:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/10/List-of-duplicates.png" alt="List of duplicates on a website." width="1220" height="782" class="aligncenter size-full wp-image-37068">

<p>And I can then get a list of all benches with a <em>specific</em> Soundex:</p>

<pre><code class="language-sql">SELECT `benchID`, `inscription`, `address`
    FROM    `benches`
    WHERE  SOUNDEX(`inscription`) = "A123"
    AND  `published` = 1
</code></pre>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/10/Duplicates-listed-on-a-website.png" alt="Benches on a  website. One is called &quot;Bertie&quot; the other &quot;Bert&quot;." width="1120" height="571" class="aligncenter size-full wp-image-37069">

<p>In this case, I can see that they're similar but not identical - so I don't need to merge them.</p>

<p>I started off with a thousand possible duplicates and, after going through each of them, merged a couple of hundred dupes. That was a fun weekend!</p>

<h2 id="other-strategies"><a href="https://shkspr.mobi/blog/2020/11/using-soundex-to-find-duplicate-database-entries/#other-strategies">Other Strategies</a></h2>

<p>There were a couple of other things I thought of, and then discarded, to deal with dupes.</p>

<h3 id="ask-the-user-on-upload"><a href="https://shkspr.mobi/blog/2020/11/using-soundex-to-find-duplicate-database-entries/#ask-the-user-on-upload">Ask the user on upload</a></h3>

<p>This is probably the simplest. If a new upload has a similar Soundex to a nearby bench, ask the user if it is a duplicate.
But, actually, that's fraught with complexity. It puts a lot of pressure on a new user to get it right. And, frankly, we're a tiny community which needs all the users it can get. So I decided to put the pressure back on me, the admin.</p>

<p>There are quite a few benches near each other which have identical inscriptions. So asking a user to check <em>which</em> bench they meant could also be complicated.</p>

<p>Which leads on to...</p>

<h3 id="check-proximity"><a href="https://shkspr.mobi/blog/2020/11/using-soundex-to-find-duplicate-database-entries/#check-proximity">Check proximity</a></h3>

<p>There are lots of benches near each other. That's understandable. But <em>calculating</em> how near any two benches are is a bit more complex.  They're stored in the database with separate latitude and longitude values. So, we can use the <a href="https://en.wikipedia.org/wiki/Haversine_formula">Haversine formula</a> to find all benches within, say, 250 metres of position <code>1.23,4.56</code>:</p>

<pre><code class="language-sql">SELECT
(
    6371 * ACOS(COS(RADIANS(1.23)) *
    COS(RADIANS(`latitude`)) *
    COS(RADIANS(`longitude`) -
    RADIANS(4.56)) +
    SIN(RADIANS(1.23)) *
    SIN(RADIANS(`latitude`)))
)
AS distance, benchID, latitude, longitude, inscription, published
FROM benches
WHERE published = true AND present = true
HAVING distance &lt; .25
ORDER BY distance
</code></pre>

<p>That's fine - but when combined with Soundex, it becomes much more complex. Effectively you then have to group similar sounds by geographic closeness. And, frankly, I really can't be bothered!</p>
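
<p>For anyone who <em>can</em> be bothered, one possible starting point is a self-join. This is an untested sketch, not something OpenBenches runs. It reuses the column names and the Haversine expression from the queries above; the aliases <code>a</code> and <code>b</code> and the <code>LEAST()</code> guard are my additions. It will also be slow on a big table, because every bench gets compared with every other bench.</p>

<pre><code class="language-sql">-- Sketch only: pair up published benches whose inscriptions share a Soundex
-- and which lie within 250 metres of each other.
SELECT
    a.benchID     AS bench_a,
    b.benchID     AS bench_b,
    a.inscription AS inscription_a,
    b.inscription AS inscription_b,
    (
        -- LEAST(1, ...) stops rounding errors pushing the value above 1,
        -- which would make ACOS() return NULL for benches at the same spot.
        6371 * ACOS(LEAST(1,
            COS(RADIANS(a.latitude)) * COS(RADIANS(b.latitude)) *
            COS(RADIANS(b.longitude) - RADIANS(a.longitude)) +
            SIN(RADIANS(a.latitude)) * SIN(RADIANS(b.latitude))
        ))
    ) AS distance
FROM benches AS a
JOIN benches AS b
    ON  a.benchID &lt; b.benchID   -- each pair once, and never a bench with itself
    AND SOUNDEX(a.inscription) = SOUNDEX(b.inscription)
WHERE a.published = true AND b.published = true
HAVING distance &lt; .25
ORDER BY distance;
</code></pre>

<p>That still only produces <em>pairs</em>. Turning pairs into groups of three or more is where the real complexity lives.</p>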

<p>If you'd like to help out, the <a href="https://github.com/openbenches/openbenches.org">code to OpenBenches is open source</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2020/11/using-soundex-to-find-duplicate-database-entries/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[HOWTO: Regenerate Gravatars in WordPress]]></title>
		<link>https://shkspr.mobi/blog/2020/01/howto-regenerate-gravatars-in-wordpress/</link>
					<comments>https://shkspr.mobi/blog/2020/01/howto-regenerate-gravatars-in-wordpress/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sat, 25 Jan 2020 18:10:48 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[WordPress]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=33651</guid>

					<description><![CDATA[A troublesome plugin recently corrupted some of the avatars on my blog&#039;s comments. This is a quick HOWTO for regenerating them.  Gravatars are based on the MD5 hash of a user&#039;s email.  For some reason, the plugin had overwritten the avatar field with the text http://identicon  This MySQL query finds all the comment IDs which have that dodgy text:  SELECT     comment_id FROM `wp_commentmeta`    …]]></description>
										<content:encoded><![CDATA[<p>A troublesome plugin recently corrupted some of the avatars on my blog's comments. This is a quick HOWTO for regenerating them.</p>

<p>Gravatars are based on the <a href="https://en.gravatar.com/site/implement/images/">MD5 hash of a user's email</a>.  For some reason, the plugin had overwritten the <code>avatar</code> field with the text <code>http://identicon</code>.</p>

<p>This MySQL query finds all the comment IDs which have that dodgy text:</p>

<pre><code class="language-sql">SELECT 
   comment_id FROM `wp_commentmeta` 
   WHERE `meta_key` LIKE 'avatar' 
   AND `meta_value` LIKE 'http://identicon'
</code></pre>

<p>Using a <a href="https://www.mysqltutorial.org/mysql-subquery/">SubQuery</a> we can find all the email addresses for those comments - and generate an MD5 for them:</p>

<pre><code class="language-sql">SELECT 
   comment_author_email, 
   MD5(comment_author_email) AS md5 
   FROM `wp_comments` 
   WHERE comment_id IN
      (SELECT 
         comment_id FROM `wp_commentmeta` 
         WHERE `meta_key` LIKE 'avatar' 
         AND `meta_value` LIKE 'http://identicon')
</code></pre>

<p>A good first step, but how to get all those MD5s into URLs and then back into the database?</p>

<p>First, they need to be <a href="https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_concat">concat</a>enated.</p>

<pre><code class="language-sql">SELECT 
   comment_id, 
   CONCAT("https://www.gravatar.com/avatar/", MD5(comment_author_email)) AS gravatar
   FROM `wp_comments` 
   WHERE comment_id IN
      (SELECT 
         comment_id FROM `wp_commentmeta` 
         WHERE `meta_key` LIKE 'avatar' 
         AND `meta_value` LIKE 'http://identicon')
</code></pre>

<p>Let's put that all together in one query:</p>

<pre><code class="language-sql">UPDATE wp_commentmeta
   JOIN wp_comments
   ON wp_commentmeta.comment_id = wp_comments.comment_ID
   SET meta_value = CONCAT("https://www.gravatar.com/avatar/", MD5(comment_author_email))
   WHERE meta_key LIKE 'avatar'
   AND meta_value LIKE "http://identicon"
</code></pre>

<p>Run that, and all your broken Gravatars will be regenerated.</p>
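
<p>If you're nervous about running an <code>UPDATE</code> blind, a dry run along these lines should show what would change. It's a sketch using the same join and filters as above, plus the <code>meta_id</code> column from the standard WordPress schema. The <code>LOWER(TRIM())</code> wrapper is my assumption, not part of the original fix: Gravatar's documentation says to hash the trimmed, lower-cased address, so it may matter if any commenters typed their email with capitals or stray spaces.</p>

<pre><code class="language-sql">-- Dry run: list the rows the UPDATE would touch and the value each would get.
SELECT
    wp_commentmeta.meta_id,
    wp_commentmeta.comment_id,
    wp_commentmeta.meta_value AS old_value,
    CONCAT("https://www.gravatar.com/avatar/",
           MD5(LOWER(TRIM(comment_author_email)))) AS new_value
FROM wp_commentmeta
JOIN wp_comments
    ON wp_commentmeta.comment_id = wp_comments.comment_ID
WHERE meta_key LIKE 'avatar'
AND meta_value LIKE "http://identicon";
</code></pre>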

<p>Thanks to <a href="https://twitter.com/edjeff/status/1212430155655917569">Ed Jefferson</a> for his help with this!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2020/01/howto-regenerate-gravatars-in-wordpress/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Open Source Shakespeare (in MySQL)]]></title>
		<link>https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/</link>
					<comments>https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 20 Apr 2012 14:15:47 +0000</pubDate>
				<category><![CDATA[Shakespeare]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[shakespeare]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=5583</guid>

					<description><![CDATA[My good friend Richard Brent has often complained that my blog has very little Shakespeare content. Despite the domain name, I don&#039;t think I&#039;ve ever blogged about The Big S.  For shame!  Fear not, my Brentish-Boy, this post is all about Shakespeare. And MySQL....  Ahem...  When I first started shkspr.mobi it was intended to be an easy way to get Shakespeare on your phone.  At that time, there…]]></description>
										<content:encoded><![CDATA[<p>My good friend <a href="http://www.linkedin.com/in/rbrent">Richard Brent</a> has often complained that my blog has very little Shakespeare content. Despite the domain name, I don't think I've <em>ever</em> blogged about The Big S.  For shame!  Fear not, <a href="http://www.jabberwocky.com/carroll/jabber/jabberwocky.html">my Brentish-Boy</a>, this post is all about Shakespeare. And MySQL....</p>

<p>Ahem...</p>

<p>When I first started <a href="https://shkspr.mobi/">shkspr.mobi</a> it was intended to be an easy way to get Shakespeare on your phone.  At that time, there were no mobile formatted texts of his plays and sonnets, so I had to create them.  Finding Shakespeare's works in a suitable format for conversion wasn't too hard - but it meant lots of crufty code to read text files line-by-line. Yuck.</p>

<p>A few years later, I stumbled across <a href="http://www.opensourceshakespeare.org/">Open Source Shakespeare</a>.  The project grew out of <a href="http://www.opensourceshakespeare.org/info/paper_toc.php">Eric Johnson's MA thesis</a>.  It's a remarkably good idea with only one <em>minor</em> problem.  The database it uses is Microsoft Access.</p>

<p>MS Access, as a database, could best be described as</p>

<blockquote><p>deformed, crooked, old and sere, ill faced, worse bodied, shapeless everywhere, vicious, ungentle, foolish, blunt, unkind, stigmatical in making, worse in mind</p><p>(Comedy of Errors, Act IV, Scene II)</p></blockquote>

<p>There are a few Open Source Shakespeare projects on GitHub, but they don't seem very practical.</p>

<p>So, naturally, I've decided to create my own version of Shakespeare's works - in MySQL :-)</p>

<p>This is what it looks like:
<a href="https://github.com/edent/Open-Source-Shakespeare"><img src="https://shkspr.mobi/blog/wp-content/uploads/2012/04/Shkspr-MySQL.png" alt="Shkspr MySQL" title="Shkspr MySQL" width="600" height="240" class="aligncenter size-full wp-image-5592"></a>
You can <a href="https://github.com/edent/Open-Source-Shakespeare">download it from GitHub</a>.</p>

<p>I've stripped out a lot of the extraneous stuff from the original version - word counts, etc.  So it should be a fairly lean database which is easy to use.  I'm not a database professional, so I would be grateful if you could suggest any improvements. Either using this blog's comment form or on <a href="https://github.com/edent/Open-Source-Shakespeare">GitHub</a>.</p>

<p>There are four tables:</p>

<h2 id="paragraphs"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#paragraphs">Paragraphs</a></h2>

<p>This is where the main body of text is.  A typical row will look like this</p>

<ul>
    <li>WorkID: hamlet</li>
    <li>ParagraphID: 639015</li>
    <li>ParagraphNum: 3427</li>
    <li>CharID: hamlet</li>
    <li>PlainText: Has this fellow no feeling of his business, that he sings at grave-making?</li>
    <li>Act: 5</li>
    <li>Scene: 1</li>
</ul>

<h2 id="works"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#works">Works</a></h2>

<p>This is what translates the "WorkID" into something human readable - plus some extra metadata</p>

<ul>
    <li>WorkID: hamlet</li>
    <li>Title: Hamlet</li>
    <li>LongTitle: Tragedy of Hamlet, Prince of Denmark, The</li>
    <li>Date: 1600</li>
    <li>GenreType: Tragedy</li>
</ul>

<h2 id="character"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#character">Character</a></h2>

<p>This is what translates the CharID into a human readable name and description</p>

<ul>
    <li>charID: hamlet</li>
    <li>CharName: Hamlet</li>
    <li>Abbrev: Ham</li>
    <li>Works: Tragedy of Hamlet, Prince of Denmark, The</li>
    <li>Description: son of the former king and nephew to the present king</li>
</ul>

<h2 id="chapters"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#chapters">Chapters</a></h2>

<p>This gives the setting for each Act and Scene.</p>

<ul>
    <li>WorkID: hamlet</li>
    <li>ChapterID: 18893</li>
    <li>Act: 5</li>
    <li>Scene: 1</li>
    <li>Description: Elsinore. A churchyard.</li>
</ul>
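
<p>To give a flavour of how the four tables fit together, here is a sketch of a query which pulls every line of Act 5, Scene 1 of Hamlet along with who speaks it and where the scene is set. I'm assuming the tables are named after the headings above (<code>Paragraphs</code>, <code>Works</code>, <code>Character</code>, <code>Chapters</code>) and that the columns are exactly as listed, so check the names against the download before relying on it.</p>

<pre><code class="language-sql">-- Sketch only: table names are assumed from the headings above.
-- CHARACTER is a reserved word in MySQL, so that table name is quoted.
SELECT
    w.Title,
    ch.Description AS setting,
    c.CharName,
    p.PlainText
FROM Paragraphs AS p
JOIN Works       AS w  ON w.WorkID = p.WorkID
JOIN `Character` AS c  ON c.CharID = p.CharID
JOIN Chapters    AS ch ON ch.WorkID = p.WorkID
                      AND ch.Act    = p.Act
                      AND ch.Scene  = p.Scene
WHERE p.WorkID = 'hamlet'
  AND p.Act    = 5
  AND p.Scene  = 1
ORDER BY p.ParagraphNum;
</code></pre>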

<h2 id="whats-next"><a href="https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/#whats-next">What's Next?</a></h2>

<p>The next steps for the project are fairly obvious:</p>

<ol>
    <li>Write some high level example code to show people how to use the database.</li>
    <li>Make <a href="https://shkspr.mobi/">shkspr.mobi</a> a showcase site which runs off the database.</li>
    <li>Fix any bugs and inconsistencies that people find.</li>
</ol>

<p>You can <a href="https://github.com/edent/Open-Source-Shakespeare">download the Shakespeare MySQL Database from GitHub</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2012/04/open-source-shakespeare-in-mysql/feed/</wfw:commentRss>
			<slash:comments>11</slash:comments>
		
		
			</item>
	</channel>
</rss>
