<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>A UTF-8 Aware substr_replace (for use in App.net) &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Fri, 06 Oct 2023 15:40:21 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>A UTF-8 Aware substr_replace (for use in App.net) &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[A UTF-8 Aware substr_replace (for use in App.net)]]></title>
		<link>https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/</link>
					<comments>https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 06 Sep 2012 08:30:40 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[app.net]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[utf-8]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=6289</guid>

					<description><![CDATA[So, I stayed up bashing my head against a brick wall all last night! PHP&#039;s string functions aren&#039;t (yet) UTF-8 aware.  This is a replacement for substr_replace which should work on UTF-8 strings:  function utf8_substr_replace($original, $replacement, $position, $length) {     $startString = mb_substr($original, 0, $position, &#34;UTF-8&#34;);     $endString = mb_substr($original, $position + $length,…]]></description>
										<content:encoded><![CDATA[<p>So, I stayed up bashing my head against a brick wall all last night! PHP's string functions aren't (yet) UTF-8 aware.</p>

<p>This is a replacement for substr_replace which <em>should</em> work on UTF-8 strings:</p>

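<pre lang="php">function utf8_substr_replace($original, $replacement, $position, $length)
{
    // Everything before the substring, counted in characters rather than bytes
    $startString = mb_substr($original, 0, $position, "UTF-8");
    // Everything after it; the length is only an upper bound, so mb_substr stops at the end of the string
    $endString = mb_substr($original, $position + $length, mb_strlen($original, "UTF-8"), "UTF-8");

    // Stitch the two halves back together around the replacement
    $out = $startString . $replacement . $endString;

    return $out;
}
</pre>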

<p>Take this typical string from App.net:</p>

<pre>» Hello @bob how are you?</pre>

<p>According to App.net's entities, the name in @bob starts at position 9 (counting from zero) and has a length of 3.</p>

<p>Normally, we would just use substr_replace.</p>

<p>However, PHP's standard string functions count bytes, not characters, and a Unicode character like "»" takes up two bytes in UTF-8.  So PHP thinks that bob starts at position 10.</p>

<p>Arse.</p>
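
<p>Here's a quick way to see the mismatch for yourself. It's only a rough sketch, and it assumes the mbstring extension is loaded and that the file is saved as UTF-8:</p>

<pre lang="php">$text = "» Hello @bob how are you?";

// The byte-oriented functions see "»" as two bytes
echo strlen("»"), "\n";                          // 2
echo strpos($text, "bob"), "\n";                 // 10

// The multibyte functions count characters instead
echo mb_strlen("»", "UTF-8"), "\n";              // 1
echo mb_strpos($text, "bob", 0, "UTF-8"), "\n";  // 9
</pre>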

<p>So, given we have the position of the substring, and its length, we can use <a href="http://uk3.php.net/mb_substr">PHP's multibyte functions</a> to split the string in two.</p>

<p>First,</p>

<pre lang="php">$startString = mb_substr($originalString, 0, $position, "UTF-8");
</pre>

<p>Gives us:</p>

<pre>» Hello @</pre>

<p>Secondly,</p>

<pre lang="php">$endString = mb_substr($originalString, $position + $length, mb_strlen($originalString, "UTF-8"), "UTF-8");
</pre>

<p>Gives us:</p>

<pre> how are you?</pre>

<p>Finally, we stitch them back together:</p>

<pre lang="php">$newString = $startString . $replacement . $endString;
</pre>
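
<p>Putting it all together with the example string looks something like this. Treat it as a sketch: the replacement name "alice" is made up for illustration, and the function is repeated so the snippet runs on its own (again assuming mbstring and a UTF-8 source file):</p>

<pre lang="php">function utf8_substr_replace($original, $replacement, $position, $length)
{
    $startString = mb_substr($original, 0, $position, "UTF-8");
    $endString = mb_substr($original, $position + $length, mb_strlen($original, "UTF-8"), "UTF-8");

    return $startString . $replacement . $endString;
}

$text = "» Hello @bob how are you?";

// 9 and 3 are the position and length from the entity; "alice" is hypothetical
echo utf8_substr_replace($text, "alice", 9, 3), "\n";
// » Hello @alice how are you?

// The built-in treats 9 as a byte offset, so it lands on the @ instead
echo substr_replace($text, "alice", 9, 3), "\n";
// » Hello aliceb how are you?
</pre>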
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
