<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>utf-8 &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/utf-8/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Fri, 23 Jan 2026 07:21:24 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>utf-8 &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[A small collection of text-only websites]]></title>
		<link>https://shkspr.mobi/blog/2025/12/a-small-collection-of-text-only-websites/</link>
					<comments>https://shkspr.mobi/blog/2025/12/a-small-collection-of-text-only-websites/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Tue, 30 Dec 2025 12:34:20 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[blogging]]></category>
		<category><![CDATA[blogs]]></category>
		<category><![CDATA[text]]></category>
		<category><![CDATA[unicode]]></category>
		<category><![CDATA[utf-8]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=64224</guid>

					<description><![CDATA[A couple of years ago, I started serving my blog posts as plain text.  Add .txt to the end of any URL and get a deliciously lo-fi, UTF-8, mono[chrome&#124;space] alternative.  Here&#039;s this post in plain text - https://shkspr.mobi/blog/2025/12/a-small-collection-of-text-only-websites.txt  Obviously a webpage without links is like a fish without a bicycle, but the joy of the web is that there are no…]]></description>
										<content:encoded><![CDATA[<p>A couple of years ago, I started <a href="https://shkspr.mobi/blog/2024/05/link-relalternate-typetext-plain/">serving my blog posts as plain text</a>.  Add <code>.txt</code> to the end of any URL and get a deliciously lo-fi, UTF-8, mono[chrome|space] alternative.</p>

<p>Here's this post in plain text - <a href="https://shkspr.mobi/blog/2025/12/a-small-collection-of-text-only-websites.txt">https://shkspr.mobi/blog/2025/12/a-small-collection-of-text-only-websites.txt</a></p>

<p>Obviously a webpage without links is like a fish without a bicycle, but the joy of the web is that there are no gatekeepers. People can try new concepts and, if enough people join in, it becomes normal.  I'm not saying that plain text is the <em>best</em> web experience. But it is <em>an</em> experience. Perfect if you like your browsing fast, simple, and readable. There are no cookie banners, pop-ups, permission prompts, autoplaying videos, or garish colour schemes.</p>

<p>I'm certainly not the first person to do this, so I thought it might be fun to gather a list of websites which you can browse in text-only mode.  If you know of any more - including your own site - please drop a comment in the box!</p>

<ul>
<li><a href="https://shkspr.mobi/blog/2024/05/link-relalternate-typetext-plain/">Terence Eden's blog</a> - add <code>.txt</code> to any URL.</li>
<li><a href="https://daringfireball.net/2025/10/apple_uk_lawsuit_app_store_commissions.text">Daring Fireball</a> - add <code>.text</code> to any URL.</li>
<li><a href="https://flower.codes/2025/10/23/onion-mirror.txt">Zach Flowers</a> - replace <code>.html</code> with <code>.txt</code>.</li>
<li><a href="https://fabien.benetou.fr/Content/SwappingPartsOfTheRestrictionStack?action=source">Fabien Benetou's PIM</a> - add <code>?action=source</code> to any URL.</li>
<li><a href="https://m0yng.uk/2025/03/Tracking-the-benefits-of-Solar-and-Battery.txt">M0YNG</a> - add <code>.txt</code> to any URL.</li>
<li><a href="https://gwern.net/speedrunning.md">Gwern</a> - add <code>.md</code> to any URL or send an HTTP Accept for Markdown.</li>
<li><a href="https://textplain.blog/">Dan Q's textplain.blog</a> - the <em>entire</em> blog is plain text!</li>
<li><a href="https://nooshu.com/feed/feed.txt">Matt Hobbs</a> - there is a <em>feed</em> of plaintext which allows you to read recent posts.</li>
<li><a href="https://www.bananas-playground.net/projekt/portagefilelist/index.txt">Bananas Playground</a> - add <code>index.txt</code> to any post. Also works with <code>index.md</code>.</li>
<li><a href="https://www.jorsys.org/index.md">Jorvik Systems</a> - change <code>.html</code> to <code>.md</code> for Markdown.</li>
<li><a href="https://blog.omgmog.net/post/moving-to-github-actions-and-adding-txt-posts.txt">Max Glenister's blog</a> - add <code>.txt</code> to any post's URL.</li>
<li><a href="https://notes.philippdubach.com/0003.txt">Philipp Dubach's notes</a> - add <code>.txt</code> to any post's URL.</li>
<li><a href="https://derickrethans.nl/php-500.txt">Derick Rethans' blog</a> - add <code>.txt</code> to any post's URL.</li>
<li><a href="https://4c6e.xyz/TEXT-MANIFEST.txt">Ricardson's blog</a> - add <code>.txt</code> to any post's URL.</li>
<li><a href="https://elle.sh/blog">elle's blog</a> - text mode only available via <code>curl elle.sh/blog</code>.</li>
<li><a href="https://www.benji.dog/notes/1761683274.txt">Benji.dog</a> - add <code>.txt</code> to any note's URL.</li>
</ul>

<p>If you'd like to add a site, please get in touch. The rules are simple - content which has the MIME type of <code>text/plain</code>. No HTML, no multimedia, no RTF, no XML, no ANSI colour escape sequences.</p>

<p>Emoji are fine though; emoji are cool.</p>
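
<p>For anyone checking a site against those rules, here is a minimal sketch in Python. It only inspects the <code>Content-Type</code> header and looks for the escape character which starts ANSI colour sequences. The function names are illustrative, and <code>fetch_content_type</code> assumes the server answers <code>HEAD</code> requests, which not every server does.</p>

<pre lang="python">import urllib.request


def media_type(content_type_header: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_plain_text(content_type_header: str) -> bool:
    """True when the header declares text/plain."""
    return media_type(content_type_header) == "text/plain"


def has_ansi_escapes(body: str) -> bool:
    """True when the body contains the ESC control character."""
    return "\x1b" in body


def fetch_content_type(url: str) -> str:
    """Ask a server which Content-Type it sends for a URL (HEAD request)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Type", "")
</pre>

<p>A header of <code>text/plain; charset=UTF-8</code> passes; <code>text/html</code> does not. Emoji in the body make no difference to either check.</p>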
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/12/a-small-collection-of-text-only-websites/feed/</wfw:commentRss>
			<slash:comments>22</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Why doesn't Disney+ support accents in profile names?]]></title>
		<link>https://shkspr.mobi/blog/2021/12/why-doesnt-disney-support-accents-in-profile-names/</link>
					<comments>https://shkspr.mobi/blog/2021/12/why-doesnt-disney-support-accents-in-profile-names/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 23 Dec 2021 12:34:58 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[ascii]]></category>
		<category><![CDATA[disney]]></category>
		<category><![CDATA[Star Wars]]></category>
		<category><![CDATA[unicode]]></category>
		<category><![CDATA[utf-8]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=41367</guid>

					<description><![CDATA[Because I&#039;m genetically pre-disposed to watch every piece of Star Wars content ever created, I signed up for a free trial of Disney&#039;s newest streaming service.  As part of onboarding, it asked me to create a profile name. This is typically done so that multi-user households can have separate profiles and preferences. Mum doesn&#039;t have her princess stories disrupting Dad&#039;s suggestions. And Junior…]]></description>
										<content:encoded><![CDATA[<p>Because I'm genetically pre-disposed to watch every piece of Star Wars content ever created, I signed up for a free trial of Disney's newest streaming service.</p>

<p>As part of onboarding, it asked me to create a profile name. This is typically done so that multi-user households can have separate profiles and preferences. Mum doesn't have her princess stories disrupting Dad's suggestions. And Junior doesn't see what filth their parents are watching late at night. All the better to build up detailed tracking profiles on you, my dear!</p>

<p>Naturally, my first thought was to see if this was exploitable in the form of a self reflected XSS. It was not. In fact, it didn't let through any character which wasn't A-Z or 0-9. To my surprise, it also allowed spaces. So no accents, apostrophes, macrons, or other pesky "foreign" characters.</p>

<p>Including, amusingly, the names of several Disney characters.
<img src="https://shkspr.mobi/blog/wp-content/uploads/2021/12/Apostrophe-fs8.png" alt="An apostrophe in Donald O'Duck causes the profile name to display an error." width="1104" height="425" class="aligncenter size-full wp-image-41368"></p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2021/12/Chinese-fs8.png" alt="A set of Chinese characters causes the profile name to display an error." width="1103" height="416" class="aligncenter size-full wp-image-41369">

<img src="https://shkspr.mobi/blog/wp-content/uploads/2021/12/Moana-fs8.png" alt="A macron on Moana causes the profile name to display an error." width="1103" height="442" class="aligncenter size-full wp-image-41370">

<img src="https://shkspr.mobi/blog/wp-content/uploads/2021/12/Accent-fs8.png" alt="The French name Lumière causes the profile name to display an error." width="1104" height="396" class="aligncenter size-full wp-image-41371">

<p>OK, that's a bit daft. But it's also needlessly exclusionary. Every class at school has a kid who has to fight for their right to have their name spelled correctly. There are plenty of blended families with hyphenated surnames. Not everyone in the UK speaks a Latin-derived language.</p>

<p>It is technologically illiterate to restrict profile names like this.</p>
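
<p>To illustrate the alternative, here is a rough sketch in Python. The first function assumes the restriction behaves like a simple ASCII allow-list (case-insensitive, which is a guess); how Disney+ actually implements it isn't known. The second accepts letters, combining marks, and digits from any script, plus a handful of punctuation characters which are assumed to be enough for this example.</p>

<pre lang="python">import re
import unicodedata

# Assumed extra characters which real names commonly need.
EXTRA_NAME_CHARACTERS = {" ", "'", "\u2019", "-"}


def passes_ascii_allow_list(name: str) -> bool:
    """Mimic the observed behaviour: only A-Z, a-z, 0-9 and spaces."""
    return re.fullmatch(r"[A-Za-z0-9 ]+", name) is not None


def passes_unicode_check(name: str) -> bool:
    """Accept letters, combining marks and digits from any script."""
    normalised = unicodedata.normalize("NFC", name).strip()
    if not normalised:
        return False
    for character in normalised:
        if character in EXTRA_NAME_CHARACTERS:
            continue
        # General categories starting with L, M or N are letters, marks and numbers.
        if unicodedata.category(character)[0] in "LMN":
            continue
        return False
    return True
</pre>

<p>With this sketch, "Donald O'Duck", "Lumière", and "Moāna" fail the first check but pass the second. Neither check is an XSS defence; that job belongs to escaping the name wherever it is displayed.</p>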
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2021/12/why-doesnt-disney-support-accents-in-profile-names/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Subsetting (Chinese) Fonts]]></title>
		<link>https://shkspr.mobi/blog/2013/05/subsetting-chinese-fonts/</link>
					<comments>https://shkspr.mobi/blog/2013/05/subsetting-chinese-fonts/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 22 May 2013 07:52:11 +0000</pubDate>
				<category><![CDATA[linux]]></category>
		<category><![CDATA[chinese]]></category>
		<category><![CDATA[font]]></category>
		<category><![CDATA[fonts]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[unicode]]></category>
		<category><![CDATA[utf-8]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=8294</guid>

					<description><![CDATA[There are loads of really delightful Simplified and Traditional Chinese True Type Fonts available on the web.  There&#039;s only one issue - the file sizes are really large.  In many cases, too large to effectively use as a web-font.  For example, this calligraphy style font is 3.4MB.    The beautiful Paper Cut Font weighs in at 14MB!    That file-size is far too heavy to embed on a web page. …]]></description>
										<content:encoded><![CDATA[<p>There are loads of really delightful <a href="https://web.archive.org/web/20130520092556/http://chinesefont.brushes8.com/tag/simplified-chinese-font">Simplified and Traditional Chinese True Type Fonts</a> available on the web.  There's only one issue - the file sizes are really large.  In many cases, too large to effectively use as a web-font.</p>

<p>For example, this <a href="https://web.archive.org/web/20121219024439/http://chinesefont.brushes8.com/richwin-fonts/richwin-xing-kai-jian-fan-font.html">calligraphy style font</a> is 3.4MB.</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2013/05/Richwin-Xing-kai-jian-Fan-Font-fs8.png" alt="Richwin-Xing-kai-jian-Fan-Font-fs8" width="445" height="79" class="alignnone size-full wp-image-8296">

<p>The beautiful <a href="https://web.archive.org/web/20140608004145/http://chinesefont.brushes8.com/xin-di-paper-cut-font-simplified-chinese.html">Paper Cut Font</a> weighs in at 14MB!</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2013/05/Paper-Cut-Chinese-Font-fs8.png" alt="Paper Cut Chinese Font-fs8" width="487" height="223" class="alignnone size-full wp-image-8297">

<p>That file-size is far too heavy to embed on a web page.</p>

<h2 id="subsetting"><a href="https://shkspr.mobi/blog/2013/05/subsetting-chinese-fonts/#subsetting">Subsetting</a></h2>

<p>Generally speaking, font files like .ttf contain a representation of every single character. 0-9, a-z, A-Z, all the punctuation, non-English characters etc.</p>

<p>That's really useful if the font is installed on your computer and you want to write a document which <em>could</em> contain every character.  It's less helpful if you want to use a fancy font on your website's headers.</p>

<p>Subsetting is the act of creating a subset of a font.  That is, a font file which only contains specific characters.</p>

<p>Let's suppose that we only want a specific phrase rendered in this font.</p>

<pre>我很丢脸。我没有吃Fruity Oaty Bar</pre>

<p>We only need 18 unique characters (counting the space) - we can get rid of any character which doesn't appear in that phrase.</p>
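
<p>Working out which characters a phrase needs takes very little code. A minimal Python sketch (the function name is illustrative) which keeps each character once, in order of first appearance:</p>

<pre lang="python">def unique_characters(text: str) -> list[str]:
    """Return each distinct character once, in order of first appearance."""
    # dict.fromkeys() drops duplicates while preserving insertion order.
    return list(dict.fromkeys(text))


phrase = "我很丢脸。我没有吃Fruity Oaty Bar"
needed = unique_characters(phrase)

# The characters a subset font would have to contain, and how many there are.
print("".join(needed))
print(len(needed))
</pre>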

<p>There are several font manipulation tools available.  I've chosen <a href="https://web.archive.org/web/20130522120140/http://fonts.philip.html5.org/">Font Optimizer</a> which has an excellent live demo page.  The <a href="https://web.archive.org/web/20160322095140/https://bitbucket.org/philip/font-optimizer/overview">source code is on BitBucket</a>.</p>

<p>The command line syntax is really simple:</p>

<pre>./subset.pl --chars="我很丢脸。我没有吃Fruity Oaty Bar" input.ttf output.ttf</pre>

<p>The file size reduction is impressive.  My original font was over 14MB.  The optimized one is 32<strong>K</strong>B.</p>

<pre>14,066,456 input.ttf
   32,084 output.ttf
</pre>

<p>The process runs instantly - fast enough to run as a web service to generate these fonts dynamically, I would think.</p>

<p>One could quite easily create a scrap of JavaScript which read the contents of a block of text and then requested a font which contained only the necessary characters.</p>
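
<p>As a rough sketch of that idea, here it is in Python rather than JavaScript. The endpoint URL and its <code>font</code> and <code>chars</code> parameters are entirely hypothetical; no such service is described here.</p>

<pre lang="python">from urllib.parse import urlencode

# Hypothetical endpoint, used purely for illustration.
SUBSET_SERVICE = "https://example.com/subset"


def subset_request_url(font_name: str, text: str) -> str:
    """Build a URL asking for font_name restricted to the characters in text."""
    # Sort the distinct characters so that the same set of characters always
    # produces the same URL, which keeps the response cacheable.
    characters = "".join(sorted(set(text)))
    query = urlencode({"font": font_name, "chars": characters})
    return SUBSET_SERVICE + "?" + query
</pre>

<p>Because only the set of characters matters, two headings which use the same characters in a different order would share one URL and one cached font file.</p>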

<p>Apparently, <a href="https://web.archive.org/web/20130620105955/http://scripts.sil.org/cms/scripts/page.php?item_id=OFL_web_fonts_and_RFNs#b4599c52">Monotype have a proprietary and patent-pending solution</a> to this rather trivial application.</p>

<h2 id="uses"><a href="https://shkspr.mobi/blog/2013/05/subsetting-chinese-fonts/#uses">Uses</a></h2>

<p>Being able to subset fonts to reduce file size is incredibly useful.  Supposing you want a different font for body text, headers, and navigation.  Rather than having to load three large font files containing every character in the known universe, you could subset each one for only exactly the relevant characters.</p>

<p>This also has an interesting DRM-like effect.  Some people don't want their shiny web fonts to be downloaded and used as a regular font.  With subsetting, the font only contains the specific characters used on the page, so a downloaded copy is of little use for anything else.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2013/05/subsetting-chinese-fonts/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[A UTF-8 Aware substr_replace (for use in App.net)]]></title>
		<link>https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/</link>
					<comments>https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 06 Sep 2012 08:30:40 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[app.net]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[utf-8]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=6289</guid>

					<description><![CDATA[So, I stayed up bashing my head against a brick wall all last night! PHP&#039;s string functions aren&#039;t (yet) UTF-8 aware.  This is a replacement for substr_replace which should work on UTF-8 Strings:  function utf8_substr_replace($original, $replacement, $position, $length) {     $startString = mb_substr($original, 0, $position, &#34;UTF-8&#34;);     $endString = mb_substr($original, $position + $length,…]]></description>
										<content:encoded><![CDATA[<p>So, I stayed up bashing my head against a brick wall all last night! PHP's string functions aren't (yet) UTF-8 aware.</p>

<p>This is a replacement for substr_replace which <em>should</em> work on UTF-8 Strings:</p>

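<pre lang="php">function utf8_substr_replace($original, $replacement, $position, $length)
{
    // Everything before the span being replaced, counted in characters, not bytes.
    $startString = mb_substr($original, 0, $position, "UTF-8");
    // Everything after the span. The encoding is passed to mb_strlen() as well,
    // so the length doesn't depend on PHP's internal encoding setting.
    $endString = mb_substr($original, $position + $length, mb_strlen($original, "UTF-8"), "UTF-8");

    // Stitch the untouched parts back together around the replacement.
    $out = $startString . $replacement . $endString;

    return $out;
}
</pre>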

<p>Take this typical string from App.net:</p>

<pre>» Hello @bob how are you?</pre>

<p>According to App.net's entities, @bob occurs at position 9 and has a length of 3.</p>

<p>Normally, we would just use substr_replace.</p>

<p>However, PHP's standard string functions count bytes, not characters. A multi-byte UTF-8 character like "»" takes up two bytes.  So PHP thinks that the position of @bob is 10.</p>

<p>Arse.</p>
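
<p>The mismatch is easy to reproduce. A minimal sketch in Python, which indexes strings by character but can also expose the underlying UTF-8 bytes:</p>

<pre lang="python">text = "» Hello @bob how are you?"

# Position counted in Unicode characters, as the entity metadata does.
character_position = text.index("bob")

# Position counted in UTF-8 bytes, as a byte-oriented string function does.
byte_position = text.encode("utf-8").index(b"bob")

print(character_position, byte_position)  # 9 10
print(len("»"), len("»".encode("utf-8")))  # 1 2
</pre>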

<p>So, given we have the position of the substring, and its length, we can use <a href="http://uk3.php.net/mb_substr">PHP's multibyte functions</a> to split the string in two.</p>

<p>First,</p>

<pre lang="php">$startString = mb_substr($originalString, 0, $position, "UTF-8");
</pre>

<p>Gives us:</p>

<pre>» Hello @</pre>

<p>Secondly,</p>

<pre lang="php">$endString = mb_substr($originalString, $position + $length, mb_strlen($originalString, "UTF-8"), "UTF-8");
</pre>

<p>Gives us:</p>

<pre> how are you?</pre>

<p>Finally, we stitch them back together:</p>

<pre lang="php">$newString = $startString . $replacement . $endString;
</pre>
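
<p>For comparison, here is a sketch of the same split-and-stitch approach in Python, where strings are already indexed by Unicode code point, so no multibyte helper is needed. The function name is illustrative.</p>

<pre lang="python">def replace_span(original: str, replacement: str, position: int, length: int) -> str:
    """Replace length characters of original, starting at position."""
    # Everything before the span being replaced.
    start = original[:position]
    # Everything after the span being replaced.
    end = original[position + length:]
    return start + replacement + end


print(replace_span("» Hello @bob how are you?", "alice", 9, 3))
</pre>

<p>Given the position and length from earlier, this swaps "bob" for "alice" and leaves the rest of the string untouched.</p>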
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2012/09/a-utf-8-aware-substr_replace-for-use-in-app-net/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
