<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>languages &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/languages/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Fri, 09 Jan 2026 08:47:38 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>languages &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[Should HTML's code blocks be translated?]]></title>
		<link>https://shkspr.mobi/blog/2026/01/should-htmls-blocks-be-translated/</link>
					<comments>https://shkspr.mobi/blog/2026/01/should-htmls-blocks-be-translated/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 16 Jan 2026 12:34:53 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[languages]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=63046</guid>

					<description><![CDATA[I was recently prompted to test my blog&#039;s layout when rendered in right-to-left text. Running a website through an automatic translator into a language like Arabic or Hebrew will show you any weird little layout glitches which might occur.  But mechanical translation is a bit of an unthinking brute.  In this example, I had a code snippet which contained the word &#34;link&#34;.    Should that word be…]]></description>
										<content:encoded><![CDATA[<p>I was recently prompted to <a href="https://bsky.app/profile/vale.rocks/post/3lxgvpipy4k2q">test my blog's layout when rendered in right-to-left text</a>. Running a website through an automatic translator into a language like Arabic or Hebrew will show you any weird little layout glitches which might occur.</p>

<p>But mechanical translation is a bit of an unthinking brute.  In this example, I had a code snippet which contained the word "link".</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/08/translate.webp" alt="HTML code block, one of the element names is rendered in Arabic." width="1008" height="567" class="aligncenter size-full wp-image-63048">

<p>Should that word be translated? Obviously not! The code isn't valid unless the element name is in English - and it probably doesn't make sense to reverse the text direction.</p>

<p>Luckily, the HTML specification allows authors to mark specific bits of their page as unsuitable for automatic translations. <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Global_attributes/translate">The <code>translate</code> global attribute</a> can be applied to your markup like this:</p>

<pre><code class="language-html">&lt;code translate="no"&gt;
   &amp;lt;link … &amp;gt;
   &amp;lt;meta … &amp;gt;
   &amp;lt;strong&amp;gt;Hello&amp;lt;/strong&amp;gt;
&lt;/code&gt;
</code></pre>

<p>Nothing inside that code block will be translated. Hurrah!</p>

<p>But there are some problems with this approach.</p>

<p>Consider this pseudo-code:</p>

<pre><code class="language-_">// Reverse the polarity of the neutron flow.
$neutron = $atom.flow( direction="backwards" );
</code></pre>

<p>Fairly obviously, the code itself shouldn't be translated. It simply won't run unless the syntax is precisely as written. But what about the comment at the top? It would probably be useful to have that translated, right?</p>

<p>It is possible to mark up different parts of a document to be translatable even if their parent isn't:</p>

<pre><code class="language-html">&lt;code translate="no"&gt;
   &lt;span translate="yes"&gt;// Reverse the polarity of the neutron flow.&lt;/span&gt;
   $neutron = $atom.flow( direction="backwards" );
&lt;/code&gt;
</code></pre>

<p>At least, that's my understanding of <a href="https://html.spec.whatwg.org/multipage/dom.html#attr-translate">the specification</a>.</p>
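
<p>If your syntax highlighter already wraps comments in their own elements, a few lines of script could apply both attributes for you. This is only a rough sketch: the <code>markTranslatable</code> function and its <code>commentSelector</code> parameter are hypothetical names, and the selector you pass in depends entirely on which highlighter you use.</p>

<pre><code class="language-js">// Hypothetical helper: not part of any library.
// root is a document or element; commentSelector is a CSS selector
// matching whatever elements your highlighter wraps comments in.
function markTranslatable(root, commentSelector) {
    root.querySelectorAll("pre code").forEach(function (block) {
        // The code itself must stay exactly as written.
        block.setAttribute("translate", "no");
        // Comments inside the block are prose, so opt them back in.
        block.querySelectorAll(commentSelector).forEach(function (comment) {
            comment.setAttribute("translate", "yes");
        });
    });
}
</code></pre>

<p>In a browser, you might call it as <code>markTranslatable(document, ".comment")</code>, where <code>.comment</code> is a stand-in for your highlighter's real class name.</p>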

<p>This brings us on to another complex problem. Consider this code block which might be embedded in a page as an example:</p>

<pre><code class="language-js">// Ensure the age is calculated from the user's birthday
var age = today.date - user.birthday;
</code></pre>

<p>If translated into Chinese, the comment might say:</p>

<pre><code class="language-js">// 确保年龄是根据用户的生日计算的
var age = today.date - user.birthday;
</code></pre>

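<p>But is it useful for the words in a comment to differ from the variable names in the code? Here the comment talks about 年龄 and 生日, while the code still says <code>age</code> and <code>birthday</code>.</p>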

<p>In some contexts yes, in others no!</p>

<p>And that's where we hit the limits of the current crop of machine-translation algorithms. Without a holistic view of the entire page, and a semantic understanding of how previous words relate to subsequent words, there will always be glitches and gotchas like this.</p>

<p>For now, I'm marking my code blocks as non-translatable but letting comments be fully translated. If you have strong opinions about this - please leave a comment!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2026/01/should-htmls-blocks-be-translated/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Episode 19: Ancient Japanese]]></title>
		<link>https://shkspr.mobi/blog/2014/12/episode-19-ancient-japanese/</link>
					<comments>https://shkspr.mobi/blog/2014/12/episode-19-ancient-japanese/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 18 Dec 2014 06:32:13 +0000</pubDate>
				<category><![CDATA[About A Minute]]></category>
		<category><![CDATA[languages]]></category>
		<category><![CDATA[podcast]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=20202</guid>

					<description><![CDATA[Talking to Kerri Russell about the Oxford Corpus of Old Japanese.    	🔊 Kerri Russell on Ancient Japanese🎤 Terence Eden 	 	 		💾 Download this audio file. 	     About A Minute is an amuse-bouche for podcast listeners. No long intro and outro. No waffling on. No adverts, competitions, arguing, or begging for iTunes reviews.  You get to listen to an interesting person chat for about a minute - that&#039;s…]]></description>
										<content:encoded><![CDATA[<p>Talking to <a href="https://web.archive.org/web/20150402160538/https://www.orinst.ox.ac.uk/staff/ea/japanese/krussel.html">Kerri Russell</a> about the <a href="http://vsarpj.orinst.ox.ac.uk/corpus/">Oxford Corpus of Old Japanese</a>.</p>

<p><img src="https://shkspr.mobi/blog/wp-content/uploads/2014/12/Oxford-Japanese-fs8.png" alt="Oxford Japanese-fs8" width="622" height="50" class="aligncenter size-full wp-image-20203">
</p><figure class="audio">
	<figcaption>🔊 Kerri Russell on Ancient Japanese<br>🎤 Terence Eden</figcaption>
	
	<audio controls="" loading="lazy" src="https://shkspr.mobi/blog/wp-content/uploads/2014/12/AAM-Kerri-Russell-Ancient-Japanese.mp3">
		<p>💾 <a href="https://shkspr.mobi/blog/wp-content/uploads/2014/12/AAM-Kerri-Russell-Ancient-Japanese.mp3">Download this audio file</a>.</p>
	</audio>
</figure><p></p>

<hr>

<p>About A Minute is an <em>amuse-bouche</em> for podcast listeners. No long intro and outro. No waffling on. No adverts, competitions, arguing, or begging for iTunes reviews.  You get to listen to an interesting person chat for about a minute - that's it!</p>

<hr>

<p>Get About A Minute as soon as each episode goes live.</p>

<p><a href="https://shkspr.mobi/blog/category/aam-podcast/feed/">Stick this Podcast Feed into your podcatcher</a>.
Or you can <a href="https://itunes.apple.com/gb/podcast/about-a-minute/id939617328?mt=2&amp;uo=4">subscribe on iTunes</a>.</p>

<p>Intro music <a href="https://www.youtube.com/watch?v=adxr3RGOdrI">"Gran Vals" performed by Brian Streckfus</a>.
<a href="http://thenounproject.com/term/stopwatch/14262/">Stopwatch Icon by Ilsur Aptukov from The Noun Project</a>.</p>

<p><a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" class="alignleft"></a>This podcast is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2014/12/episode-19-ancient-japanese/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		<enclosure url="https://shkspr.mobi/blog/wp-content/uploads/2014/12/AAM-Kerri-Russell-Ancient-Japanese.mp3" length="0" type="audio/mpeg" />

			</item>
		<item>
		<title><![CDATA[QRpedia - Dealing With Minority Languages]]></title>
		<link>https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/</link>
					<comments>https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sun, 09 Oct 2011 14:48:02 +0000</pubDate>
				<category><![CDATA[qrpedia]]></category>
		<category><![CDATA[languages]]></category>
		<category><![CDATA[qr]]></category>
		<category><![CDATA[wikipedia]]></category>
		<guid isPermaLink="false">http://shkspr.mobi/blog/?p=4512</guid>

					<description><![CDATA[Humans have devised hundreds of thousands of languages with which to express themselves. Some, like Cornish are on the verge of extinction. Others, like Catalan and Welsh, are only used by a small number of speakers. Some, like New Norse, are created for political purposes.  All these languages are valuable and hugely important to their communities. Many have a Wikipedia version written in their…]]></description>
										<content:encoded><![CDATA[<p>Humans have devised hundreds of thousands of languages with which to express themselves. Some, like <a href="http://en.wikipedia.org/wiki/Cornish_language">Cornish</a>, are on the verge of extinction. Others, like Catalan and Welsh, are only used by a small number of speakers. Some, like <a href="http://en.wikipedia.org/wiki/Norwegian_language_struggle">New Norse</a>, are created for political purposes.</p>

<p>All these languages are valuable and hugely important to their communities. Many have a Wikipedia version written in their language.</p>

<p>Unfortunately, very few phones support these languages.</p>

<img class="aligncenter size-full wp-image-4514" title="Phone showing list of languages" src="https://shkspr.mobi/blog/wp-content/uploads/2011/10/SC20111009-150255.png" alt="Phone showing list of languages" width="288" height="480">

<p>This poses a problem for QRpedia. The way the system works is this:</p>

<ol>
    <li>Read the phone's language</li>
    <li>Look for a suitable translation in Wikipedia</li>
    <li>Return the correct article</li>
    <li>If a translation doesn't exist, return a list of available articles</li>
</ol>
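
<p>As a rough sketch, those steps might look something like this. The <code>chooseArticle</code> function and the shape of its <code>articles</code> argument are hypothetical, for illustration only; this is not QRpedia's actual code.</p>

<pre><code class="language-js">// Hypothetical sketch: articles maps language codes to article URLs,
// e.g. { en: "http://en.wikipedia.org/wiki/...", cy: "http://cy.wikipedia.org/wiki/..." }
function chooseArticle(phoneLanguage, articles) {
    // 1. Read the phone's language, so "en-GB" becomes "en".
    var code = phoneLanguage.toLowerCase().split("-")[0];

    // 2 and 3. If Wikipedia has the article in that language, return it.
    if (Object.prototype.hasOwnProperty.call(articles, code)) {
        return { redirect: articles[code] };
    }

    // 4. Otherwise, return the list of available articles.
    return { choices: Object.keys(articles) };
}
</code></pre>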

<p>Suppose The National Library of Wales has a QRpedia code for the <a href="http://en.wikipedia.org/wiki/Black_Book_of_Carmarthen">Black Book of Carmarthen</a>.
A Welsh speaker will probably wish to go to the <a href="http://cy.wikipedia.org/wiki/Llyfr_Du_Caerfyrddin">Welsh version of the article</a>.
However, their phone does not support the Welsh language (unless it is a <a href="http://news.bbc.co.uk/1/hi/wales/mid/8183247.stm">Samsung S5600</a>) and is set to English.</p>

<p>QRpedia, therefore, redirects them to the English version and doesn't give them a chance to read in their native language.</p>

<p>This is a problem we have faced with both Catalan and Norwegian.</p>

<h2 id="catalan"><a href="https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/#catalan">Catalan</a></h2>

<p>Catalan faces the very same problem as Welsh does in the previous theoretical example. Many people speak it but, because it's rare for a phone to support it, their phones are set to Spanish.</p>

<p>This was how we solved the problem:</p>

<ul>
    <li>If the QRpedia code was for a Catalan page (ca.wikipedia)...</li>
    <li>If the phone's language is Catalan (CA) take them to the Catalan Wikipedia.</li>
    <li>If the phone's language is Spanish (ES) take them to a language choice screen - they can then select between Spanish, Catalan, or any other available language.</li>
    <li>If the phone's language is anything else (say EN) take them to the article in their language.</li>
</ul>
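
<p>Sketched as code, the Catalan rule might look like this. The <code>routeCatalanCode</code> function and its return values are hypothetical names for illustration, and the final case assumes an article exists in the phone's language.</p>

<pre><code class="language-js">// Hypothetical sketch of the rule for QRpedia codes pointing at ca.wikipedia.
function routeCatalanCode(phoneLanguage) {
    var code = phoneLanguage.toLowerCase().split("-")[0];
    if (code === "ca") {
        return "ca.wikipedia";      // Catalan phone: straight to Catalan.
    }
    if (code === "es") {
        return "language-choice";   // Spanish phone: let the user pick.
    }
    return code + ".wikipedia";     // Anything else: their own language.
}
</code></pre>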

<p>QRpedia doesn't store the user's language choice - so the user has to choose which language they want every time they scan.</p>

<p>The reason we don't store the language choice is that it would be very hard to undo if the user made a mistake, or ever wanted to change their language.</p>

<h2 id="norwegian"><a href="https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/#norwegian">Norwegian</a></h2>

<p>The Norwegians have two languages - <a href="http://en.wikipedia.org/wiki/Bokm%C3%A5l">Bokmål</a> and <a href="http://en.wikipedia.org/wiki/Nynorsk">Nynorsk</a>.</p>

<p>The <a href="http://en.wikipedia.org/wiki/ISO_639-1">standard language codes</a> are NB and NN. However, most phones only support NB - with the language header of NB-NO.
To complicate matters, the NB Wikipedia is located at <a href="http://no.wikipedia.org/wiki/Nynorsk">NO.wikipedia</a>!</p>

<p><a href="http://code.google.com/p/qrwp/issues/detail?id=4&amp;can=1 ">After much discussion with some Norwegians</a>, I discovered that comparatively few people read NN, so we came up with the following fix.</p>

<ul>
    <li>If the phone's language is Bokmål (NB-NO) take them to the NO Wikipedia.</li>
    <li>If the phone's language is Nynorsk (NN-NO) take them to the NN Wikipedia.</li>
</ul>
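
<p>That mapping could be written as a small lookup table. This is only a sketch with hypothetical names (<code>WIKIPEDIA_SUBDOMAINS</code> and <code>wikipediaSubdomain</code>), and it assumes every other language code matches its Wikipedia subdomain, which will not always be true.</p>

<pre><code class="language-js">// Hypothetical lookup: language codes whose Wikipedia lives elsewhere.
var WIKIPEDIA_SUBDOMAINS = {
    nb: "no",   // Bokmål is served from no.wikipedia
    nn: "nn"    // Nynorsk has its own nn.wikipedia
};

function wikipediaSubdomain(phoneLanguage) {
    // "NB-NO" becomes "nb".
    var code = phoneLanguage.toLowerCase().split("-")[0];
    if (Object.prototype.hasOwnProperty.call(WIKIPEDIA_SUBDOMAINS, code)) {
        return WIKIPEDIA_SUBDOMAINS[code];
    }
    // Assumption: other codes are used unchanged.
    return code;
}
</code></pre>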

<p>However, very few phones support NN (none have ever used QRpedia) so I'm not sure if this is the correct approach.</p>

<h2 id="others"><a href="https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/#others">Others</a></h2>

<p>There are lots of other languages with Wikipedia support, but which aren't well supported on phones. <a href="http://meta.wikimedia.org/wiki/List_of_Wikipedias">Wikipedia is available in nearly 300 different languages</a> - from <a href="http://sco.wikipedia.org/wiki/Scots_leid">Scots</a> and <a href="http://simple.wikipedia.org/wiki/Simple_English_Wikipedia">Simple English</a> to <a href="http://eo.wikipedia.org/wiki/Esperanto">Esperanto</a> and <a href="http://la.wikipedia.org/wiki/Lingua_Latina">Latin</a>. Although, curiously, there's no separate Wikipedia for <a href="http://en.wikipedia.org/wiki/British_English">British English</a> - or other regional English variants, nor is there one in <a href="http://meta.wikimedia.org/wiki/History_of_the_Klingon_Wikipedia">Klingon</a>.</p>

<h2 id="the-future"><a href="https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/#the-future">The Future</a></h2>

<p>So, what should QRpedia do in the future? How should it handle all the thousands of languages in conjunction with the hundreds of Wikipedia languages?</p>

<p>That's where <strong>you</strong> come in.</p>

<p>If you've got a good idea about how we should handle your favourite language - drop a comment on this blog.</p>

<p>If you're a coder, <a href="http://code.google.com/p/qrwp/">QRpedia is open-source</a>. Check out the code and leave a comment, or raise a bug.</p>

<p>We need <em>your</em> help to determine what we do next.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2011/10/qrpedia-dealing-with-minority-languages/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
	</channel>
</rss>
