<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>What programming language is in this &lt;code&gt; block? &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/2024/08/what-programming-language-is-in-this-code-block/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Tue, 27 Aug 2024 21:35:43 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>What programming language is in this &lt;code&gt; block? &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[What programming language is in this <code> block?]]></title>
		<link>https://shkspr.mobi/blog/2024/08/what-programming-language-is-in-this-code-block/</link>
					<comments>https://shkspr.mobi/blog/2024/08/what-programming-language-is-in-this-code-block/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Tue, 27 Aug 2024 11:34:28 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[schema.org]]></category>
		<category><![CDATA[semantic web]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=52095</guid>

					<description><![CDATA[I&#039;m a little bit obsessed with the idea of Semantic markup. I want the words that I write to be understood by humans and machines.  Imagine this piece of code: print( &#34;Hello, world!&#34; )  Is that code example written in Python? C++? Basic? Go? Perhaps you&#039;re familiar enough with every programming language to tell - but most people aren&#039;t.  Wouldn&#039;t it be nice to give an indication of what…]]></description>
										<content:encoded><![CDATA[<p>I'm a little bit obsessed with the idea of Semantic markup. I want the words that I write to be understood by humans <em>and</em> machines.</p>

<p>Imagine this piece of code: <code>print( "Hello, world!" )</code></p>

<p>Is that code example written in Python? C++? Basic? Go? Perhaps you're familiar enough with every programming language to tell - but most people aren't.  Wouldn't it be nice to give an indication of <em>what</em> programming language is used in an example?</p>

<p>Here's how we might represent it in HTML:</p>

<pre><code class="language-_">&lt;pre&gt;
    &lt;code&gt;
        print( "Hello, world!" )
    &lt;/code&gt;
&lt;/pre&gt;
</code></pre>

<p>How do we let the browser, search engines, and humans know what language that's written in? It might seem obvious to use the <code>lang</code> attribute, right? We're writing in a programming language, so just use <code>&lt;code lang="python"&gt;</code>. Sadly, the HTML specification disagrees.</p>

<blockquote><p>The lang attribute specifies the primary language for the element's contents and for any of the element's attributes that contain text. Its value <strong>must be a valid BCP 47 language tag</strong>, or the empty string. <br> <a href="https://html.spec.whatwg.org/multipage/dom.html#attr-lang">HTML Specification 3.2.6.2 The lang and xml:lang attributes</a> (emphasis added)</p></blockquote>

<p>That means it must be a <em>human</em> language like <code>en</code> or <code>en-GB</code>. Constructed languages such as Klingon (<code>tlh</code>) do have registered tags - but computer languages certainly don't!</p>

<p>Does the specification give any clues about the <code>&lt;code&gt;</code> element?</p>

<blockquote><p>There is <strong>no formal way to indicate the language of computer code</strong> being marked up. Authors who wish to mark code elements with the language used, e.g. so that syntax highlighting scripts can use the right rules, can use the class attribute, e.g. by adding a class prefixed with "language-" to the element.<br><a href="https://html.spec.whatwg.org/multipage/text-level-semantics.html#the-code-element">HTML Specification 4.5.15 The code element</a> (emphasis added)</p></blockquote>
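
<p>As a rough illustration of that convention, here is a minimal Python sketch which pulls the language out of a <code>class</code> attribute value. The function name <code>code_language</code> and the sample class strings are assumptions made for this example - the specification only suggests the <code>language-</code> prefix, it doesn't define how a script should read it back.</p>

<pre><code class="language-python">def code_language(class_attr, prefix="language-"):
    """Return the language named by the first prefixed class token, or None."""
    # A class attribute is a whitespace-separated list of tokens.
    for token in class_attr.split():
        # Skip a bare "language-" with nothing after the prefix.
        if token.startswith(prefix) and token != prefix:
            return token[len(prefix):]
    return None


print(code_language("wp-block-code language-python"))
print(code_language("aligncenter size-full"))
</code></pre>

<p>The first call prints <code>python</code> and the second prints <code>None</code>. That's fine for a syntax highlighter, but a class name is still only a convention rather than something with agreed semantics.</p>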

<p>So we have to turn to our old friend Schema.org! There is a <a href="https://schema.org/SoftwareSourceCode"><code>SoftwareSourceCode</code> type</a> which is used for exactly this case. Sadly, there is no example documentation because Google likes to start up a project but never quite finish it.</p>

<p>Here's how to write a code snippet in HTML and have it semantically expose the programming language used:</p>

<pre><code class="language-HTML">&lt;pre itemscope itemtype="https://schema.org/SoftwareSourceCode"&gt;
    &lt;span itemprop="programmingLanguage"&gt;Python&lt;/span&gt;
    &lt;meta itemprop="codeSampleType" content="Example"&gt;
    &lt;code itemprop="text"&gt;
        print( "Hello, world!" )
    &lt;/code&gt;
&lt;/pre&gt;
</code></pre>

<p>If you <a href="https://validator.schema.org/#url=https%3A%2F%2Fshkspr.mobi%2Fblog%2F2024%2F08%2Fwhat-programming-language-is-in-this-code-block%2F">run that through the validator</a> you'll see what a computer sees:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2024/08/schema-fs8.png" alt="Semantic representation of the code." width="1648" height="360" class="aligncenter size-full wp-image-52102">

<p>The <code>programmingLanguage</code> is a string - so you can write anything you like in there. You can optionally add a <code>codeSampleType</code> which, again, is a free-text field.</p>

<p>The <code>&lt;meta&gt;</code> items are only viewable to machines. You could also show them to the user if you wanted, using a <code>&lt;span&gt;</code> or other suitable element.</p>
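
<p>To get a feel for what a machine could pull out of that markup, here's a simplified Python sketch using the standard library's <code>html.parser</code>. The class name <code>ItemPropParser</code> is hypothetical, and it assumes a single, flat item like the one above - it is not a full Microdata parser, and it is not how the Schema.org validator works.</p>

<pre><code class="language-python">from html.parser import HTMLParser


class ItemPropParser(HTMLParser):
    """Collect itemprop values from a single, flat microdata item."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._open = []  # stack of (tag, itemprop name or None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if tag == "meta":
            # meta has no closing tag: its value lives in the content attribute.
            if name:
                self.props[name] = attrs.get("content", "")
            return
        self._open.append((tag, name))
        if name:
            self.props[name] = ""

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            self._open.pop()

    def handle_data(self, data):
        # Text counts towards every open element that carries an itemprop.
        for _, name in self._open:
            if name:
                self.props[name] += data


SNIPPET = """
&lt;pre itemscope itemtype="https://schema.org/SoftwareSourceCode"&gt;
    &lt;span itemprop="programmingLanguage"&gt;Python&lt;/span&gt;
    &lt;meta itemprop="codeSampleType" content="Example"&gt;
    &lt;code itemprop="text"&gt;
        print( "Hello, world!" )
    &lt;/code&gt;
&lt;/pre&gt;
"""

parser = ItemPropParser()
parser.feed(SNIPPET)
parser.close()

# Trim the indentation that surrounds the text values.
props = {name: value.strip() for name, value in parser.props.items()}
print(props)
</code></pre>

<p>Here <code>props</code> ends up with <code>programmingLanguage</code> set to <code>Python</code>, <code>codeSampleType</code> set to <code>Example</code>, and <code>text</code> holding the code itself.</p>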

<h2 id="alternatives"><a href="https://shkspr.mobi/blog/2024/08/what-programming-language-is-in-this-code-block/#alternatives">Alternatives</a></h2>

<p>It is possible to define <a href="https://www.rfc-editor.org/rfc/rfc5646#section-2.2.6">private subtags of languages</a>, for example <code>en-x-python</code> - which could mean "Comments written in English, using the private Python extension". Or even just <code>x-python</code>. That then leads on to how you describe a language: <a href="https://www.iana.org/assignments/media-types/application/vnd.acucobol">COBOL has a MIME type</a>, but not all languages do.  There are some unofficial ones like <a href="https://mimetype.io/text/x-python"><code>text/x-python</code></a> though.</p>
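
<p>For illustration, this is roughly how a script might separate the private-use part from the rest of a tag. It's a minimal Python sketch which relies on the <code>x</code> singleton introducing private-use subtags in BCP 47. The function name <code>split_private_use</code> is made up for this example, and it does no validation of the tag at all.</p>

<pre><code class="language-python">def split_private_use(tag):
    """Split a language tag into (public part, list of private-use subtags)."""
    subtags = tag.split("-")
    # Tags are case-insensitive, so look for the marker in lower case.
    lowered = [subtag.lower() for subtag in subtags]
    if "x" not in lowered:
        return tag, []
    marker = lowered.index("x")
    # Everything after the first "x" is private use.
    return "-".join(subtags[:marker]), subtags[marker + 1:]


print(split_private_use("en-x-python"))
print(split_private_use("x-python"))
print(split_private_use("en-GB"))
</code></pre>

<p>Note that <code>python</code> only means something here by private agreement - nothing else reading the page is obliged to understand it.</p>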

<p>But, of course, <a href="https://sonomu.club/@threedaymonk/112970802651966367">programming languages aren't really languages</a> - so using <code>lang</code> probably isn't suitable.</p>

<p>A <a href="https://developer.mozilla.org/en-US/docs/Learn/HTML/Howto/Use_data_attributes"><code>data-</code> attribute</a> might also work. Adding <code>data-code="python"</code> would allow CSS to style specific code blocks. But data attributes are private to a page, and generally aren't standardised.</p>

<p>I <em>think</em> this is a gap in the specification. I think there ought to be a <code>code-lang</code> attribute or similar. Perhaps something like:</p>

<pre><code class="language-HTML">&lt;code code-lang="python;3.6"&gt;
    print( "Hello, world!" )
&lt;/code&gt;
</code></pre>

<p>Which could allow authors to semantically give the name - and possibly version - of the programming language they are writing in.</p>
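
<p>To be clear, <code>code-lang</code> doesn't exist - it is only a suggestion. But if a value shaped like the one above were adopted, reading it could be as simple as this Python sketch. The function name <code>parse_code_lang</code> and the semicolon-separated format are assumptions taken from that single example.</p>

<pre><code class="language-python">def parse_code_lang(value):
    """Split a hypothetical code-lang value into (name, version or None)."""
    # partition() copes with a missing ";" by leaving the version empty.
    name, _, version = value.partition(";")
    return name.strip(), version.strip() or None


print(parse_code_lang("python;3.6"))
print(parse_code_lang("python"))
</code></pre>

<p>The first call gives the name <code>python</code> with the version <code>3.6</code>, and the second gives <code>python</code> with no version.</p>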

<p>Thoughts?</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2024/08/what-programming-language-is-in-this-code-block/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
	</channel>
</rss>
