<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>metadata &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/metadata/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Wed, 29 Oct 2025 11:38:33 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>metadata &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[Fixing "Date/time not in ISO 8601 format" in Google Search Console]]></title>
		<link>https://shkspr.mobi/blog/2025/12/fixing-date-time-not-in-iso-8601-format-in-google-search-console/</link>
					<comments>https://shkspr.mobi/blog/2025/12/fixing-date-time-not-in-iso-8601-format-in-google-search-console/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 24 Dec 2025 12:34:43 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[schema.org]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=62176</guid>

					<description><![CDATA[I like using microdata within my HTML to provide semantic metadata.  One of my pages had this scrap of code on it:  &#60;time    itemprop=&#34;datePublished&#34;    itemscope    datetime=&#34;2025-06-09T11:27:06+01:00&#34;&#62;9 June 2025 11:27&#60;/time&#62;   The Google Search Console was throwing this error:    I was fairly sure that was a valid ISO 8601 string. It certainly matched the description in the Google…]]></description>
										<content:encoded><![CDATA[<p>I like using microdata within my HTML to provide semantic metadata.  One of my pages had this scrap of code on it:</p>

<pre><code class="language-html">&lt;time
   itemprop="datePublished"
   itemscope
   datetime="2025-06-09T11:27:06+01:00"&gt;9 June 2025 11:27&lt;/time&gt;
</code></pre>

<p>The Google Search Console was throwing this error:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/07/Datetime-not-in-ISO-8601-format-in-field-datePublished.webp" alt="Date/time not in ISO 8601 format in field 'datePublished' Items with this issue are invalid. Invalid items are not eligible for Google Search's rich results" width="690" height="180" class="aligncenter size-full wp-image-62177">

<p>I was fairly sure that was a valid ISO 8601 string. It certainly matched <a href="https://developers.google.com/search/docs/appearance/structured-data/discussion-forum#microdata">the description in the Google documentation</a>. Nevertheless, I fiddled with a few different formats, but all failed.</p>

<p>On <a href="https://support.google.com/webmasters/thread/359976663/iso8601-string-not-validating?msgid=360727451#">the advice</a> of <a href="https://www.nearby.org.uk/">Barry Hunter</a>, I tried changing the <code>datetime</code> attribute to <code>content</code>. That also didn't work.</p>

<p>Then I looked closely at the code.</p>

<p>The issue is the <code>itemscope</code>. Removing that allowed the code to pass validation.  But why?</p>

<p>Here's what <a href="https://schema.org/docs/gs.html#microdata_itemscope_itemtype">the Schema.org documentation</a> says:</p>

<blockquote><p>By adding itemscope, you are specifying that the HTML contained in the block is about a particular item.</p></blockquote>

<p>The <a href="https://html.spec.whatwg.org/multipage/microdata.html#attr-itemscope">HTML specification</a> gives this example:</p>

<pre><code class="language-html">&lt;div itemscope&gt;
   &lt;img itemprop="image" src="google-logo.png" alt="Google"&gt;
&lt;/div&gt;
</code></pre>

<p>Here, the value of the <code>image</code> property is taken from the element itself - in this case, <code>google-logo.png</code>. So what's the problem with the <code>time</code> example?</p>

<p>Well, <code>&lt;img&gt;</code> is a <em>void</em> element. It doesn't have any HTML content - so the metadata is taken from the <code>src</code> attribute.</p>

<p>But <code>&lt;time&gt;</code> is <em>not</em> a void element. It <em>does</em> contain HTML. So something like this would be valid:</p>

<pre><code class="language-html">&lt;time
   itemprop="datePublished"
   itemscope
&gt;2025-06-09T11:27:06+01:00&lt;/time&gt;
</code></pre>

<p>The text contained by the element is a valid ISO 8601 string.</p>

<p>My choice was either to present the ISO 8601 string to anyone viewing the page, or simply to remove the <code>itemscope</code>.  So I chose the latter.</p>
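
<p>For reference, here is a sketch of the corrected markup. It is the original snippet with only the <code>itemscope</code> attribute removed, so the value of <code>datePublished</code> comes from the <code>datetime</code> attribute while readers still see the friendly text.</p>

<pre><code class="language-html">&lt;time
   itemprop="datePublished"
   datetime="2025-06-09T11:27:06+01:00"&gt;9 June 2025 11:27&lt;/time&gt;
</code></pre>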
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=62176&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/12/fixing-date-time-not-in-iso-8601-format-in-google-search-console/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Is IPA furigana a bad idea?]]></title>
		<link>https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/</link>
					<comments>https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 10 Oct 2024 11:34:17 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[schema.org]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=53099</guid>

					<description><![CDATA[My name is Terence(/ˈtɛɹəns) Eden(ˈiːdən/).  Modern HTML allows the user to use &#60;ruby&#62; to annotate text.  This is usually used for furigana - which allows pronunciation to be placed above words.  For example: &#34;シン・ゴジラ (Shin Godzilla)&#34; shows you how to pronounce both words if you are unfamiliar with kanji.  The text can be any language or use any characters. In Japanese, it is quite often used to sh…]]></description>
										<content:encoded><![CDATA[<p>My name is <ruby><rb>Terence</rb><rp>(</rp><rt>/ˈtɛɹəns</rt><rp>)</rp><rb> Eden</rb><rp>(</rp><rt>ˈiːdən/</rt><rp>)</rp></ruby>.</p>

<p>Modern HTML allows the user to use <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ruby"><code>&lt;ruby&gt;</code></a> to annotate text.</p>

<p>This is <em>usually</em> used for <a href="https://en.wikipedia.org/wiki/Furigana">furigana</a> - which allows pronunciation to be placed above words.</p>

<p>For example: "<ruby>シン・ゴジラ <rp>(</rp><rt>Shin Godzilla</rt><rp>)</rp></ruby>" shows you how to pronounce both words if you are unfamiliar with <a href="https://en.wikipedia.org/wiki/Kanji">kanji</a>.  The text can be any language or use any characters. In Japanese, it is quite often used to show phonetic pronunciation using <a href="https://en.wikipedia.org/wiki/Hiragana">hiragana</a>.</p>

<p>Because English is a <a href="https://www.reddit.com/r/linguistics/comments/anzy8o/the_common_saying_i_see_on_reddit_is_english_is_3/">composite language</a><sup id="fnref:whore"><a href="https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/#fn:whore" class="footnote-ref" title="Or a mongrel language" role="doc-noteref">0</a></sup>, it isn't always easy for people to pronounce words<sup id="fnref:vid"><a href="https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/#fn:vid" class="footnote-ref" title="Yes, I've seen that funny Tiktok. And that one." role="doc-noteref">1</a></sup>.</p>

<p>So I have abused(?) the ruby syntax to show the <a href="https://en.wikipedia.org/wiki/International_Phonetic_Alphabet">International Phonetic Alphabet</a> above the English words.</p>
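
<p>As a rough sketch, the markup looks something like the following. This is a simplified form which leaves out the <code>&lt;rb&gt;</code> wrappers used in the sentence at the top of this post; the <code>&lt;rp&gt;</code> elements supply fallback parentheses for software which doesn't support ruby.</p>

<pre><code class="language-html">&lt;ruby&gt;
   Terence &lt;rp&gt;(&lt;/rp&gt;&lt;rt&gt;/ˈtɛɹəns&lt;/rt&gt;&lt;rp&gt;)&lt;/rp&gt;
   Eden &lt;rp&gt;(&lt;/rp&gt;&lt;rt&gt;ˈiːdən/&lt;/rt&gt;&lt;rp&gt;)&lt;/rp&gt;
&lt;/ruby&gt;
</code></pre>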

<p>Is this a <em>good</em> idea? Is it a valid use of the syntax? Is it semantically correct? I don't know. But I do now know that it is <em>possible</em>.</p>

<p>I doubt the majority of people know the IPA, so it is of dubious use. It does make my name's pronunciation more apparent to machines.</p>

<p>An alternative is to use <a href="https://schema.org">Schema.org</a>. For example, my <a href="https://edent.tel">contact page</a> has the following microdata:</p>

<pre><code class="language-html">&lt;main itemscope itemtype="https://schema.org/Person"&gt;
    &lt;header itemprop="name"&gt;
        &lt;h1&gt;
            &lt;span itemprop="givenName"&gt;Terence&lt;/span&gt;&amp;nbsp;
            &lt;span itemprop="familyName"&gt;Eden&lt;/span&gt;
            &lt;audio id="audioPlayer" src="Terence_Eden.mp3" 
                    itemscope itemprop="additionalType" 
                    itemtype="https://schema.org/PronounceableText"&gt;
                &lt;meta itemprop="phoneticText"       content="/ˈtɛɹəns ˈiːdən/"&gt;
                &lt;meta itemprop="inLanguage"         content="en-GB"&gt;
                &lt;meta itemprop="textValue"          content="Terence Eden"&gt;
                &lt;meta itemprop="speechToTextMarkup" content="IPA"&gt;
            &lt;/audio&gt;
</code></pre>

<p>That allows humans to listen to the pronunciation of my name, and machines to see the IPA version.</p>

<p>Is there a better, more accessible, more useful way of encoding how to pronounce text?</p>

<div id="footnotes" role="doc-endnotes">
<hr aria-label="Footnotes">
<ol start="0">

<li id="fn:whore">
<p>Or a <a href="https://en.wikiquote.org/wiki/James_Nicoll">mongrel language</a>&nbsp;<a href="https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/#fnref:whore" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:vid">
<p>Yes, I've seen that funny Tiktok. And that one.&nbsp;<a href="https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/#fnref:vid" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

</ol>
</div>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=53099&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2024/10/is-ipa-furigana-a-bad-idea/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[WebMentions, Privacy, and DDoS - Oh My!]]></title>
		<link>https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/</link>
					<comments>https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Tue, 29 Nov 2022 12:34:15 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[mastodon]]></category>
		<category><![CDATA[MastodonAPI]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[NaBloPoMo]]></category>
		<category><![CDATA[ogp]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=44259</guid>

					<description><![CDATA[Mastodon - the distributed social network - has two interesting challenges when it comes to how users share links.  I&#039;d like to discuss those issues and suggest a possible way forward.  When you click on a link on my website which takes you to another website, your browser sends a Referer. This says to the other site &#34;Hey, I came here using a link on shkspr.mobi&#34;.  This is useful because it lets…]]></description>
										<content:encoded><![CDATA[<p>Mastodon - the distributed social network - has two interesting challenges when it comes to how users share links.  I'd like to discuss those issues and suggest a possible way forward.</p>

<p>When you click on a link on my website which takes you to another website, your browser sends a <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer">Referer</a><sup id="fnref:splel"><a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#fn:splel" class="footnote-ref" title="This is a spleling mistake which is part of the specification so cannot be changed." role="doc-noteref">0</a></sup>. This says to the other site "Hey, I came here using a link on <code>shkspr.mobi</code>".  This is useful because it lets a site owner know who is linking to them.  I <em>love</em> seeing which weird and wonderful sites have linked to my content.</p>

<p>It is also something of a privacy nightmare as it lets sites see who is clicking and from where they're clicking. So Mastodon sets the <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Link_types/noreferrer"><code>noreferrer</code></a><sup id="fnref:spell"><a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#fn:spell" class="footnote-ref" title="This one is spelled correctly. Which makes life confusing for all involved." role="doc-noteref">1</a></sup> link type, via the <code>rel</code> attribute, on all links. This tells the browser not to send the Referer.</p>

<p>This means sites no longer know <em>who</em> is sending them traffic.</p>
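
<p>As an illustration, a link carrying that hint looks roughly like this. The URL is a placeholder, and the full set of <code>rel</code> values which Mastodon emits may differ - this only shows the part which suppresses the Referer.</p>

<pre><code class="language-html">&lt;a href="https://example.com/" rel="noreferrer"&gt;https://example.com/&lt;/a&gt;
</code></pre>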

<iframe src="https://masto.ai/@stavvers/109420849116336339/embed" class="mastodon-embed" style="max-width: 100%; border: 0" width="400" height="650" allowfullscreen="allowfullscreen"></iframe>

<p>That's either a good thing from a privacy perspective or a disaster from a marketing perspective. Or a little bit of both.</p>

<p>Here's a related issue. When a user posts a link to your website on Mastodon, the server checks your page to see if there are any oEmbed tags for a rich link preview. But, at the moment, it doesn't check your website's <a href="https://developers.google.com/search/docs/crawling-indexing/robots/intro"><code>robots.txt</code></a> file - which lets it know whether it is <em>allowed</em> to scrape your content.</p>

<iframe src="https://mastodon.mit.edu/@jefftk/109416209502343043/embed" class="mastodon-embed" style="max-width: 100%; border: 0" width="400" height="400" allowfullscreen="allowfullscreen"></iframe>

<p>In the case of something like Twitter or Facebook, this is fine. If a million users post a link, the centralised social network checks the link <em>once</em> and caches the result.</p>

<p>With - potentially - thousands of distributed Mastodon sites, this presents a problem. If a popular account posts a link, their instance fetches a rich preview. Then <em>every</em> instance which has users following them also requests that URL.  Essentially, this is a DDoS attack.</p>

<h2 id="i-can-fix-you"><a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#i-can-fix-you">I can fix you</a></h2>

<p>So here are my thoughts on how to fix this.</p>

<p>When a user posts a link to Mastodon, their instance should send a <a href="https://indieweb.org/Webmention">WebMention</a> to the site hosting the link.  This informs the website that someone has shared their content.  Perhaps a user could adjust their privacy settings to allow or deny this.</p>
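
<p>For context, a site typically advertises where it accepts WebMentions with a <code>&lt;link&gt;</code> element in its HTML, which is how an instance would know where to send the notification. A minimal sketch - the endpoint URL here is a made-up placeholder:</p>

<pre><code class="language-html">&lt;link rel="webmention" href="https://example.com/webmention-endpoint"&gt;
</code></pre>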

<p>The instance would check the site's <code>robots.txt</code> and, if allowed, scrape the site to see if there were any <a href="https://shkspr.mobi/blog/2022/11/is-open-graph-protocol-dead/">Open Graph Protocol</a> metadata elements on it.</p>
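
<p>Those metadata elements are <code>&lt;meta&gt;</code> tags in the page's <code>&lt;head&gt;</code>. A sketch of what the instance might find, using placeholder values and a hypothetical image URL:</p>

<pre><code class="language-html">&lt;meta property="og:title"       content="My amazing site"&gt;
&lt;meta property="og:image"       content="https://example.com/preview.jpg"&gt;
&lt;meta property="og:description" content="A long description. Perhaps the first paragraph of the text."&gt;
</code></pre>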

<p>That metadata should be <em>included</em> in the post as it is shared across the network.</p>

<p>For example, a status could look like this:</p>

<pre><code class="language-json">{
  "id": "123",
  "created_at": "2022-03-16T14:44:31.580Z",
  "in_reply_to_id": null,
  "in_reply_to_account_id": null,
  "visibility": "public",
  "language": "en",
  "uri": "https://mastodon.social/users/Edent/statuses/123",
  "content": "&lt;p&gt;Check out https://example.com/&lt;/p&gt;",
  "ogp_allowed": true,
  "ogp": {
      "og:title": "My amazing site",
      "og:image:url": "https://cdn.mastodon.social/cache/example.com/preview.jpg",
      "og:description": "A long description. Perhaps the first paragraph of the text."
      ...
   }
   ...
}
</code></pre>

<p>When a post is boosted across the network, the instances can see that there is rich metadata associated with the link. If there is an image associated with the post, that will be loaded from the cache on the original Mastodon instance - avoiding overloading the website.</p>

<p>Now, there is a flaw in this idea. A <em>malicious</em> Mastodon server could serve up a fake OGP image and description. So a link to McDonald's might display a fake image promoting Burger King.</p>

<p>To protect against this, a receiving instance could randomly or periodically check the OGP metadata that they receive. If it has been changed, they can update it.</p>

<p>Perhaps a diagram would help?</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2022/11/Mastodon-OGP-Diagram.png" alt="Crappy line drawing explaining the above." width="787" height="416" class="aligncenter size-full wp-image-44270">

<h2 id="what-other-people-say-about-the-problem"><a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#what-other-people-say-about-the-problem">What other people say about the problem</a></h2>

<div class="activitypub-embed u-in-reply-to h-cite"> <div class="activitypub-embed-header p-author h-card"> <img class="u-photo" src="https://asset.circumstances.run/accounts/avatars/109/330/846/558/995/088/original/9aae78ca8a673cb2.png" alt=""> <div class="activitypub-embed-header-text"> <h2 class="p-name" id="david-gerard"><a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#david-gerard">David Gerard</a></h2> <a href="https://circumstances.run/users/davidgerard" class="ap-account u-url">@davidgerard@circumstances.run</a> </div> </div> <div class="activitypub-embed-content"> <div class="ap-subtitle p-summary e-content"><p>yes, you should put a cache in front of a blog. nginx and wp-supercache do well. but.</p><p>mastodon's auto-DDOS feature is still obnoxious. and in a social network, technically designed in obnoxiousness is incompetent.</p><p>i realise it'd need extension of activitypub, but is anyone working on sending prerendered cards with the URL? just to save 1000 servers hammering the URL to generate their own cards locally.</p></div> </div> <div class="activitypub-embed-meta"> <a href="https://circumstances.run/users/davidgerard/statuses/109421964176048304" class="ap-stat ap-date dt-published u-in-reply-to">2022-11-28, 14:44</a> <span class="ap-stat"> <strong>7</strong> boosts </span> <span class="ap-stat"> <strong>23</strong> favorites </span> </div> </div>

<style>/** * ActivityPub embed styles. */ .activitypub-embed { background: #fff; border: 1px solid #e6e6e6; border-radius: 12px; padding: 0; max-width: 100%; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif; } .activitypub-reply-block .activitypub-embed { margin: 1em 0; } .activitypub-embed-header { padding: 15px; display: flex; align-items: center; gap: 10px; } .activitypub-embed-header img { width: 48px; height: 48px; border-radius: 50%; } .activitypub-embed-header-text { flex-grow: 1; } .activitypub-embed-header-text h2 { color: #000; font-size: 15px; font-weight: 600; margin: 0; padding: 0; } .activitypub-embed-header-text .ap-account { color: #687684; font-size: 14px; text-decoration: none; } .activitypub-embed-content { padding: 0 15px 15px; } .activitypub-embed-content .ap-title { font-size: 23px; font-weight: 600; margin: 0 0 10px; padding: 0; color: #000; } .activitypub-embed-content .ap-subtitle { font-size: 15px; color: #000; margin: 0 0 15px; } .activitypub-embed-content .ap-preview { border: 1px solid #e6e6e6; border-radius: 8px; overflow: hidden; } .activitypub-embed-content .ap-preview img { width: 100%; height: auto; display: block; } .activitypub-embed-content .ap-preview { border-radius: 8px; box-sizing: border-box; display: grid; gap: 2px; grid-template-columns: 1fr 1fr; grid-template-rows: 1fr 1fr; margin: 1em 0 0; min-height: 64px; overflow: hidden; position: relative; width: 100%; } .activitypub-embed-content .ap-preview.layout-1 { grid-template-columns: 1fr; grid-template-rows: 1fr; } .activitypub-embed-content .ap-preview.layout-2 { aspect-ratio: auto; grid-template-rows: 1fr; height: auto; } .activitypub-embed-content .ap-preview.layout-3 > img:first-child { grid-row: span 2; } .activitypub-embed-content .ap-preview img { border: 0; box-sizing: border-box; display: inline-block; height: 100%; object-fit: cover; overflow: hidden; position: relative; width: 100%; } .activitypub-embed-content 
.ap-preview video, .activitypub-embed-content .ap-preview audio { max-width: 100%; display: block; grid-column: 1 / span 2; } .activitypub-embed-content .ap-preview audio { width: 100%; } .activitypub-embed-content .ap-preview-text { padding: 15px; } .activitypub-embed-meta { padding: 15px; border-top: 1px solid #e6e6e6; color: #687684; font-size: 13px; display: flex; gap: 15px; } .activitypub-embed-meta .ap-stat { display: flex; align-items: center; gap: 5px; } @media only screen and (max-width: 399px) { .activitypub-embed-meta span.ap-stat { display: none !important; } } .activitypub-embed-meta a.ap-stat { color: inherit; text-decoration: none; } .activitypub-embed-meta strong { font-weight: 600; color: #000; } .activitypub-embed-meta .ap-stat-label { color: #687684; } </style>

<h2 id="feedback"><a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#feedback">Feedback?</a></h2>

<p>Is this a problem? Does this present a viable solution? Have I missed something obvious? Please leave a comment and let me know 😃</p>

<div id="footnotes" role="doc-endnotes">
<hr aria-label="Footnotes">
<ol start="0">

<li id="fn:splel">
<p>This is a spleling mistake which is part of the specification so cannot be changed.&nbsp;<a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#fnref:splel" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:spell">
<p>This one <em>is</em> spelled correctly. Which makes life confusing for all involved.&nbsp;<a href="https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/#fnref:spell" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

</ol>
</div>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=44259&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2022/11/webmentions-privacy-and-ddos-oh-my/feed/</wfw:commentRss>
			<slash:comments>16</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Is Open Graph Protocol dead?]]></title>
		<link>https://shkspr.mobi/blog/2022/11/is-open-graph-protocol-dead/</link>
					<comments>https://shkspr.mobi/blog/2022/11/is-open-graph-protocol-dead/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sun, 06 Nov 2022 12:34:49 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[meta]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[ogp]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[twitter]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=43622</guid>

					<description><![CDATA[Facebook Meta - like many other tech titans - has institutional Shiny Object Syndrome.   It goes something like this:   Launch a product to great fanfare Spend a few years hyping it as ✨the future✨ Stop answering emails and pull requests If you&#039;re lucky, announce that the product is abandoned but, more likely, just forget about it.   Open Graph Protocol (OGP) is one of those products. The val…]]></description>
										<content:encoded><![CDATA[<p><del>Facebook</del> Meta - like many other tech titans - has institutional <a href="https://en.wikipedia.org/wiki/Shiny_object_syndrome">Shiny Object Syndrome</a>.   It goes something like this:</p>

<ol>
<li>Launch a product to great fanfare</li>
<li>Spend a few years hyping it as ✨the future✨</li>
<li>Stop answering emails and pull requests</li>
<li>If you're lucky, announce that the product is abandoned but, more likely, just forget about it.</li>
</ol>

<p>Open Graph Protocol (OGP) is one of those products. The value-proposition is simple.</p>

<ul>
<li>It's <em>hard</em> for computers to pick out the main headline, image, and other data from a complex web page.</li>
<li>Therefore, let's encourage websites to include metadata which tells our services what they should look at!</li>
</ul>

<p>OGP works pretty well! When you share a link on Facebook, or Twitter, or Telegram - those services load the website in the background, look for OGP metadata, and display a friendly snippet.</p>

<p><del>Facebook</del> Meta were the driving force behind OGP - and have now left it to fester.</p>

<ul>
<li>The website - <a href="https://ogp.me/">https://ogp.me/</a> - still works.</li>
<li>But the <a href="https://www.facebook.com/groups/opengraph/">Facebook OGP  Discussion Group</a> is now full of spam.</li>
<li>The <a href="https://groups.google.com/g/open-graph-protocol?pli=1">Developer Mailing List</a> is broken.</li>
<li>The <a href="https://developers.google.com/+/web/+1button/#plus-snippet">Google Documentation</a> links to a dead Google+ page.</li>
<li>And the <a href="https://github.com/facebookarchive/open-graph-protocol">GitHub Page</a> has been archived.</li>
</ul>

<h2 id="is-ogp-finished"><a href="https://shkspr.mobi/blog/2022/11/is-open-graph-protocol-dead/#is-ogp-finished">Is OGP finished?</a></h2>

<p>And, that might be fine. <del>Facebook</del> Meta are a small company with limited resources. They can't afford to fund standards work indefinitely. And, anyway, OGP is complete, right? It has all the tags that anyone could ever possibly want. Why does it need any improving?</p>

<p>Well, that's not the case. We know, for example, that Twitter have created <a href="https://developer.twitter.com/en/docs/twitter-for-websites/cards/overview/markup">their own proprietary OGP-like meta tags</a>. Similarly, <a href="https://help.pinterest.com/en-gb/business/article/rich-pins">Pinterest have their own as well</a>. And even <a href="https://search.google.com/test/rich-results">Google are going their own way with Rich Snippets</a>.</p>

<p>This is annoying for developers. Now we have to write <em>multiple</em> different bits of metadata if we want our links to be supported on all platforms.</p>
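
<p>To make that concrete, here is a sketch of the duplication for a single page. The values are placeholders; note that OGP uses the <code>property</code> attribute whereas Twitter's card markup uses <code>name</code>.</p>

<pre><code class="language-html">&lt;!-- Open Graph Protocol --&gt;
&lt;meta property="og:title"       content="My amazing site"&gt;
&lt;meta property="og:description" content="A short description of the page."&gt;
&lt;meta property="og:image"       content="https://example.com/preview.jpg"&gt;

&lt;!-- Twitter's own card markup --&gt;
&lt;meta name="twitter:card"        content="summary"&gt;
&lt;meta name="twitter:title"       content="My amazing site"&gt;
&lt;meta name="twitter:description" content="A short description of the page."&gt;
&lt;meta name="twitter:image"       content="https://example.com/preview.jpg"&gt;
</code></pre>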

<p>Standards work is never "finished". Developers <em>want</em> to add new features. Users <em>want</em> to interact with new forms of content.</p>

<p>Tomorrow someone is going to invent a way to share smells over the Internet. How does that get represented in an Open Graph Protocol compliant manner?</p>

<p><code>&lt;meta property="twitter:olfactory" content="C₃H₆S"&gt;</code> or
<code>&lt;meta property="facebook:nose"     content="InChIKey/MWOOGOJBHIARFG-UHFFFAOYSA-N"&gt;</code> or
<code>&lt;meta property="og:smell"          content="pumpkin spice"&gt;</code> or...</p>

<p>We know from bitter experience that having several mutually incompatible ways to implement something is a nightmare for developers and provides a poor user-experience.</p>

<p>So we create standards bodies. They're not perfect, but a group of interested folks can do the hard work to try and satisfy oppositional stakeholders.</p>

<p>This is my plea to <del>Facebook</del> Meta. If you're no longer interested in improving OGP, OK. You do you. But hand it over to people who want to keep this going. Maybe it's the <a href="https://www.w3.org/">W3C</a>, or <a href="https://indieweb.org/The-Open-Graph-protocol">IndieWeb</a>, or <a href="https://schema.org">Schema.org</a> or <em>someone</em>.  Hell, I'm not busy, I'll take it on.</p>

<p>Remember, if you love something, let it go.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=43622&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2022/11/is-open-graph-protocol-dead/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Semantic Comments for WordPress]]></title>
		<link>https://shkspr.mobi/blog/2022/04/semantic-comments-for-wordpress/</link>
					<comments>https://shkspr.mobi/blog/2022/04/semantic-comments-for-wordpress/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 28 Apr 2022 11:34:13 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[HTML5]]></category>
		<category><![CDATA[meta]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[schema.org]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[WordPress]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=42412</guid>

					<description><![CDATA[As regular readers will know, I love adding Semantic things to my blog.  The standard WordPress comments HTML isn&#039;t very semantic - so I thought I&#039;d change that. Here&#039;s some code which you can add to your blog&#039;s theme - and an explanation of how it works.  The aim is to end up with some HTML which looks like this (edited for brevity):  &#60;li itemscope itemtype=&#34;https://schema.org/Comment&#34;…]]></description>
										<content:encoded><![CDATA[<p>As regular readers will know, I love <a href="https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/">adding Semantic things to my blog</a>.</p>

<p>The standard WordPress comments HTML isn't very semantic - so I thought I'd change that. Here's some code which you can add to your blog's theme - and an explanation of how it works.</p>

<p>The aim is to end up with some HTML which looks like this (edited for brevity):</p>

<pre><code class="language-HTML5">&lt;li itemscope itemtype="https://schema.org/Comment" itemid="#246827"&gt;
    &lt;article&gt;
        &lt;time datetime="2022-04-12T10:22:32+01:00" itemprop="dateCreated"&gt;
            &lt;a href="https://shkspr.mobi/blog/2022/04/semantic-comments-for-wordpress/#comment-246827" itemprop='url'&gt;22-04-12 10:22&lt;/a&gt;
        &lt;/time&gt;
        &lt;div itemprop="https://schema.org/author" itemscope itemtype="https://schema.org/Person"&gt;
            &lt;img itemprop="image" alt='' src='logo.jpg' height='64' width='64'/&gt;
            &lt;span itemprop="name"&gt;
                &lt;a itemprop="url" href="https://twitter.com/example"&gt;Commenter's Name&lt;/a&gt;&lt;/span&gt; says:
        &lt;/div&gt;
        &lt;div itemprop="text"&gt;This is the text of my comment&lt;/div&gt;
    &lt;/article&gt;
&lt;/li&gt;
</code></pre>

<p>Which will be interpreted as:
<img src="https://shkspr.mobi/blog/wp-content/uploads/2022/04/Schema-Markup-validator.png" alt="A schema tree showing all the properties." width="916" height="310" class="aligncenter size-full wp-image-42415"></p>

<p>This adds <code>&lt;time&gt;</code> elements as well as Schema.org microdata.</p>

<h2 id="howto"><a href="https://shkspr.mobi/blog/2022/04/semantic-comments-for-wordpress/#howto">Howto</a></h2>

<p>In <code>comments.php</code> you'll see something like this:</p>

<pre><code class="language-php">&lt;ol class="comment-list"&gt;
    &lt;?php
        wp_list_comments( array(
            'style'       =&gt; 'ol',
            'short_ping'  =&gt; true,
            'avatar_size' =&gt; 64,
        ) );
    ?&gt;
&lt;/ol&gt;
</code></pre>

<p>You need to add a new <code>callback</code>. In this case, I've called it <code>my_comments_walker</code>:</p>

<pre><code class="language-php">&lt;ol class="comment-list"&gt;
    &lt;?php
        wp_list_comments( array(
            'style'       =&gt; 'ol',
            'short_ping'  =&gt; true,
            'avatar_size' =&gt; 64,
            'callback' =&gt; 'my_comments_walker',
        ) );
    ?&gt;
&lt;/ol&gt;
</code></pre>

<p>You can read more about <a href="https://developer.wordpress.org/reference/classes/walker/">WordPress Walkers on their documentation page</a>.</p>

<p>Now that's done, you need to create a function in your <code>functions.php</code> file.  I added this to the end of my file:</p>

<pre><code class="language-php">function my_comments_walker() {

    //  Basic comment data
    $comment_id          = get_comment_id();
    $comment             = get_comment( $comment_id );

    //  Date the comment was submitted
    $comment_date        = get_comment_date( "c" );
    //  In slightly more human-readable format
    $comment_date_human  = get_comment_date( "y-m-d H:i" );

    //  Author Details
    $comment_author      = get_comment_author();

    //  Author's URL if they've added one
    $comment_author_url  = get_comment_author_url();

    //  If there's an Author URL, link it
    if ($comment_author_url != null) {
        $comment_author_name = "&lt;a itemprop='url' href='{$comment_author_url}' rel='external nofollow ugc' class='url'&gt;{$comment_author}&lt;/a&gt;";
    } else {
        $comment_author_name = "{$comment_author}";
    }

    //  Provide a link to the comment anchor
    $comment_url_link = "&lt;a href='#comment-{$comment_id}' itemprop='url'&gt;{$comment_date_human}&lt;/a&gt;";

    //  Author's Avatar based on ID
    //  As per https://developer.wordpress.org/reference/functions/get_avatar/ both alt &amp; default must be set
    $gravatar            = get_avatar( $comment, 64, "", "", array('extra_attr' =&gt; 'itemprop="image"') );

    //  Comment needs newlines and links added
    $comment_text        = apply_filters( 'comment_text', get_comment_text(), $comment);


    //  The comment may have various classes. They are stored as an array
    $comment_classes     = get_comment_class();
    $comment_classes_text = "";
    foreach( $comment_classes as $class ) {
        $comment_classes_text .= $class . " ";
    }
    $comment_classes_text = trim($comment_classes_text);

    //  Link to open the reply box
    $comment_reply_link = get_comment_reply_link( [
                    'depth'     =&gt; 20,
                    'max_depth' =&gt; 100,
                    'before'    =&gt; '&lt;div class="reply"&gt;',
                    'after'     =&gt; '&lt;/div&gt;'
            ] );

    //  Write the comment HTML. No need for a closing &lt;/li&gt; as WP handles that.
    echo &lt;&lt;&lt;EOT
    &lt;li id="comment-$comment_id" itemscope itemtype="https://schema.org/Comment" itemid="#comment-$comment_id" class="$comment_classes_text"&gt;
        &lt;article class="comment-body" id="div-comment-$comment_id"&gt;
            &lt;time datetime="$comment_date" class="comment-meta commentmetadata" itemprop="dateCreated"&gt;
                $comment_url_link
            &lt;/time&gt;
            &lt;div class="comment-author vcard" itemprop="https://schema.org/author" itemscope itemtype="https://schema.org/Person"&gt;
                $gravatar
                &lt;span class="fn" itemprop="name"&gt;$comment_author_name&lt;/span&gt; &lt;span class="says"&gt;says:&lt;/span&gt;
            &lt;/div&gt;
            &lt;div itemprop="text" class="comment-text"&gt;$comment_text&lt;/div&gt;
            $comment_reply_link
        &lt;/article&gt;
    EOT;
}
</code></pre>

<p>There are a few extra classes and spans which I use. You can remove them if you like.</p>
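
<p>If you want to sanity check the two date formats outside of WordPress, here's a minimal Python sketch. It assumes the timestamp from the sample HTML above, and the variable names are made up for illustration - they aren't part of WordPress.</p>

<pre><code class="language-python">from datetime import datetime, timedelta, timezone

# Hypothetical comment timestamp, taken from the sample HTML (UTC+1)
comment_time = datetime(2022, 4, 12, 10, 22, 32, tzinfo=timezone(timedelta(hours=1)))

# Machine-readable ISO 8601 value for the datetime attribute (like PHP's "c")
machine_readable = comment_time.isoformat()

# Shorter human-readable text for the link (like PHP's "y-m-d H:i")
human_readable = comment_time.strftime("%y-%m-%d %H:%M")

print(machine_readable)  # 2022-04-12T10:22:32+01:00
print(human_readable)    # 22-04-12 10:22
</code></pre>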

<p>And that's it. All your comments will have individual semantic metadata. If you think anything else should be included, please let me know.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2022/04/semantic-comments-for-wordpress/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[How to add ISSN metadata to a web page]]></title>
		<link>https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/</link>
					<comments>https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 17 Sep 2021 11:09:05 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[HTML5]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[schema.org]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=40341</guid>

					<description><![CDATA[Inspired by John Hoare at the Dirty Feed blog - I&#039;ve asked the British Library to assign my blog an International Standard Serial Number (ISSN).  An ISSN is an 8-digit code used to identify newspapers, journals, magazines and periodicals of all kinds and on all media–print and electronic.    Why?  Shut up.  OK. It turns out that lots of people cite my blog in academic papers - so I wanted to make …]]></description>
										<content:encoded><![CDATA[<p>Inspired by John Hoare at the <a href="https://www.dirtyfeed.org/">Dirty Feed blog</a> - I've asked the British Library to assign my blog an <a href="https://www.issn.org/understanding-the-issn/what-is-an-issn/">International Standard Serial Number</a> (ISSN).</p>

<blockquote><p>An ISSN is an 8-digit code used to identify newspapers, journals, magazines and periodicals of all kinds and on all media–print and electronic.</p></blockquote>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2021/09/Screenshot-2021-09-07-at-10-21-40-Application-for-an-ISSN-terence-eden-shkspr-mobi-Shkspr-mobi-Mail.png" alt="Screenshot of an email from the British Library. Dear Terence Eden INTERNATIONAL STANDARD SERIAL NUMBER (ISSN) Thank you for your recent enquiry, we have assigned ISSN to the following publication(s): Terence Eden’s blog ISSN 2753-1570 ." width="867" height="310" class="aligncenter size-full wp-image-40345">

<h2 id="why"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#why">Why?</a></h2>

<p>Shut up.</p>

<p>OK. It turns out that lots of people <a href="https://shkspr.mobi/blog/citations">cite my blog in academic papers</a> - so I wanted to make it slightly easier for scholars of the future to use metadata to trace my vast influence on Human civilisation.</p>

<h2 id="how"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#how">How?</a></h2>

<p>I filled in <a href="https://www.bl.uk/help/get-an-isbn-or-issn-for-your-publication">a form on the British Library website</a>. Didn't cost me a penny. Was pretty quick!</p>

<h2 id="metadata"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#metadata">Metadata</a></h2>

<p>I can stick a bit of text at the bottom of each page with the ISSN - but that doesn't make it easily discoverable by automated tools. How can I make an ISSN machine readable?  There are a few ways.</p>

<h3 id="meta-elements"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#meta-elements">Meta Elements</a></h3>

<p>There is <a href="https://html.spec.whatwg.org/multipage/semantics.html#standard-metadata-names">a limited list of official <code>&lt;meta&gt;</code> names</a>. These are extensible, and <a href="https://scholar.google.com/intl/en/scholar/inclusion.html#indexing">Google Scholar recommends <code>citation_issn</code></a>, which is as simple as adding the following to your page's <code>&lt;head&gt;</code>:</p>

<pre><code class="language-html">&lt;meta name="citation_issn" content="1234-5678"&gt;
</code></pre>

<p>There are alternatives, though.</p>

<h3 id="schema-org"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#schema-org">Schema.org</a></h3>

<p>In recent years, <a href="https://schema.org/">Schema.org</a> has become the dominant form for representing metadata on the web. There are two ways you can implement it:</p>

<h4 id="json-ld"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#json-ld">JSON-LD</a></h4>

<p>JSON Linked Data involves adding a scrap of JSON, wrapped in a <code>&lt;script&gt;</code> element, to your HTML, like this:</p>

<pre><code class="language-HTML">&lt;script type="application/ld+json"&gt;
{
  "@context": "https://schema.org",
  "@type": "Blog",
  "issn": "1234-5678"
}
&lt;/script&gt;
</code></pre>
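
<p>If you'd rather generate that JSON than type it by hand, here's a minimal Python sketch. It only builds the JSON body; the ISSN is the same placeholder as above, and the variable names are my own invention.</p>

<pre><code class="language-python">import json

# Placeholder ISSN, matching the example above
blog_metadata = {
    "@context": "https://schema.org",
    "@type": "Blog",
    "issn": "1234-5678",
}

# This string is what would go inside the application/ld+json script element
json_ld = json.dumps(blog_metadata, indent=2)
print(json_ld)
</code></pre>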

<p>If you don't want to add a separate script, you can add the data inline using...</p>

<h4 id="microdata"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#microdata">Microdata</a></h4>

<p>The <a href="https://html.spec.whatwg.org/multipage/microdata.html">microdata specification</a> uses the exact same data as Schema.org - but allows you to add the data directly into the web page like this:</p>

<pre><code class="language-_">&lt;body itemscope itemtype="https://schema.org/Blog"&gt;
   ...
   ISSN &lt;span itemprop="issn"&gt;1234-5678&lt;/span&gt;
</code></pre>

<p>That's probably the <em>easiest</em> way to do it.</p>

<h3 id="links"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#links">Links</a></h3>

<p>The ISSN registry allows you to look up any ISSN with a simple URL. Mine is at <a href="https://portal.issn.org/resource/ISSN/2753-1570">https://portal.issn.org/resource/ISSN/2753-1570</a>.</p>
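
<p>The last character of an ISSN is a check digit, so it's possible to catch typos before building a link. Here's a rough Python sketch, assuming the standard weighting of 8 down to 2 over the first seven digits. The function names are hypothetical, and the URL pattern is simply copied from the link above.</p>

<pre><code class="language-python">def issn_is_valid(issn):
    """Check the final character of an ISSN against its check digit."""
    characters = issn.replace("-", "").upper()
    if len(characters) != 8 or not characters[:7].isdecimal():
        return False
    # Weights run from 8 down to 2 across the first seven digits
    total = sum(int(digit) * weight for digit, weight in zip(characters[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return characters[7] == expected


def issn_portal_url(issn):
    """Build the lookup URL, following the pattern of the link above."""
    return "https://portal.issn.org/resource/ISSN/" + issn


print(issn_is_valid("2753-1570"))    # True
print(issn_portal_url("2753-1570"))
</code></pre>

<p>Note that the <code>1234-5678</code> placeholder used in the examples on this page doesn't pass this check - it is only there to show where the number goes.</p>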

<h2 id="belt-and-braces"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#belt-and-braces">Belt and braces</a></h2>

<p>So, this is what I've ended up doing - cramming everything in all at once.</p>

<pre><code class="language-html">&lt;head&gt;
   ...
   &lt;meta name="citation_issn" content="1234-5678"&gt;
&lt;/head&gt;
&lt;body itemscope itemtype="https://schema.org/Blog"&gt;
   ...
   ISSN &lt;a href="https://portal.issn.org/resource/ISSN/1234-5678"&gt;&lt;span itemprop="issn"&gt;1234-5678&lt;/span&gt;&lt;/a&gt;
</code></pre>

<h2 id="any-other-ways"><a href="https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/#any-other-ways">Any other ways?</a></h2>

<p>What am I missing? Can someone smarter than I tell me that there's an easier / better / more interoperable way to do this?</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2021/09/how-to-add-issn-metadata-to-a-web-page/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Reducing GPS accuracy in photos]]></title>
		<link>https://shkspr.mobi/blog/2020/10/reducing-gps-accuracy-in-photos/</link>
					<comments>https://shkspr.mobi/blog/2020/10/reducing-gps-accuracy-in-photos/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 01 Oct 2020 11:56:50 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[exif]]></category>
		<category><![CDATA[gps]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[photography]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=36818</guid>

					<description><![CDATA[Here&#039;s a quick one-liner to reduce the precision of location stored in a photo&#039;s EXIF metadata:  exiftool -c &#34;%.2f&#34; -TagsFromFile @ -GPSLatitude -GPSLongitude photo.jpg   (Thanks to the EXIFtool Forum for their help.)  Why is this useful?  Modern phones automatically attach a GPS location to every photo you take. GPS resolution is around 10 metres. When you share your photos, you&#039;re often sharing …]]></description>
										<content:encoded><![CDATA[<p>Here's a quick one-liner to reduce the precision of location stored in a photo's EXIF metadata:</p>

<pre><code class="language-bash">exiftool -c "%.2f" -TagsFromFile @ -GPSLatitude -GPSLongitude photo.jpg
</code></pre>

<p>(Thanks to the <a href="https://exiftool.org/forum/index.php?topic=11651.new;topicseen#new">EXIFtool Forum</a> for their help.)</p>

<h2 id="why-is-this-useful"><a href="https://shkspr.mobi/blog/2020/10/reducing-gps-accuracy-in-photos/#why-is-this-useful">Why is this useful?</a></h2>

<p>Modern phones automatically attach a GPS location to every photo you take. GPS resolution is around 10 metres. When you share your photos, you're often sharing your <em>precise</em> location.</p>

<p>I wanted to upload some photos to the Wikimedia Commons of <a href="https://shkspr.mobi/blog/2020/09/opening-a-rediffusion-junction-box/">an interesting junction box installed in our home</a>.  I didn't want my home location stored on the Internet forever - but I thought it would be useful to include a <em>rough</em> location.</p>

<p>The above command takes a location of <code>51.123456,0.987654</code> and returns <code>51.12,0.99</code>.  That's good enough to roughly show the location, without revealing it exactly.</p>
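
<p>Here's a rough Python sketch of what two decimal places means in practice. It assumes printf-style formatting, which rounds rather than truncates, and a ballpark figure of 111.32 km per degree of latitude. Both are my assumptions, not something exiftool reports.</p>

<pre><code class="language-python">import math

latitude, longitude = 51.123456, 0.987654

# "%.2f" style formatting rounds to two decimal places
rounded_latitude = format(latitude, ".2f")
rounded_longitude = format(longitude, ".2f")
print(rounded_latitude, rounded_longitude)  # 51.12 0.99

# Rough size of a 0.01 degree step, assuming about 111.32 km per degree of latitude
KM_PER_DEGREE = 111.32
north_south_km = 0.01 * KM_PER_DEGREE
east_west_km = 0.01 * KM_PER_DEGREE * math.cos(math.radians(latitude))
print(round(north_south_km, 2), round(east_west_km, 2))
</code></pre>

<p>So each 0.01 degree step is roughly a kilometre north to south, and a bit less east to west at this latitude.</p>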
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2020/10/reducing-gps-accuracy-in-photos/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Adding Semantic Reviews / Rich Snippets to your WordPress Site]]></title>
		<link>https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/</link>
					<comments>https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/#respond</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sun, 12 Jul 2020 11:50:01 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[schema.org]]></category>
		<category><![CDATA[WordPress]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=35586</guid>

					<description><![CDATA[This is a real &#34;scratch my own itch&#34; post. I want to add Schema.org semantic metadata to the book reviews I write on my blog. This will enable &#34;rich snippets&#34; in search engines.  There are loads of WordPress plugins which do this. But where&#039;s the fun in that?! So here&#039;s how I quickly built it into my open source blog theme.  Screen options  First, let&#039;s add some screen options to the WordPress…]]></description>
										<content:encoded><![CDATA[<p>This is a real "scratch my own itch" post. I want to add <a href="https://schema.org/Review">Schema.org semantic metadata</a> to the book reviews I write on my blog. This will enable "rich snippets" in search engines.</p>

<p>There are <em>loads</em> of WordPress plugins which do this. But where's the fun in that?! So here's how I quickly built it into my <a href="https://gitlab.com/edent/blog-theme">open source blog theme</a>.</p>

<h2 id="screen-options"><a href="https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/#screen-options">Screen options</a></h2>

<p>First, let's add some <a href="https://make.wordpress.org/support/user-manual/getting-to-know-wordpress/screen-options/">screen options</a> to the WordPress editor screen.</p>

<p>This is what it will look like when done:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/06/Screenshot_2020-06-29-Edit-Post-‹-Terence-Eden’s-Blog-—-WordPress-fs8.png" alt="Simple interface for adding details." width="513" height="428" class="aligncenter size-full wp-image-35589">

<p>This is how to add a <a href="https://developer.wordpress.org/plugins/metadata/custom-meta-boxes/#adding-meta-boxes">custom metabox</a> to the editor screen:</p>

<pre><code class="language-php">//  Place this in functions.php
//  Display the box
function edent_add_review_custom_box()
{
   $screens = ['post'];
   foreach ($screens as $screen) {
      add_meta_box(
         'edent_review_box_id', // Unique ID
         'Book Review Metadata',    // Box title
         'edent_review_box_html',// Content callback, must be of type callable
         $screen                 // Post type
       );
   }
}
add_action('add_meta_boxes', 'edent_add_review_custom_box');
</code></pre>

<p>The contents of the box are bog standard HTML:</p>

<pre><code class="language-php">//  Place this in functions.php
//  HTML for the box
function edent_review_box_html($post)
{
    $review_data = get_post_meta(get_the_ID(), "_edent_book_review_meta_key", true);
    echo "&lt;table&gt;";

    $checked = "";
    if (isset($review_data["review"]) &amp;&amp; $review_data["review"] == "true") {
        $checked = "checked";
    }
    echo "&lt;tr&gt;&lt;td&gt;&lt;label for='edent_book_review'&gt;Embed Book Review:&lt;/label&gt;&lt;/td&gt;&lt;td&gt;&lt;input type=checkbox id=edent_book_review name=edent_book_review[review] value=true {$checked}&gt;&lt;/tr&gt;";

    echo "&lt;tr&gt;&lt;td&gt;&lt;label for='edent_rating'&gt;Rating:&lt;/label&gt;&lt;/td&gt;&lt;td&gt;&lt;input type=range id=edent_rating name=edent_book_review[rating] min=0 max=5 step=0.5 value='". esc_attr($review_data["rating"] ?? "") ."'&gt;&lt;/tr&gt;";

    echo "&lt;tr&gt;&lt;td&gt;&lt;label for=edent_isbn &gt;ISBN:&lt;/label&gt;&lt;/td&gt;&lt;td&gt;&lt;input name=edent_book_review[isbn]  id=edent_isbn type=text value='" . esc_attr($review_data["isbn"] ?? "") . "' autocomplete=off&gt;&lt;/tr&gt;";

    echo "&lt;/table&gt;";
}
</code></pre>

<p>Done! We now have a box for metadata. That data will be <code>POST</code>ed every time the blogpost is saved. But where do the data go?</p>

<h2 id="saving-data"><a href="https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/#saving-data">Saving data</a></h2>

<p>This function is run every time the blogpost is saved. If the checkbox has been ticked, the metadata are saved to the database. If the checkbox is unticked, the metadata are deleted.</p>

<pre><code class="language-php">//  Place this in functions.php
//  Save the box
function edent_review_save_postdata($post_id)
{
   if (array_key_exists('edent_book_review', $_POST)) {
        if (isset($_POST['edent_book_review']["review"]) &amp;&amp; $_POST['edent_book_review']["review"] == "true") {
            update_post_meta(
                $post_id,
                '_edent_book_review_meta_key',
                $_POST['edent_book_review']
            );
        } else {
            delete_post_meta(
                $post_id,
                '_edent_book_review_meta_key'
            );
        }
    }
}
add_action('save_post', 'edent_review_save_postdata');
</code></pre>

<p>Nice! But how do we get the data back out again?</p>

<h2 id="retrieving-the-data"><a href="https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/#retrieving-the-data">Retrieving the data</a></h2>

<p>We can use the <a href="https://developer.wordpress.org/reference/functions/get_post_meta/"><code>get_post_meta()</code> function</a> to get all the metadata associated with a blog entry.  We can then turn it into a Schema.org structured metadata entry.</p>

<pre><code class="language-php">function edent_book_review_display($post_id){
    // https://developer.wordpress.org/reference/functions/the_meta/
    $review_data = get_post_meta($post_id, "_edent_book_review_meta_key", true);
    if (isset($review_data["review"]) &amp;&amp; $review_data["review"] == "true")
    {
        $blog_author_data = get_the_author_meta();

        $schema_review = array (
            '@context' =&gt; 'https://schema.org',
            '@type'    =&gt; 'Review',
            'author' =&gt;
            array (
                '@type' =&gt; 'Person',
                'name'  =&gt; get_the_author_meta("user_firstname") . " " . get_the_author_meta("user_lastname"),
                'sameAs' =&gt;
                array (
                    0 =&gt; get_the_author_meta("user_url"),
                ),
            ),
            'url' =&gt; get_permalink(),
            'datePublished' =&gt; get_the_date('c'),
            'publisher' =&gt;
            array (
                '@type'  =&gt; 'Organization',
                'name'   =&gt; get_bloginfo("name"),
                'sameAs' =&gt; get_bloginfo("url"),
            ),
            'description' =&gt; mb_substr(get_the_excerpt(), 0, 198),
            'inLanguage'  =&gt; get_bloginfo("language"),
            'itemReviewed' =&gt;
            array (
                '@type'  =&gt; 'Book',
                'name'   =&gt; $review_data["title"],
                'isbn'   =&gt; $review_data["isbn"],
                'sameAs' =&gt; $review_data["book_url"],
                'author' =&gt;
                array (
                    '@type'  =&gt; 'Person',
                    'name'   =&gt; $review_data["author"],
                    'sameAs' =&gt; $review_data["author_url"],
                ),
            'datePublished' =&gt; $review_data["book_date"],
            ),
            'reviewRating' =&gt;
            array (
                '@type' =&gt; 'Rating',
                'worstRating' =&gt; 0,
                'bestRating'  =&gt; 5,
                'ratingValue' =&gt; $review_data["rating"],
            ),
            'thumbnailUrl' =&gt; get_the_post_thumbnail_url(),
        );
        echo '&lt;script type="application/ld+json"&gt;' . json_encode($schema_review) . '&lt;/script&gt;';

        echo "&lt;div class='edent-review' style='clear:both;'&gt;";
        if (isset($review_data["rating"])) {
            echo "&lt;span class='edent-rating-stars' style='font-size:2em;color:yellow;background-color:#13131380;'&gt;";
            $full = floor($review_data["rating"]);
            $half = 0;
            if ($review_data["rating"] - $full == 0.5)
            {
                $half = 1;
            }

            $empty = 5 - $half - $full;

            for ($i=0; $i &lt; $full ; $i++) {
                echo "★";
            }
            if ($half == 1)
            {
                echo "⯪";
            }
            for ($i=0; $i &lt; $empty ; $i++) {
                echo "☆";
            }
            echo "&lt;/span&gt;";
        }
        echo "&lt;ul&gt;";
        if ($review_data["amazon_url"] != "") {
            echo "&lt;li&gt;&lt;a href='{$review_data["amazon_url"]}'&gt;Buy it on Amazon&lt;/a&gt;&lt;/li&gt;";
        }
        if ($review_data["author_url"] != "") {
            echo "&lt;li&gt;&lt;a href='{$review_data["author_url"]}'&gt;Author's homepage&lt;/a&gt;&lt;/li&gt;";
        }
        if ($review_data["book_url"] != "") {
            echo "&lt;li&gt;&lt;a href='{$review_data["book_url"]}'&gt;Publisher's details&lt;/a&gt;&lt;/li&gt;";
        }
        echo "&lt;/ul&gt;";
        //  Close the review div inside the if, so it is only emitted when it was opened
        echo "&lt;/div&gt;";
    }
}
</code></pre>

<p>In <code>index.php</code>, after <code>the_content();</code> add:</p>

<pre><code class="language-php">edent_book_review_display(get_the_ID());
</code></pre>

<p>Then, on the website, it will look something like this:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/06/Screenshot_2020-06-29-Review-The-House-of-Shattered-Wings-–-Aliette-de-Bodard-fs8.png" alt="Star rating and links displayed under a book cover." width="451" height="446" class="aligncenter size-full wp-image-35588">

<p>Note the use of the <a href="http://www.righto.com/2016/10/inspired-by-hn-comment-four-half-star.html">Unicode Half Star</a> for the ratings.</p>
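
<p>The star arithmetic is easier to see in isolation. Here's a minimal Python sketch of the same idea as the PHP above; the function name and the <code>maximum</code> parameter are hypothetical, not something from my theme.</p>

<pre><code class="language-python">import math


def star_rating(rating, maximum=5):
    """Render a rating in half-star steps as a string of Unicode stars."""
    full = math.floor(rating)
    half = 1 if rating - full == 0.5 else 0
    empty = maximum - full - half
    return "★" * full + "⯪" * half + "☆" * empty


print(star_rating(3.5))  # ★★★⯪☆
</code></pre>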

<p>The source code of the site shows the output of the JSON LD:
<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/06/Screenshot-from-2020-06-29-08-49-36-fs8.png" alt="Screenshot of JSON code in a web page." width="598" height="513" class="aligncenter size-full wp-image-35590"></p>

<p>When run through a <a href="https://search.google.com/structured-data/testing-tool/u/0/">Structured Data Testing Tool</a>, it shows as a valid review:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/06/Screenshot_2020-06-29-Structured-Data-Testing-Tool-fs8.png" alt="Results of a Structured Data Test" width="884" height="944" class="aligncenter size-full wp-image-35587">

<p>And this means, when search engines access your blog, they will display rich snippets based on the semantic metadata.</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/07/Screenshot_2020-07-12-house-of-shattered-wings-shkspr-Google-Search.png" alt="Rich result from Google showing the star rating." width="773" height="170" class="aligncenter size-full wp-image-35923">

<p>You can <a href="https://shkspr.mobi/blog/2020/06/review-the-house-of-shattered-wings-aliette-de-bodard/">see the final blog post</a> to see how it works.</p>

<h2 id="todo"><a href="https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/#todo">ToDo</a></h2>

<p>My code is horrible and hasn't been tested, validated, or sanitised. It's only for my own blog, and I'm unlikely to hack myself, but that needs fixing.</p>
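
<p>As a starting point for that validation, here's a rough Python sketch of the sort of checks I have in mind. It assumes ratings go from 0 to 5 in half-star steps (as in the range input) and that the ISBN is an ISBN-13. The function names are hypothetical, and this isn't what the theme currently does.</p>

<pre><code class="language-python">def rating_is_valid(rating):
    """Accept only 0 to 5 in half-star steps, matching the range input."""
    allowed = [step / 2 for step in range(11)]
    return rating in allowed


def isbn13_is_valid(isbn):
    """Check an ISBN-13 using its alternating 1 and 3 weights."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdecimal():
        return False
    total = sum(int(digit) * (3 if index % 2 else 1) for index, digit in enumerate(digits))
    return total % 10 == 0


# Form values arrive as strings, so convert the rating before checking it
print(rating_is_valid(float("3.5")))          # True
print(isbn13_is_valid("978-0-306-40615-7"))   # True
</code></pre>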

<p>I want to add review metadata for movies, games, and gadgets. That will either require multiple boxes, or a clever way to only show the necessary fields.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2020/07/adding-semantic-reviews-rich-snippets-to-your-wordpress-site/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Removing default metadata from .opus files]]></title>
		<link>https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/</link>
					<comments>https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 24 Apr 2020 11:03:00 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[audio]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[ogg]]></category>
		<category><![CDATA[opus]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=34763</guid>

					<description><![CDATA[I&#039;m trying to create some ridiculously tiny audio files. The sort where every single byte matters.  I&#039;ve encoded a small sample. But the opusenc tool automatically adds metadata - even if you don&#039;t specify any.  Using the amazing Mutagen Python library I was able to completely strip out all the metadata!  import mutagen mutagen.File(&#34;example.opus&#34;).delete()   It edits the file immediately - so be …]]></description>
										<content:encoded><![CDATA[<p>I'm trying to create some ridiculously tiny audio files. The sort where every single byte matters.</p>

<p>I've encoded a small sample. But the <code>opusenc</code> tool automatically adds metadata - even if you don't specify any.</p>

<p>Using the amazing <a href="https://mutagen.readthedocs.io/en/latest/">Mutagen Python library</a> I was able to completely strip out <em>all</em> the metadata!</p>

<pre><code class="language-python">import mutagen
mutagen.File("example.opus").delete()
</code></pre>

<p>It edits the file <strong>immediately</strong> - so be careful!</p>

<p>But what is it actually doing? I wanted to understand a bit more - so let's go hex diving!</p>

<h2 id="what-the-user-sees"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#what-the-user-sees">What the user sees</a></h2>

<p>Running <code>opusinfo example.opus</code> gives:</p>

<pre><code class="language-_">New logical stream (#1, serial: 03fe3cc9): type opus
Encoded with libopus 1.3.1, libopusenc 0.2.1
User comments section follows...
    ENCODER=opusenc from opus-tools 0.2
    ENCODER_OPTIONS=--bitrate 6 --comp 10 --framesize 60 --padding 0
Opus stream 1:
    ...
Logical stream 1 ended
</code></pre>

<p>There are two "mandatory" comments. The ENCODER and the ENCODER_OPTIONS.
I can't find a way to stop those being generated by <code>opusenc</code>.</p>

<p>The <a href="https://opus-codec.org/docs/opusfile_api-0.7/structOpusTags.html">Opus File API</a> gives some idea about the binary structure of the file.</p>

<p>But the real magic happens in the <a href="https://tools.ietf.org/html/rfc7845.html#section-5.2">Opus Format Specification RFC</a>. It details the header format in 32 bit clumps.</p>

<pre><code class="language-_">      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |      'O'      |      'p'      |      'u'      |      's'      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |      'T'      |      'a'      |      'g'      |      's'      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                     Vendor String Length                      |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     :                        Vendor String...                       :
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                   User Comment List Length                    |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                 User Comment #0 String Length                 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                                                               |
     :                   User Comment #0 String...                   :
     |                                                               |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                 User Comment #1 String Length                 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     :                                                               :

</code></pre>

<p>Let's take a look at our file in binary, jumping straight to the comment section.</p>

<pre><code class="language-_">0000004b: 4f70 7573  Opus
0000004f: 5461 6773  Tags
</code></pre>

<p>Starts as expected. Next is the Vendor String Length:</p>

<pre><code class="language-_">00000053: 1f00 0000  ....
</code></pre>

<p>0x1f is 31 bytes. This is a 32 bit, unsigned, <a href="https://en.wikipedia.org/wiki/Endianness">little endian number</a>. Hence it is written as <code>1f00 0000</code>, which reads as <code>0x0000001f</code>.</p>

<pre><code class="language-_">00000057: 6c69 626f  libo
0000005b: 7075 7320  pus 
0000005f: 312e 332e  1.3.
00000063: 312c 206c  1, l
00000067: 6962 6f70  ibop
0000006b: 7573 656e  usen
0000006f: 6320 302e  c 0.
00000073: 322e 31    2.1
</code></pre>
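
<p>To double check the arithmetic, here's a small Python sketch. It decodes the four length bytes from the dump and compares the result with the vendor string that follows; the variable names are just for illustration.</p>

<pre><code class="language-python"># The four bytes at offset 0x53 in the dump above
length_bytes = bytes.fromhex("1f000000")

# The least significant byte comes first in little endian
vendor_length = int.from_bytes(length_bytes, "little")
print(vendor_length)  # 31

# Reading the same bytes as big endian gives a very different number
print(int.from_bytes(length_bytes, "big"))  # 520093696

# The vendor string from the dump really is that long
vendor_string = "libopus 1.3.1, libopusenc 0.2.1"
print(len(vendor_string.encode("utf-8")))  # 31
</code></pre>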

<p>According to the spec, no terminating null octet is necessary. So the next bytes are the User Comment List Length.  Continuing on from the previous line:</p>

<pre><code class="language-_">00000073:        02     .
00000077: 0000 00    ...
</code></pre>

<p>There are two comments (again, 32 bit little endian).</p>

<blockquote><p>This field indicates the number of user-supplied comments.  It MAY indicate there are zero user-supplied comments, in which case there are no additional fields in the packet.</p></blockquote>

<p>This means we <em>can</em> have an empty comment section! But this is what the encoder writes by default:</p>

<pre><code class="language-_">00000077:        23  ...#
0000007b: 0000 00    ...
</code></pre>

<p>First string length is 0x23 = 35 bytes long. Again, little endian.</p>

<pre><code class="language-_">0000007e: 454e 434f  ENCO
00000082: 4445 523d  DER=
00000086: 6f70 7573  opus
0000008a: 656e 6320  enc 
0000008e: 6672 6f6d  from
00000092: 206f 7075   opu
00000096: 732d 746f  s-to
0000009a: 6f6c 7320  ols 
0000009e: 302e 3240  0.2@
</code></pre>

<p>After exactly 35 bytes, we get our next little endian number 0x40 = 64.</p>

<pre><code class="language-_">000000a1: 4000 0000  @...
000000a5: 454e 434f  ENCO
000000a9: 4445 525f  DER_
000000ad: 4f50 5449  OPTI
000000b1: 4f4e 533d  ONS=
000000b5: 2d2d 6269  --bi
000000b9: 7472 6174  trat
000000bd: 6520 3620  e 6 
000000c1: 2d2d 636f  --co
000000c5: 6d70 2031  mp 1
000000c9: 3020 2d2d  0 --
000000cd: 6672 616d  fram
000000d1: 6573 697a  esiz
000000d5: 6520 3630  e 60
000000d9: 202d 2d70   --p
000000dd: 6164 6469  addi
000000e1: 6e67 2030  ng 0
</code></pre>

<p>And that's the end of the comment section!</p>
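
<p>The walk-through above can be sketched as a tiny parser. This is a minimal sketch rather than a robust tool: the function names are made up for illustration, and the packet is rebuilt from the strings in the dump instead of being read from a real file.</p>

<pre><code class="language-python">def read_u32(buf, pos):
    """Read a 32 bit unsigned little endian number; return (value, next position)."""
    return int.from_bytes(buf[pos:pos + 4], "little"), pos + 4


def read_string(buf, pos):
    """Read a length-prefixed string with no terminating null."""
    length, pos = read_u32(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def parse_opus_tags(packet):
    """Return (vendor string, list of user comments) from a comment header."""
    if packet[:8] != b"OpusTags":
        raise ValueError("not an Opus comment header")
    vendor, pos = read_string(packet, 8)
    count, pos = read_u32(packet, pos)
    comments = []
    for _ in range(count):
        comment, pos = read_string(packet, pos)
        comments.append(comment)
    return vendor, comments


def u32(number):
    """Encode a number as 4 little endian bytes."""
    return number.to_bytes(4, "little")


# Rebuild the comment header from the strings seen in the hex dump.
vendor_bytes = b"libopus 1.3.1, libopusenc 0.2.1"
tag_bytes = [
    b"ENCODER=opusenc from opus-tools 0.2",
    b"ENCODER_OPTIONS=--bitrate 6 --comp 10 --framesize 60 --padding 0",
]
packet = b"OpusTags" + u32(len(vendor_bytes)) + vendor_bytes + u32(len(tag_bytes))
for tag in tag_bytes:
    packet += u32(len(tag)) + tag

vendor, comments = parse_opus_tags(packet)
print(vendor)         # libopus 1.3.1, libopusenc 0.2.1
print(len(comments))  # 2
print(len(packet))    # 154
</code></pre>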

<h2 id="manually-editing-the-file"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#manually-editing-the-file">Manually editing the file</a></h2>

<p>I started by setting the User Comment List Length to zero, and removing all the subsequent comment data. That didn't work. <code>opusinfo</code> gave the following errors:</p>

<pre><code class="language-_">WARNING: Hole in data (28 bytes) found at approximate offset 1492 bytes. Corrupted Ogg.
WARNING: Hole in data (51 bytes) found at approximate offset 1492 bytes. Corrupted Ogg.
WARNING: sequence number gap in stream 1. Got page 2 when expecting page 1. Indicates missing data.
WARNING: discontinuity in stream (1)
</code></pre>

<p><a href="https://tools.ietf.org/html/rfc7845.html#section-3">Back to the documentation</a>!</p>

<blockquote><p>An Ogg Opus stream is organized as follows (see Figure 1 for an example).</p></blockquote>

<pre><code class="language-_">        Page 0         Pages 1 ... n        Pages (n+1) ...
     +------------+ +---+ +---+ ... +---+ +-----------+ +---------+ +--
     |            | |   | |   |     |   | |           | |         | |
     |+----------+| |+-----------------+| |+-------------------+ +-----
     |||ID Header|| ||  Comment Header || ||Audio Data Packet 1| | ...
     |+----------+| |+-----------------+| |+-------------------+ +-----
     |            | |   | |   |     |   | |           | |         | |
     +------------+ +---+ +---+ ... +---+ +-----------+ +---------+ +--
     ^      ^                           ^
     |      |                           |
     |      |                           Mandatory Page Break
     |      |
     |      ID header is contained on a single page
     |
     'Beginning Of Stream'

    Figure 1: Example Packet Organization for a Logical Ogg Opus Stream
</code></pre>

<blockquote><p>There are two mandatory header packets.  The first packet in the logical Ogg bitstream MUST contain the identification (ID) header, which uniquely identifies a stream as Opus audio.  The format of this header is defined in Section 5.1.  It is placed alone (without any other packet data) on the first page of the logical Ogg bitstream and completes on that page.  This page has its 'beginning of stream' flag set.</p>

<p>The second packet in the logical Ogg bitstream MUST contain the comment header, which contains user-supplied metadata.  The format of this header is defined in Section 5.2.  It MAY span multiple pages, beginning on the second page of the logical stream.  However many pages it spans, the comment header packet MUST finish the page on which it completes.</p></blockquote>

<p>I tried saying there was one comment, with a length of zero and a null comment. That didn't work either.</p>

<p>I think this is because <em>before</em> the start of the comment header there is something describing how long the packet will be.</p>

<h2 id="headers"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#headers">Headers</a></h2>

<p>Here are the headers from the original file, and the one stripped by Mutagen.</p>

<h3 id="original-header"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#original-header">Original Header</a></h3>

<pre><code class="language-_">00000000: 4f67 6753 0002 0000  OggS....
00000008: 0000 0000 0000 c93c  .......&lt;
00000010: fe03 0000 0000 f90e  ........
00000018: f775 0113 4f70 7573  .u..Opus
00000020: 4865 6164 0101 3801  Head..8.
00000028: 80bb 0000 0000 004f  .......O
00000030: 6767 5300 0000 0000  ggS.....
00000038: 0000 0000 00c9 3cfe  ......&lt;.
00000040: 0301 0000 0035 dfaf  .....5..
00000048: 0601 9a4f 7075 7354  ...OpusT
00000050: 6167 731f 0000 006c  ags....l
00000058: 6962 6f70 7573 2031  ibopus 1
</code></pre>

<h3 id="stripped-header"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#stripped-header">Stripped Header</a></h3>

<pre><code class="language-_">00000000: 4f67 6753 0002 0000  OggS....
00000008: 0000 0000 0000 c93c  .......&lt;
00000010: fe03 0000 0000 f90e  ........
00000018: f775 0113 4f70 7573  .u..Opus
00000020: 4865 6164 0101 3801  Head..8.
00000028: 80bb 0000 0000 004f  .......O
00000030: 6767 5300 0000 0000  ggS.....
00000038: 0000 0000 00c9 3cfe  ......&lt;.
00000040: 0301 0000 00ae 941c  ........
00000048: 4e01 2f4f 7075 7354  N./OpusT
00000050: 6167 731f 0000 006c  ags....l
00000058: 6962 6f70 7573 2031  ibopus 1
</code></pre>

<h3 id="the-difference"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#the-difference">The Difference</a></h3>

<pre><code class="language-_">Original                                  Stripped
00000040: 0301 0000 0035 dfaf  .....5.. | 00000040: 0301 0000 00ae 941c  ........
00000048: 0601 9a4f 7075 7354  ...OpusT | 00000048: 4e01 2f4f 7075 7354  N./OpusT
</code></pre>

<p>So, something is happening in bytes 0x45 to 0x4a. But what?</p>

<blockquote><p>A page is a header of 26 bytes, followed by the length of the data, followed by the data. The constructor is given a file-like object pointing to the start of an Ogg page. After the constructor is finished it is pointing to the start of the next page.</p>

<p><a href="https://github.com/quodlibet/mutagen/blob/master/mutagen/ogg.py#L35">Mutagen Source Code</a></p></blockquote>

<p>Unfortunately, my brain freezes up when I see things like</p>

<pre><code class="language-python">header = struct.unpack('&lt;4sBBqIIiB', header_data)
</code></pre>

<p>But the code does point to the <a href="https://wiki.xiph.org/Ogg#Ogg_page_format">Ogg page format specification</a>.</p>

<blockquote><p>The LSb (least significant bit) comes first in the Bytes. Fields with more than one byte length are encoded LSB (least significant byte) first.</p></blockquote>

<pre><code class="language-_">  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | capture_pattern: Magic number for page start "OggS"           | 0-3
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | version       | header_type   | granule_position              | 4-7
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                                                               | 8-11
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | bitstream_serial_number       | 12-15
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | page_sequence_number          | 16-19
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | CRC_checksum                  | 20-23
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | page_segments | segment_table | 24-27
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | ...                                                           | 28-
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</code></pre>
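
<p>For anyone else whose brain freezes at <code>struct</code> format strings, here's a minimal sketch which reads the same fields by byte offset, following the diagram. The 28 bytes are copied from offset 0x2f of the two dumps above, which is where the second page starts. The field table and function name are made up for illustration.</p>

<pre><code class="language-python"># (name, offset, size) taken from the diagram. Byte 27 onwards is the segment table.
FIELDS = [
    ("version", 4, 1),
    ("header_type", 5, 1),
    ("granule_position", 6, 8),
    ("bitstream_serial_number", 14, 4),
    ("page_sequence_number", 18, 4),
    ("CRC_checksum", 22, 4),
    ("page_segments", 26, 1),
]


def parse_page_header(page):
    """Decode an Ogg page header; every multi-byte field is little endian."""
    if page[0:4] != b"OggS":
        raise ValueError("missing OggS capture pattern")
    header = {
        name: int.from_bytes(page[offset:offset + size], "little")
        for name, offset, size in FIELDS
    }
    count = header["page_segments"]
    header["segment_table"] = list(page[27:27 + count])
    return header


original = parse_page_header(bytes.fromhex(
    "4f676753 0000 0000000000000000 c93cfe03 01000000 35dfaf06 01 9a"))
stripped = parse_page_header(bytes.fromhex(
    "4f676753 0000 0000000000000000 c93cfe03 01000000 ae941c4e 01 2f"))

# Print only the fields which differ between the two files.
for name in original:
    if original[name] != stripped[name]:
        print(name, original[name], stripped[name])
</code></pre>

<p>Two fields differ: <code>CRC_checksum</code>, and the single <code>segment_table</code> entry, which drops from 154 to 47. Those numbers match the size of the comment packet: 8 + 4 + 31 + 4 + (4 + 35) + (4 + 64) = 154 bytes with both default comments, and 8 + 4 + 31 + 4 = 47 bytes with none. That would explain why editing the comments by hand, without also touching the page header, produces a corrupt file.</p>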

<p>So, it is the CRC Checksum which is different, along with the segment table byte at 0x4a.  The <a href="https://xiph.org/vorbis/doc/framing.html">Vorbis framing documentation</a> has a brief description of how the CRC is calculated - but the full documentation 404s.</p>
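
<p>The parameters are at least written down, both in that framing document and in RFC 3533: a 32 bit CRC with generator polynomial 0x04c11db7, an initial value of zero and no final XOR, calculated over the whole page with the four CRC bytes set to zero. Here's a minimal, unoptimised sketch of that in Python. The function names are made up for illustration, and it has not been checked against the files in this post, so treat it as a starting point rather than a finished tool.</p>

<pre><code class="language-python">POLYNOMIAL = 0x04C11DB7


def ogg_crc(data):
    """Bit-by-bit CRC-32: starts at zero, most significant bit first, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte * 0x1000000  # same as shifting the byte left by 24 bits
        for _ in range(8):
            top_bit_set = crc // 0x80000000 == 1
            crc = (crc * 2) % 0x100000000  # shift left one bit, keep 32 bits
            if top_bit_set:
                crc ^= POLYNOMIAL
    return crc


def with_fresh_crc(page):
    """Return a copy of one complete Ogg page with bytes 22-25 recalculated."""
    zeroed = page[:22] + bytes(4) + page[26:]
    return page[:22] + ogg_crc(zeroed).to_bytes(4, "little") + page[26:]
</code></pre>

<p>The idea would be to edit the comment packet, update the segment table to the new packet length, and then pass the whole page through <code>with_fresh_crc()</code> before writing it back.</p>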

<h2 id="conclusion"><a href="https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/#conclusion">Conclusion</a></h2>

<p>Hand editing binary files is for mugs.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2020/04/removing-default-metadata-from-opus-files/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Interesting Email Metadata]]></title>
		<link>https://shkspr.mobi/blog/2016/11/interesting-email-metadata/</link>
					<comments>https://shkspr.mobi/blog/2016/11/interesting-email-metadata/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 24 Nov 2016 08:22:49 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[NaBloPoMo]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=23558</guid>

					<description><![CDATA[For many years, my email footer said &#34;Sent via my Casio cPhone&#34; - my attempt to poke fun at the users who hadn&#039;t updated their iPhone&#039;s default email signature.  This leads to an interesting question:  Marc Blank-Settle @bbcmarc on Threads@MarcSettleIs there an easy way to see what device an email is sent from? If I type the attached on an email on my PC, can the truth be shown easily?…]]></description>
										<content:encoded><![CDATA[<p>For many years, my email footer said "Sent via my Casio cPhone" - my attempt to poke fun at the users who hadn't updated their iPhone's default email signature.</p>

<p>This leads to an interesting question:</p>

<blockquote class="social-embed" id="social-embed-791289437661622277" lang="en" itemscope="" itemtype="https://schema.org/SocialMediaPosting"><header class="social-embed-header" itemprop="author" itemscope="" itemtype="https://schema.org/Person"><a href="https://twitter.com/MarcSettle" class="social-embed-user" itemprop="url"><img class="social-embed-avatar social-embed-avatar-circle" src="data:image/webp;base64,UklGRvQCAABXRUJQVlA4IOgCAABQDQCdASowADAAPqlKmkmmJKKhNV1YAMAVCWgAnTMADmEyR/twTc3iL5rwIrg9luafLO45e+mwbJBPGCvz2heebj3GwHxLigCM4N5hIzmdkoWX8/ygW1a11IYlJZvtdVPg71YH2YHtaF0I+R9rICDpSBGAAP3MLAMqK0xVEvcXwSrdrr82qfrOZTiF/6rKyVgk2I6a0awjDrLe/0UvDir0AB2Uc7OmKSdbCILtYqMrEz25n4/Irno7+7Vm+9psnKZe2OsbqmLKZbiMyVF24qOlxD1B6oKrUWvvJdZfbg7sm2cQWK3t2mRs/CqD6glS+N6CRGDtRt5hdCV5EG6ZByYpYdbi0fJ28+JtaUKm+l+B/cfkseB7fUTGZHq6bBGHnjKpae7GQG9ptSs3R3D2s1iSvriGMM16MjYGcqo3ap9vTxhr471k1EgSKrd8YiaW6AWKWu1Vcn6deVObhoR+eICJz7lg3J4cVQzahme1wdG8vNJHDhpSMbJel8Xn/1im6sv6NLbuszyeveA7F/0cTS2Ss6csCh/iz+bvL9POL6rdlG+6GlwXyFhOTNVNrpyatEvN4PnXd0yvA3QWYIWILLyBGX67s58dHnGmObNlIDqp0DfpvSQbSmKt0iB/ntkMzgW23p4a+/TcSFJf5JXe8HbmN0vZt3SqaoDfYOwmyUwS1EZ+qekeCWbxxb2HkkzYgf8FkWlMss77EizgV9P9YT67g50r+JPOgAXFm3QTdYO8E/HZI9Nzo47zCKlEbZZIDrkjDQHjl2VyL85owAPqqZTrgYQFgYDn/Jl8TmIQk+B6C04OMrTnxpbUvxtBAyTwSwCKbDpKGR7xan0Tk4h+rUATCL9jmQdQ0ustYhNj2mO+CGBL145TpKVd5z8k89EI/YeyZvt3wg84+0KEAyW+DE01IEf4n1QzSmjk8LYxs7QBVHJm5oAMJMp5yHzzr6KhljCkOyRv9oM0qSFJuwmK4bbNkwUukQcEAAA=" alt="" itemprop="image"><div class="social-embed-user-names"><p class="social-embed-user-names-name" itemprop="name">Marc Blank-Settle @bbcmarc on Threads</p>@MarcSettle</div></a><img class="social-embed-logo" alt="Twitter" 
src="data:image/svg+xml,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%0Aaria-label%3D%22Twitter%22%20role%3D%22img%22%0AviewBox%3D%220%200%20512%20512%22%3E%3Cpath%0Ad%3D%22m0%200H512V512H0%22%0Afill%3D%22%23fff%22%2F%3E%3Cpath%20fill%3D%22%231d9bf0%22%20d%3D%22m458%20140q-23%2010-45%2012%2025-15%2034-43-24%2014-50%2019a79%2079%200%2000-135%2072q-101-7-163-83a80%2080%200%200024%20106q-17%200-36-10s-3%2062%2064%2079q-19%205-36%201s15%2053%2074%2055q-50%2040-117%2033a224%20224%200%2000346-200q23-16%2040-41%22%2F%3E%3C%2Fsvg%3E"></header><section class="social-embed-text" itemprop="articleBody">Is there an easy way to see what device an email is sent from? If I type the attached on an email on my PC, can the truth be shown easily? <a href="https://twitter.com/MarcSettle/status/791289437661622277/photo/1">pic.x.com/i8impwkxlo</a><div class="social-embed-media-grid"><a href="https://pbs.twimg.com/media/Cvs5Xx2WYAABS7G.jpg" class="social-embed-media-link"><img class="social-embed-media" alt="" 
src="data:image/webp;base64,UklGRhoHAABXRUJQVlA4IA4HAAAwJQCdASp3ATEAPrVao04nJaMiJlQ5IOAWiWlu/HA4puM5HpP/bR/HvdLbq3Tre4Kr+3o/mfAvxje9M17Fv0k6i/ZPf9/lPAn4P6gToPu7+V9AL1l+sd2LqTd1+fbvPvr/+v9gP+Xf4T/n+yr/Nfs55yvzr/W+wP/M/676aHsN/bz2Tv27HzgCyzIwksToFsn1cUM6MtgFUXTkaGMQzcaysErx1FjZhwW2mDMg8Z1R/S/Y4FfE11U0iiBANd7KC9ZHCkOPWHaevss1+5hJbMW7VY7Xjh3nTUiK55mpve5PIpwMg7UodsCxRgNHw6xYZ2nbYUchLkNDDJLZLZgWpnEZ5wGiybENQPgQGwCKfejRwyjLmLcaxgwDj2Dlx8Sxji9VhAPtS5WyfVxoBhJYnQLZNgAA/vm44JYQC4QAAY7ojW0k90B+fgLavSy7yUMsE4LN2TxJUYu3qlJ0YkJDutcMBtgXBX3Gb3egiJcE4Zo58cqO/zQMN2faXUsYWXFmTtEiq9P/kF1xDbU7vpNhJWjomR/zQF6hTOpxwg3KK98OHR72UQFIN1J80r1fFi/1ddYpoR0Qr5I8+ZAJufd76pOh8SS7MXAk3qbsF2KKH3P82sOleVafSFKTbLoWnTNWQuHWuCzo3xa/iSd0f6e5RkKKQrhzvGpTJZ3c2rUBBacmfKJM8euCd135XnNUMd2yp5oTGUmC4EDU/2lYRz17UmtOXi/F4xHzmmExRn9pOWN0oFBpi7R5iQTAS7W9VydDc8GvnEWryjdoW2S7P0ofXZ808xnn43Qim48m4vbFIEKc2s4Xmf98v4YdKnEcGLEiwpFNxJihTzZEy8MsVdx/yiJq1IvCiMRIxc7tbYuz50JLZkMAG5VE0Hz9REJAb2wV4+jEEHC+6yDr+gWK+SmYywIfsEgoONuwIDITnc95o5P9x3ORvfAvs7en5vDyo2YD4rHYssRwpAxTuXboVhLaZTZO6TJ3P810bzamOowAh0AIbV5a6d4IjxCqamjPdJogydp4qEiFmIQbLkGVkzhjorPaHVDlW+enZC+7ieYCSMfW7g1skgzO7mqvKMGtDmmAmB9HnqD2Qp+B57wf8c3IpVi8F0pmDIGRBqNqG4Miu+HvVC7vjLTVY82Fy8fNvUzQ/fZC/FPr0oRj+B6z0xHzQ67+bFvFGIdV2S8Gwk4bSL2BXfBZcL4O3scsrkiLaPJVneGd6vHBWzE5QUw7f1jt8fZMtLQlgVfyOzBErBj0vkN4LSvaPTTfDv/F/3Pk8xpRjrah3JIGPjw98uHz/1gzLi426pTEipy9YeW+rawb0rkJ6qsGz+4VA2wAHLmDOe0zMT9LPceD4kFKYAt/px/rRK8kq3sDC5tTlJQiix1k4wMY/wmIlLW1l3pEvfvVRfsYltNjgAzQHA2VRgyoc0GwBFR2R0mMP+yuBmqbYNFOxypdoRpsy1MekZEIt7QUBXh2H2/y12IM9odcRGhymusz1Gffh7QUw3VFytc33wD5wafCALMw2M/f/fVNTEyanunbYIKDXWseBGOi1Z3biWPM1HbtOvb/jko6sjw4P4Q+LvMeYwj+tZlmT5U+L92YEzhITMwKDu3MdJ3zmmLM0+KLM+d300xsYDQXsdMnxK0JYgyDpoDUzj7YZWE2dfx6aox9up4SRJEjFEYnanieS2I5n3S90oFGCaaXL+BF1G9HGwvvnKfxt3Hcdap3MA6UWVWMKx7BdQdKtw8zRtUq8NdcaakVfysBuQ9ms+VqSNNiKFdh9vbMoi4c65RIwgSMwaef4kw0ufgGgD4aOTq47iP336aKDFZeA0fYZfAGfUvBe5EEO9lTfSGdug+wULSKLPWY7dmnDlSztbiMbw9JvetQEenkSU547j6LDsrP2T+KLPydoVWoIAlfNti+SQAawM9BCZZl5JJndKZZTE0IHxWXeM6e5GzO
j2qkGBNhhZjMtPtigameJuaecMRUHqHwinEOMyny7E3kyvPYIUBU51K5NnHGhofz4P7tsIU7xx6cxLxKhAri+nuP2RwWKMAxpu6whxjBfPjradxP0p3B9o86RwuiCh4yqJUL/OfeSjZPRprPM/tfVPJgRO83snTBf3qoE1kQx5lDum4MbM3oquxIajdH3lTCo476BI/xKwKL5IrLNgSDUX0M5ORTIluhYCSkiZcH8ub64w1ne/GPsWw5JE7mBx5VHAjbVg1/PkAmqup3ORTSP8ReRba5cmxkneDss7wUxqIrhrXZUaj6oTsg0Ptk9w5GKsT8rvTBSJYE2nTrw3WOF+QX4K0k0aM9I7DkNoNgr2G5r5jf2HXtk8X5q4FaXMJKatlrquz3pLin3937yCQP4ZoNc/n+3TWr+r3n7wDFsIsAAQ0FazP/ZAABkY2IN8/KaAPWgAc9gAAAAAA="></a></div></section><hr class="social-embed-hr"><footer class="social-embed-footer"><a href="https://twitter.com/MarcSettle/status/791289437661622277"><span aria-label="1 likes" class="social-embed-meta">❤️ 1</span><span aria-label="3 replies" class="social-embed-meta">💬 3</span><span aria-label="0 reposts" class="social-embed-meta">🔁 0</span><time datetime="2016-10-26T14:44:38.000Z" itemprop="datePublished">14:44 - Wed 26 October 2016</time></a></footer></blockquote>

<p>Because 2016 is <em>maximum news</em>, I'm sure there are some interesting stories based on email releases which have been missed.  Metadata tells stories.</p>

<p>So, what metadata can we pick up from an email?</p>

<p>In Gmail, it's quite easy to see all the raw data sent with an email as it travels through the Internet.</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2016/10/Show-Original-In-Gmail.png" alt="show original in gmail" width="268" height="305" class="aligncenter size-full wp-image-23559">

<p>Let's take a look at some of the more interesting fields.</p>

<p>Here's an email that I've sent from my mobile - I've redacted some bits for my privacy.</p>

<pre><code>Received: from [192.168.1.42] (oxfd.cable.virginm.net. [82.6.ZZZ.ZZZ])
 by smtp.gmail.com with ESMTPSA id l6sm9069017wmg.11.2016.10.08.09.37.57
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sat, 08 Oct 2016 09:37:58 -0700 (PDT)
</code></pre>

<p>Well, first off we can see the sender's <em>internal</em>&nbsp;IP address.  That gives us a little insight into their network topology. Of more interest is the sender's <em>external</em> IP address.</p>

<p>This can leak all sorts of interesting information.  Location, service provider, connection speed - even ISP contract details in some cases.</p>
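
<p>Pulling those addresses out is easy to automate. Here's a minimal sketch using Python's standard <code>ipaddress</code> module on the header above. The function name is made up for illustration, and because the external address is redacted in this post it is reported as unparseable rather than looked up.</p>

<pre><code class="language-python">import ipaddress
import re

RECEIVED = (
    "from [192.168.1.42] (oxfd.cable.virginm.net. [82.6.ZZZ.ZZZ]) "
    "by smtp.gmail.com with ESMTPSA"
)


def classify_addresses(received):
    """Label every [bracketed] token found in a Received header."""
    results = []
    for token in re.findall(r"\[([^\]]+)\]", received):
        try:
            address = ipaddress.ip_address(token)
        except ValueError:
            results.append((token, "unparseable"))  # e.g. a redacted address
            continue
        results.append((token, "private" if address.is_private else "public"))
    return results


for token, label in classify_addresses(RECEIVED):
    print(token, label)
# 192.168.1.42 private
# 82.6.ZZZ.ZZZ unparseable
</code></pre>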

<p>Let's suppose someone sends an email which says "Sorry, at home with the flu today."  You check the IP address and find that they're connected to the WiFi at Disney World.  Isn't that interesting...</p>

<p>A little further down the headers, we find (again, redacted)</p>

<pre><code>Message-ID: &lt;yqwertyuigm4u5v.1471234534@com.syntomo.email&gt;
</code></pre>

<p>Oh ho! What do we have here? <a href="https://en.wikipedia.org/wiki/Message-ID">The Message-ID</a> is a unique string. Most email clients will choose their own distinctive suffix - the part after the <code>@</code>.</p>

<p>This means, if you received this message from me, you could tell which email program I used and (possibly) which device.</p>

<p>So if I send you an email saying "sorry, my phone is broken" - you'll be able to tell if that's a lie.</p>

<p>There's another leak of client information at the <a href="http://stackoverflow.com/questions/3508338/what-is-the-boundary-in-multipart-form-data">multipart boundary</a>:</p>

<pre><code>Content-Type: multipart/alternative; boundary="--_com.syntomo.email_596674815977850"

----_com.syntomo.email_596674815977850
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: base64

SGVyZSB3ZSBnbyEg
</code></pre>
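
<p>Both leaks can be read with Python's standard <code>email</code> package. This is a minimal sketch using the redacted values from this post. The angle brackets around the Message-ID are left off to keep the example short, and treating everything after the <code>@</code> as a client fingerprint is an assumption which holds for this app but not necessarily for every mail program.</p>

<pre><code class="language-python">import base64
from email.parser import HeaderParser

RAW_HEADERS = (
    "Message-ID: yqwertyuigm4u5v.1471234534@com.syntomo.email\n"
    "Content-Type: multipart/alternative;"
    ' boundary="--_com.syntomo.email_596674815977850"\n'
    "\n"
)

headers = HeaderParser().parsestr(RAW_HEADERS)

# The part after the @ is often chosen by the sending software.
suffix = headers["Message-ID"].rpartition("@")[2]
print(suffix)  # com.syntomo.email

# The boundary string is also generated by the client.
boundary = headers.get_boundary()
print(boundary)            # --_com.syntomo.email_596674815977850
print(suffix in boundary)  # True

# The base64 body from the example decodes to plain text.
body = base64.b64decode("SGVyZSB3ZSBnbyEg").decode("utf-8")
print(body)  # Here we go!
</code></pre>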

<h2 id="much-more"><a href="https://shkspr.mobi/blog/2016/11/interesting-email-metadata/#much-more">Much more</a></h2>

<p>This brief blog post only scratches the surface of what can be found - and what you could do with the information.</p>

<p>Other "interesting" metadata includes:</p>

<ul>
    <li>User's Timezone - not as accurate as an IP address, but if their phone says they're at GMT+2 but they claim to be at GMT-7, is that interesting?</li>
    <li>Reply threading - was this email originally a reply?</li>
    <li>What language their equipment is set to. Some email headers contain <code>Accept-Language:</code> and <code>Content-Language:</code> information. Why is your "Urgent email from the FBI" sent from a computer that's set to Chinese?</li>
    <li>Software versions - do the sender's servers have known vulnerabilities?</li>
    <li>Operating System - is the sender's equipment up to date?</li>
</ul>
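
<p>The timezone check in the first item is simple to sketch with Python's standard library. The <code>Date:</code> value and the claimed offset below are invented for illustration, and the function name is hypothetical.</p>

<pre><code class="language-python">from datetime import timedelta
from email.utils import parsedate_to_datetime


def utc_offset_hours(date_header):
    """Return the UTC offset, in hours, recorded in an email Date header."""
    offset = parsedate_to_datetime(date_header).utcoffset()
    if offset is None:  # "-0000" means the sender did not reveal a timezone
        return None
    return offset / timedelta(hours=1)


claimed = -7.0  # where the sender says they are (GMT-7)
actual = utc_offset_hours("Sat, 08 Oct 2016 18:37:58 +0200")  # made-up header
print(actual)             # 2.0
print(actual != claimed)  # True
</code></pre>

<p>A mismatch proves nothing on its own - a phone might simply not have updated its clock after a flight - but it is a cheap first check.</p>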

<p>I'm sure there are several other pieces of information which could prove interesting.</p>

<h2 id="manipulation"><a href="https://shkspr.mobi/blog/2016/11/interesting-email-metadata/#manipulation">Manipulation</a></h2>

<p>This is <strong>not</strong> a cast iron investigative tool.  It is possible for programs to mangle the metadata - either deliberately or not. Some people will take care to mask their email footprint, others will not.</p>

<p>Metadata is <em>everywhere</em>.  While your emails are unlikely to get leaked to the press (I hope!) you should consider just how easy it is for a little white lie to be uncovered.</p>

<p>Sent from my iPhone.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2016/11/interesting-email-metadata/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
	</channel>
</rss>
