Terence Eden’s Blog

Is IPA furigana a bad idea?

@edent HTML metadata schema.org · 7 comments · 300 words

My name is Terence(/ˈtɛɹəns) Eden(ˈiːdən/). Modern HTML allows the user to use <ruby> to annotate text. This is usually used for furigana - which allows pronunciation to be placed above words. For example: "シン・ゴジラ (Shin Godzilla)" shows you how to pronounce both words if you are unfamiliar with kanji. The text can be any language or use any characters. In Japanese, it is quite often used to show phonetic pronunciation using hiragana. Because English is a composite language, it isn't always …

Continue reading →

What programming language is in this <code> block?

@edent HTML schema.org semantic web · 7 comments · 650 words · Viewed ~2,103 times.

Screenshot of the Schema.org site showing the technical information about the metadata.

I'm a little bit obsessed with the idea of Semantic markup. I want the words that I write to be understood my humans and machines. Imagine this piece of code: print( "Hello, world!" ) Is that code example written in Python? C++? Basic? Go? Perhaps you're familiar enough with every programming language to tell - but most people aren't. Wouldn't it be nice to give an indication of what programming language is used in an example? Here's how we might represent it in HTML: <pre> <code> …

Continue reading →

Addressing the Overlooked Non-Micropsychiatric Uses for Thiotimoline

@edent AI schema.org · 2 comments · 700 words

A chair specifically designed to but awkward - it has a bowed seat and leans forward at an uncomfortable angle.

One of the (many) problems with AI is that training data usually needs to come from "natural" sources. If you want to emulate human-written text, you need to train something on human-written text. But with the proliferation of cheap and fast AI tools, it is likely that training data will unwillingly become contaminated with AI-written text. In order to prevent the "Habsburg Jaw" effect, I think it would be helpful if publishers could semantically indicate that a work was written by AI. In…

Continue reading →

LinkedIn supports Schema‎.org metadata

@edent linked data linkedin schema.org · 4 comments · 600 words

I'm a big fan of machine-readable metadata. It's useful for programs which need to extract information from messy and complicated websites. It's always surprising where it turns up. For example, take this post of mine on LinkedIn. If you view the source, you'll see this scrap of linked data: <script type="application/ld+json"> { "@context": "http://schema.org", "@type": "SocialMediaPosting", "@id":…

Continue reading →

Does AI mean we don't need the Semantic Web?

@edent HTML5 schema.org semantic web · 4 comments · 350 words · Viewed ~2,141 times.

User: Read the following sentence: "Our opening hours are: Weekdays 10 until 7. Weekend 10 until 10 (Early closing 9 o'clock Sunday)." ChatGPT: The given sentence states the opening hours of a certain establishment. Here is the breakdown of the opening hours mentioned: Weekdays: The establishment is open from 10 in the morning until 7 in the evening. Weekend: On weekends, the establishment is open from 10 in the morning until 10 at night. Early closing on Sunday: On Sundays, the establishment closes earlier than usual, at 9 o'clock in the evening. Please note that this information is based solely on the provided sentence, and the specific establishment or its type is not mentioned. User: Is the establishment open at 2130 on Sunday? ChatGPT: Based on the information provided in the sentence, the establishment has early closing on Sunday, with the closing time set at 9 o'clock in the evening. Therefore, the establishment would not be open at 21:30 (or 9:30 PM) on Sunday.

If you hang around with computerists long enough, they start talking about the Semantic Web. If you can represent human knowledge in a way that's easy for computers to understand it will be transformative for information processing. But computers, traditionally, haven't been very good at parsing ambiguous human text. Suppose you saw this text written for a human: Our opening hours are: Weekdays 10 until 7. Weekend 10 until 10 (Early closing 9 o'clock Sunday). Not the most straightforward…

Continue reading →

Adding restaurant review metadata to WordPress

@edent plugin schema.org WordPress · 4 comments · 350 words

Screenshot of a user interface which allows the entry of data.

I've started adding Restaurant Reviews to this blog - with delicious semantic metadata. Previously I'd been posting all my reviews to HappyCow. It's a great site for finding veggie-friendly food around the worlds, but I wanted to experiment more with the IndieWeb idea of POSSE. So now I can Post on my Own Site and Syndicate Elsewhere. The Schema.org representation of a Restaurant is pretty simple: "itemReviewed": { "@type": "Restaurant", "name": "Example Bistro", "address": "42 …

Continue reading →

Semantic Comments for WordPress

@edent HTML5 meta metadata schema.org semantic web WordPress · 2 comments · 850 words

As regular readers will know, I love adding Semantic things to my blog. The standard WordPress comments HTML isn't very semantic - so I thought I'd change that. Here's some code which you can add to your blog's theme - an an explanation of how it works. The aim is to end up with some HTML which looks like this (edited for brevity): <li itemscope itemtype="https://schema.org/Comment" itemid="#246827"> <article> <time…

Continue reading →

How to add ISSN metadata to a web page

@edent HTML5 metadata schema.org · 5 comments · 450 words · Viewed ~2,444 times.

creenshot of an email from the British Library. Dear Terence Eden INTERNATIONAL STANDARD SERIAL NUMBER (ISSN) Thank you for your recent enquiry, we have assigned ISSN to the following publication(s): Terence Eden’s blog ISSN 2753-1570 .

Inspired by John Hoare at the Dirty Feed blog - I've asked the British Library to assign my blog an International Standard Serial Number (ISSN). An ISSN is an 8-digit code used to identify newspapers, journals, magazines and periodicals of all kinds and on all media–print and electronic. Why? Shut up. OK. It turns out that lots of people cite my blog in academic papers - so I wanted to make it slightly easier for scholars of the future to use metadata to trace my vast influence on Human …

Continue reading →

Introducing DOI2HT.ML - the simple semantic citation server

@edent HTML5 NaBloPoMo schema.org semantic web · 4 comments · 250 words · Viewed ~342 times.

Academic citations are hard. One of the joys of the Digital Object Identifier System (DOI) is that every academic paper gets a unique reference - like: 10.34053/artivate.8.2.2. As well as always leading you to a URl of the paper, a DOI also provides lots of metadata. Things like author, publisher, ORCID, year of publication etc. I've built a simple website that turns any DOI into a semantic HTML reference - get started at DOI2HT.ML. Here's what it looks like: Pop a DOI in the box and hit…

Continue reading →

Adding Semantic Reviews / Rich Snippets to your WordPress Site

@edent metadata php schema.org WordPress · 1,200 words

This is a real "scratch my own itch" post. I want to add Schema.org semantic metadata to the book reviews I write on my blog. This will enable "rich snippets" in search engines. There are loads of WordPress plugins which do this. But where's the fun in that?! So here's how I quickly built it into my open source blog theme. Screen options First, let's add some screen options to the WordPress editor screen. This is what it will look like when done: This is how to add a custom metabox to…

Continue reading →

The future of the web, isn't the web

@edent data HTML schema.org web · 11 comments · 650 words · Viewed ~5,356 times.

A fist emerges from a computer screen and punches the user.

My friends, and former employers, at the Government Digital Service have written a spectacularly good blog post "Making GOV.UK more than a website". In it, they describe how adding Schema.org markup to their website has allowed search engines to extract semantic content and display it to a user. For example, the "Learn to drive" page has content which can appear directly in a search engine: Even better, if you ask Siri / Google / Alexa for something, it can give an answer from an…

Continue reading →

Add review to Goodreads from Schema markup

@edent api code goodreads hack python schema.org · 600 words

I write book reviews on my blog. I also want to syndicate them to Goodreads. Sadly, Goodreads doesn't natively read the Schema.org markup I so carefully craft. So here's the scrap of code I use to syndicate my reviews. Goodreads API Keys Get your Keys from https://www.goodreads.com/api/keys You will also need to get OAuth tokens For this documentation, I'll use the example keys - please substitute them with your own keys. from rauth.service import OAuth1Service, OAuth1Session # Get a…

Continue reading →