Updating all the examples in the HTML5 Spec


I'm currently helping to edit the HTML5 specification. As part of our preparations for HTML5.3 I've started going through the provided examples and improving them. This blog post explains the what, why, and when of the process. You can follow along on GitHub.

How is the Spec Written?

The spec is written using a mash-up of HTML and Markdown which is then run through Bikeshed to produce beautiful, pure, unsullied HTML.

There is a small problem with HTML... It's hard to display HTML in HTML. That is, if I want to talk about the <nav> element, I need to escape the HTML elements and write:

<pre>&lt;nav&gt;</pre>

That's just about readable for short snippets. But consider this genuine (although admittedly extreme) example:

&lt;pre&gt;&lt;code class=&quot;lang-c&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;for&lt;/span&gt; (&lt;span class=&quot;ident&quot;&gt;j&lt;/span&gt; = 0; &lt;span class=&quot;ident&quot;&gt;j&lt;/span&gt; &lt; 256; &lt;span class=&quot;ident&quot;&gt;j&lt;/span&gt;++) {
  &lt;span class=&quot;ident&quot;&gt;i_t3&lt;/span&gt; = (&lt;span class=&quot;ident&quot;&gt;i_t3&lt;/span&gt; &amp; 0x1ffff) | (&lt;span class=&quot;ident&quot;&gt;j&lt;/span&gt; &lt;&lt; 17);
  &lt;span class=&quot;ident&quot;&gt;i_t6&lt;/span&gt; = (((((((&lt;span class=&quot;ident&quot;&gt;i_t3&lt;/span&gt; &gt;&gt; 3) ^ &lt;span class=&quot;ident&quot;&gt;i_t3&lt;/span&gt;) &gt;&gt; 1) ^ &lt;span class=&quot;ident&quot;&gt;i_t3&lt;/span&gt;) &gt;&gt; 8) ^ &lt;span class=&quot;ident&quot;&gt;i_t3&lt;/span&gt;) &gt;&gt; 5) &amp; 0xff;
  &lt;span class=&quot;keyword&quot;&gt;if&lt;/span&gt; (&lt;span class=&quot;ident&quot;&gt;i_t6&lt;/span&gt; == &lt;span class=&quot;ident&quot;&gt;i_t1&lt;/span&gt;)
    &lt;span class=&quot;keyword&quot;&gt;break&lt;/span&gt;;
}&lt;/code&gt;&lt;/pre&gt;

BLEURGH! YUK! Also, pretty hard to maintain. I've found dozens of examples with errors in them, possibly because they're unreadable.

Can you quickly and intuitively spot the error in this example?

&lt;form&gt;&lt;div&gt;&lt;label&gt;Customer name:;lt;input&gt;&lt;/label&gt;&gt;/div&gt;&lt;/form&gt;

Luckily, there is a way we can write HTML without having to escape it. The <xmp> element!

Let's write some eXaMPles!

  <xmp highlight="html">
    <form>
      <div><label>Customer name: lt;input></label>>/div>
    </form>
  </xmp>

Wow! Suddenly easier to read. Makes it quicker to edit and to find mistakes.
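
If you'd rather not rely on eyeballs alone, the first mistake in the escaped example above can also be found mechanically. Here is a minimal Python sketch of that idea. It is an illustration rather than anything the spec tooling does, and the function name and the short list of entity names are my own assumptions.

```python
import re

# Entity names the escaped examples rely on. This short list is an assumption;
# extend it if your examples use other named entities.
ENTITY_NAMES = ("lt", "gt", "amp", "quot")

# Matches "lt;", "gt;" and friends when the name is NOT preceded by "&" or by
# a letter/digit, which usually means the ampersand was dropped or mistyped.
BROKEN_ENTITY = re.compile(r"(?<![&\w])(?:%s);" % "|".join(ENTITY_NAMES))


def find_broken_entities(escaped: str) -> list[tuple[int, str]]:
    """Return (offset, text) for each entity name missing its leading '&'."""
    return [(m.start(), m.group(0)) for m in BROKEN_ENTITY.finditer(escaped)]
```

This is only a heuristic. It flags the stray lt; in the form example, but it will also flag double-escaped text such as &amp;lt;, and it cannot catch a well-formed but wrong entity, such as &gt; where &lt; was meant.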

How to fix it.

A quick grep -rinc "pre highlight=\"html" | grep -v :0 through the spec showed around 600 examples. Ideally I'd run some magic/tragic one-line Unix command and everything would be fixed. Sadly, reality got in the way!
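
For anyone without grep to hand, the same count can be sketched in Python. This is a rough equivalent, not the command I actually ran. The function name is made up for illustration, and like grep -c it counts matching lines per file rather than individual occurrences.

```python
import os
import re

# Same search string as the grep command, matched case-insensitively (grep -i).
PATTERN = re.compile(r'pre highlight="html', re.IGNORECASE)


def count_matching_lines(root: str) -> dict[str, int]:
    """Map each file under root to its number of matching lines, skipping zeros."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    hits = sum(1 for line in handle if PATTERN.search(line))
            except OSError:
                continue  # unreadable file: skip it
            if hits:
                counts[path] = hits
    return counts
```

Calling sum(count_matching_lines(".").values()) from the root of a checkout gives the overall total.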

Some examples use unescaped markup to highlight specific parts of the examples. Some mix CSS and HTML. Some still use UPPERCASE element names. Some haven't been updated since the Stone Age. Some are needlessly verbose and encumbered with excess verbiage which makes it, inter alia, complex to process.

So, I've been going through each example individually: converting <pre> to <xmp> where possible, updating the examples, simplifying them where necessary, and generally giving them a good old tidy-up.
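
To give a flavour of what the "where possible" cases look like, here is a hedged Python sketch of the easy conversion. It is not how I did the work, which was by hand. It only handles blocks that are fully escaped and skips anything that looks risky. The function name is hypothetical, and the <pre highlight="html"> form is taken from the grep pattern above.

```python
import html
import re

# Matches <pre highlight="html">...</pre> blocks, including across newlines.
PRE_BLOCK = re.compile(
    r'<pre highlight="html">(.*?)</pre>', re.DOTALL | re.IGNORECASE
)


def pre_to_xmp(source: str) -> str:
    """Rewrite escaped <pre> examples as <xmp>, leaving risky blocks untouched."""

    def convert(match: re.Match) -> str:
        body = match.group(1)
        # Unescaped markup inside the block is probably deliberate
        # highlighting, so leave it for a human to deal with.
        if "<" in body:
            return match.group(0)
        plain = html.unescape(body)
        # <xmp> content ends at the first "</xmp", so a block containing
        # that text cannot be converted safely.
        if "</xmp" in plain.lower():
            return match.group(0)
        return '<xmp highlight="html">' + plain + "</xmp>"

    return PRE_BLOCK.sub(convert, source)
```

Anything the sketch declines to touch still needs a person to look at it, which is most of the interesting cases.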

I'm indebted to two specific Atom plugins:

You're my only hope!

I think I've done this right. But I'm not sure. I've gone through each example I've changed and compared it to the original - they look fine. Nevertheless, I'm certain I've made mistakes somewhere.

I'd be jolly grateful if you could cast your eye over this ridiculously large diff and point out where I've messed things up.

Seriously. Go here - https://github.com/w3c/html/pull/1199 - and get stuck in.

THANKS!


3 thoughts on “Updating all the examples in the HTML5 Spec”

  1. Thomas says:

    Long time fan of XMP for the reason you mention and for others (such as displaying the contents of a script tag for human consumption–read, “learning/teaching examples”…and then being able to use the same piece of code to run it as script–the ultimate DRY). I gave up long ago trying to convince folks to reconsider XMP–as long as browsers provide support, I’ll leave the politics to the politicians. But if you gain traction on your quest, you’ve got my support.

  2. Frederick Yocum says:

    Huhhh. Why haven’t I noticed the xmp element before. . . It seems incredibly useful for documentation purposes.
