php – Terence Eden’s Blog

PHP - simple way to send HTTP headers before a script ends

@edent — Mon, 25 May 2026 11:34:38 +0000

Suppose you want PHP to keep processing after it has sent back an HTTP response. Normally, this doesn't work:
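
A minimal sketch of the sort of script that shows the problem, assuming a redirect followed by some slow work. The function name is hypothetical.

```php
<?php
// Hypothetical example: queue a redirect, then keep working.
function redirect_then_work( string $url, int $seconds ): void {
    // Ask for the redirect header to be sent...
    header( "Location: " . $url );
    // ...then carry on with something slow.
    // The client typically sees nothing until the script finishes.
    sleep( $seconds );
}

// Example call:
// redirect_then_work( "https://example.com/", 10 );
```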



Try it yourself. You'll have to wait 10 seconds before you get back

< HTTP/2 302 
< location: https://example.com/


There are some complex ways to fix this - they usually involve spawning sub-processes or having a cron job run something. But there's a simpler way!

Most servers do some form of output buffering. They wait for the buffer to fill (or be explicitly terminated) before they send any content. My server was set to a buffer of 4,096 bytes. So I forced some dummy output to fill it up, then told PHP to flush the buffer:
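
A sketch of that approach, assuming a 4,096 byte buffer. Both helper names are hypothetical, and your server may need a different amount of padding.

```php
<?php
// Hypothetical helper: enough whitespace to fill the server's buffer.
function buffer_padding( int $bufferSize = 4096 ): string {
    return str_repeat( " ", $bufferSize );
}

// Hypothetical helper: send the redirect, force it out, then carry on.
function redirect_then_continue( string $url, callable $slowTask ): void {
    header( "Location: " . $url );
    // Dummy output to fill the buffer.
    echo buffer_padding();
    // Flush any of PHP's own output buffers, then the server's.
    while ( ob_get_level() > 0 ) {
        ob_end_flush();
    }
    flush();
    // The response should now be on its way; keep processing.
    $slowTask();
}
```

Calling redirect_then_continue( "https://example.com/", function() { sleep( 10 ); } ); should let the client see the redirect before the ten seconds are up.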



Some clients, like Python's Requests, wait until they've explicitly seen the end of the response before processing it.

But, for something like curl, the above is sufficient.



Vertically Aligning Roman Numerals in Code
@edent — Sun, 03 May 2026 11:34:59 +0000
I have a PHP function which uses Roman Numerals. It looks like this:

$romanNumerals = [
    "Ⅿ"  => 1000,
    "ⅭⅯ" => 900,
    "Ⅾ"  => 500,
    "ⅭⅮ" => 400,
    "Ⅽ"  => 100,
    "ⅩⅭ" =>  90,
    "Ⅼ"  =>  50,
    "ⅩⅬ" => 40,
    "Ⅹ"  => 10,
    "Ⅸ"  => 9,
    "Ⅷ" => 8,
    "Ⅶ"  => 7,
    "Ⅵ"  => 6,
    "Ⅴ"   => 5,
    "Ⅳ"  => 4,
    "Ⅲ"  => 3,
    "Ⅱ"  => 2,
    "Ⅰ"   => 1
];


The problem is, the operators don't line up and the whole thing looks messy. Why? Because the Unicode Roman Numerals are not monospaced! ⅭⅯ is a different width to ⅩⅭ and Ⅷ is only a single character! Copy the above to a text editor and see if you can get neat columns. I bet you can't!

I'm obsessed with vertically aligning my code. So how to solve this ugly problem?

The answer was simple. Assign keys to the values and then flip the array!

$romanNumerals = array_flip([
    1000 => "Ⅿ",
     900 => "ⅭⅯ",
     500 => "Ⅾ",
     400 => "ⅭⅮ",
     100 => "Ⅽ",
      90 => "ⅩⅭ",
      50 => "Ⅼ",
      40 => "ⅩⅬ",
      10 => "Ⅹ",
       9 => "Ⅸ",
       8 => "Ⅷ",
       7 => "Ⅶ",
       6 => "Ⅵ",
       5 => "Ⅴ",
       4 => "Ⅳ",
       3 => "Ⅲ",
       2 => "Ⅱ",
       1 => "Ⅰ"
]);


There! Doesn't that look much neater!
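
For context, here's a hedged sketch of how the flipped array might be consumed. The to_roman() helper is hypothetical and only for illustration. It relies on array_flip() keeping the original order, largest value first.

```php
<?php
$romanNumerals = array_flip([
    1000 => "Ⅿ",
     900 => "ⅭⅯ",
     500 => "Ⅾ",
     400 => "ⅭⅮ",
     100 => "Ⅽ",
      90 => "ⅩⅭ",
      50 => "Ⅼ",
      40 => "ⅩⅬ",
      10 => "Ⅹ",
       9 => "Ⅸ",
       8 => "Ⅷ",
       7 => "Ⅶ",
       6 => "Ⅵ",
       5 => "Ⅴ",
       4 => "Ⅳ",
       3 => "Ⅲ",
       2 => "Ⅱ",
       1 => "Ⅰ"
]);

// Hypothetical helper: greedily subtract the largest numeral that fits.
function to_roman( int $number, array $romanNumerals ): string {
    $result = "";
    foreach ( $romanNumerals as $numeral => $value ) {
        while ( $number >= $value ) {
            $result .= $numeral;
            $number -= $value;
        }
    }
    return $result;
}

echo to_roman( 1994, $romanNumerals );
```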

As was written long ago:

A computer language is not just a way of getting a computer to perform operations but rather … it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.



You can parse an .env file as an .ini with PHP - but there's a catch
@edent — Sat, 25 Apr 2026 11:34:15 +0000
The humble .env file is a useful and low-tech way of storing persistent environment variables. Drop the file on your server and let your PHP scripts consume it with glee.

But consume it how? There are lots of excellent parsing libraries for PHP. But isn't there a simpler way? Yes! You can use PHP's parse_ini_file() function and it works.

But…

.env and .ini have subtly different behaviour which might cause you to swear at your computer.

Let's take this example:

# This is a comment
USERNAME="edent"


Run $env = parse_ini_file( ".env" ); and you'll get back an array setting the USERNAME to be "edent". Hurrah! Works perfectly. Ship it!

But consider this:

# This is a comment
USERNAME="edent" # Don't use an @ symbol here.


It will happily tell you that the username is "edent# Don"

WTAF?

Here's the thing. The comment character for .ini is not # - it's the semicolon ;

Let me give you some other examples of things which will fuck up your parsing:

# Documentation at https:/example.com/?doc=123
DOCUMENTATION=123
# Set the password
PASSWORD=qwerty;789


That gets us back this PHP array:

[
  '# Documentation at https:/example.com/?doc' => '123',
  'DOCUMENTATION' => '123',
  'PASSWORD' => 'qwerty',
];


When the .ini is parsed, it ignores every line which doesn't have an = sign. It also treats literal semicolons as the start of a new comment unless they're wrapped in quotes.
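
A quick sketch of the semicolon behaviour using parse_ini_string(), which applies the same rules to a string rather than a file. The sample values are made up.

```php
<?php
// Unquoted: everything from the semicolon onwards is treated as a comment.
$unquoted = parse_ini_string( "PASSWORD=qwerty;789\n" );

// Quoted: the semicolon is kept as part of the value.
$quoted = parse_ini_string( "PASSWORD=\"qwerty;789\"\n" );

var_dump( $unquoted["PASSWORD"] );
var_dump( $quoted["PASSWORD"] );
```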

My code highlighter should show you how it is parsed:

# Documentation at https:/example.com/?doc=123
DOCUMENTATION=123
# Set the password
PASSWORD=qwerty;789


It gets worse. Consider this:

# Set the "official" name
REALNAME="Arthur, King of the Britons"


That immediately fails with PHP Warning:  syntax error, unexpected '"' in envtest on line 1

You can use single quotes in pseudo-comments just fine, but if the ini parser sees a double quote without an equals then it throws a wobbly.

I'm sure there are several other gotchas as well. For example, there are certain reserved words and symbols you can't use as a key.

This will fail:

# Can we fix it? Yes we can!
FIX=true


It chokes on the exclamation point.

How to solve it (the stupid way)

The comments on an .env file start with a hash.

The comments on an .ini file start with a semicolon.

So, it is perfectly valid for a hybrid file to have its comments start with #;

Look, if it's stupid but it works…
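
A sketch of the hybrid comment style in action, again via parse_ini_string(). Treat the exact behaviour as something to check on your own PHP version rather than a guarantee.

```php
<?php
// Each comment starts with #; so the .ini parser sees a semicolon comment.
$hybrid = <<<'ENV'
#; Set the "official" name
REALNAME="Arthur, King of the Britons"
#; Can we fix it? Yes we can!
FIX=true

ENV;

$env = parse_ini_string( $hybrid );
print_r( $env );
```

The double quotes and the exclamation point now sit inside .ini comments, so they should no longer upset the parser.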

What Have We Learned Here Today?


There's a right way and a wrong way to do .env parsing.
The wrong way works, up until the point it doesn't.
You should probably use a proper parser rather than hoping your .env looks enough like an .ini to pass muster.


On next week's show - why you shouldn't store your passwords inside a JPEG!



Some updates to ActivityBot
@edent — Mon, 16 Mar 2026 12:34:57 +0000
A couple of years ago, I developed ActivityBot - the simplest way to build Mastodon Bots. It is a single PHP file which can run an entire ActivityPub server and it is less than 80KB.

It works! You can follow @openbenches@bot.openbenches.org to see the latest entries on OpenBenches.org, and @colours@colours.bots.edent.tel for a slice of colour in your day, and @solar@solar.bots.edent.tel to see what my solar panels are up to.

This is so easy to use. Copy the PHP file (and a .env and .htaccess) to literally any web host running PHP 8.5 and you have a fully-fledged bot which can post to Mastodon.

Grab the code and start today!

Features

Over the years I've added a few more features to it, so I thought I'd run through what they are. Note, this is all hand-written. No sycophantic plagiarism machines were involved in this code or blog post. I just really like emoji, OK⁉️

🔍 Be discovered on the Fediverse

This is the big one, you can find @example@example.viii.fi on your favourite Fediverse client.  This is thanks to its WebFinger support.
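
For the curious, a rough sketch of what a WebFinger reply looks like. This isn't lifted from ActivityBot; the helper name and the actor URL layout are assumptions for illustration.

```php
<?php
// Hypothetical helper: build a WebFinger (RFC 7033) response for one account.
function webfinger_response( string $username, string $domain ): string {
    $data = [
        "subject" => "acct:{$username}@{$domain}",
        "links"   => [
            [
                "rel"  => "self",
                "type" => "application/activity+json",
                // Assumption: the actor document lives at this URL.
                "href" => "https://{$domain}/{$username}",
            ],
        ],
    ];
    return json_encode( $data, JSON_UNESCAPED_SLASHES );
}

echo webfinger_response( "example", "example.viii.fi" );
```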

👉 Be followed by other accounts

No point being discovered if you can't be followed. This accepts follow requests and sends back a signed accept.

🚫 Be unfollowed by accounts

Sometimes people want to unfollow. Too bad, so sad. Again, this will accept the undo request and delete the unfollowing user's information.

📩 Send messages to the Fediverse

If a bot can be followed, but never posts, does it make a sound? This sends a post to all of your followers' (shared) inboxes. Includes some HTML formatting.

💌 Send direct messages to users

Not every message is for the wider public. If you want a bot which sends you a private message, this'll set the visibility correctly.

📷 Attach images & alt text to a message 🆕🆕

A picture is worth a thousand words. But those pictures are meaningless without alt text. Attach as many images as you like. Note, most Mastodon services only accept a maximum of four.

🍿 Video Upload 🆕🆕

No transcoding or anything fancy. Upload a video and it'll be sent to your followers.

🔊 Audio Upload 🆕🆕

Same as video. Raw audio posted to your followers' feeds.

🕸️ Autolink URLs, hashtags, and @ mentions

URLs, hashtags, and mentions are mostly autolinked correctly. There's a lot of fuzziness in how it works.

🧵 Threads

You can reply to specific messages in order to create a thread.

👈 Follow, Unfollow, Block, and Unblock other accounts

It might be useful for you to remove followers or follow specific accounts.

🗑️ Delete posted messages and their attachments 🆕🆕

We all make mistakes. This will delete your post along with any attachments and send that delete message to everyone. Note, because of the federated nature of the Fediverse, you cannot guarantee that a remote server will delete anything.

✏️ Edit Posts 🆕🆕

If you don't want to delete and re-post, you can edit your existing posts.

🦋 Bridge to BlueSky with your domain name via Bridgy Fed

Not everyone is on the Fediverse. If you want to bridge to BlueSky, you can use the Bridgy Fed service.

🚚 Move followers from an old account and to a new account 🆕🆕

Perhaps you started as @electric@sex.pants but now you want to become @chaste@nunslife.biz - no worries! You can tell followers you've moved and what your new name is.

Similarly, if ActivityBot is no longer right for you, it's simple to tell your existing followers to move to your new account.

🗨️ Allow quote posts 🆕🆕

Rather than just reposting your message, this sets the quote policy to allow people to share your message and attach some commentary of their own.

👀 Show followers

Your follower count isn't just a number, it is a living list of who chooses to follow you.

⚠️ Content Warnings 🆕🆕

Perhaps you want to hide a bit of what you're saying. Add a content warning to hide part of your message.

🔏 Verify cryptographic signatures

HTTP Message Signatures is hard. I think I've mostly got it sorted.

🪵 Log sent messages and errors

This is primarily a learning aid, so have a rummage through the logs and see what's going on.

🚮 Clear logs when there are too many

ActivityPub is a chatty protocol. Your server can easily fill up with hundreds of thousands of messages from others. This regularly prunes down to something more manageable.

#️⃣ Hashed passwords for posting 🆕🆕

Bit of a guilty moment here. I was originally storing the password in plaintext. Naughty! Passwords are now salted and hashed.
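
A minimal sketch of salting and hashing with PHP's built-in functions. The helper name and the sample password are made up, and this isn't necessarily how ActivityBot stores things.

```php
<?php
// Store only the salted hash, never the plaintext password.
$hash = password_hash( "correct horse battery staple", PASSWORD_DEFAULT );

// Hypothetical check for when a request to post arrives.
function password_is_valid( string $submitted, string $storedHash ): bool {
    return password_verify( $submitted, $storedHash );
}

var_dump( password_is_valid( "correct horse battery staple", $hash ) );
```

password_hash() generates the salt itself and embeds it in the returned string, so there's nothing extra to store.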

💻 Basic website for showing posts

A nice-enough looking front end if people want to view the posts directly on your domain.

Some Deficiencies

Not every piece of software is perfect. ActivityBot is less perfect than most things. Here are some of the things it can't do and, perhaps, will never do.  If you'd like to help tackle any of these, fork the code from my git repo!

⏳ Retry Failed Messages

A proper Mastodon server will keep trying to send messages to unresponsive hosts. ActivityBot is one-and-done. If a remote server didn't respond in time, or was offline, or something else went wrong - it may not get the message.

🔄 Reposts / Announce / Quote

You cannot boost other posts, or even your own. Nor can you send quote posts.

🤖 Act On Instructions

This is a basic bot. It contains no logic. If you send it a message asking it to take action, it will not. You will need to build something else to make it truly interactive.

📥 Receive Messages

In fact, other than the follow / unfollow stuff, the bot can't receive any messages from the Fediverse. It doesn't know when a post has been replied to, liked, or reposted.

😎 Set Post Visibility

Your posts are either public or a DM. There's no support for things like quiet followers.

📊 Create Polls

Everyone loves to vote on meaningless polls - but this is quite a hard problem for ActivityBot. It would need to keep track of votes, prevent double voting, and probably some other difficult stuff.

🗨️ Change Quote Post Visibility

As quote posts are still quite new to Mastodon, I'm not sure how best to implement this.

🔗 Proper HTML / Markdown Support

Autolinking names, hashtags, and links just about works - but not very reliably. In theory the bot could parse Markdown and create richly formatted HTML from it. But that may require an external library which would bloat the size. Perhaps posting raw HTML could work?

🖼️ Focus Points for Images

Perhaps of less use now, but still of interest to people?

❓ Other Stuff

I don't know what I don't know. Maybe some stuff is totally broken? Maybe it is wildly out of spec? If you spot something dodgy, please let me know or raise a Pull Request.



A big list of things I disable in WordPress
@edent — Sun, 30 Nov 2025 12:34:23 +0000
There are many things I like about the WordPress blogging software, and many things I find irritating. The most annoying aspect is that WordPress insists that its way is the best and there shall be no deviance. That means a lot of forced cruft being injected into my site. Headers that bloat my page size, Gutenberg stuff I've no use for, and ridiculous editorial decisions.

To double-down on the annoyance, there's no simple way to turn them off. In part, that is due to the "WordPress Philosophy":

Decisions, not options

[…] Every time you give a user an option, you are asking them to make a decision. When a user doesn’t care or understand the option this ultimately leads to frustration.

I broadly agree with that. Having hundreds of options is a burden for users and a nightmare for maintainers. Do please read this excellent discussion from Tom McFarlin for a more detailed analysis.

But I want to turn things off. Luckily, there is a way. If you're a developer, you can remove a fair number of these "enforced" decisions. Add the following to your theme's functions.php file and watch the mandatory WordPress bloat wither away. I've commented each removal and, where possible, given a source for more information. Feel free to leave a comment suggesting how this script can be improved and simplified.

//  Remove mandatory classic theme.
function disable_classic_theme_styles() {
    wp_deregister_style( "classic-theme-styles" );
    wp_dequeue_style(    "classic-theme-styles" );
}
add_action( "wp_enqueue_scripts", "disable_classic_theme_styles" );

//  Remove WP Emoji.
//  http://www.denisbouquet.com/remove-wordpress-emoji-code/
remove_action( "wp_head",             "print_emoji_detection_script", 7 );
remove_action( "wp_print_styles",     "print_emoji_styles"              );
remove_action( "admin_print_scripts", "print_emoji_detection_script"    );
remove_action( "admin_print_styles",  "print_emoji_styles"              );
//  https://wordpress.org/support/topic/remove-the-new-dns-prefetch-code/
add_filter( "emoji_svg_url", "__return_false" );

//  Stop emoji replacement with images in RSS / Atom Feeds
//  https://danq.me/2023/09/04/wordpress-stop-emoji-images/
remove_filter( "the_content_feed", "wp_staticize_emoji" );
remove_filter( "comment_text_rss", "wp_staticize_emoji" );

//  Remove automatic formatting.
//  https://css-tricks.com/snippets/wordpress/disable-automatic-formatting/
remove_filter( "the_content",  "wptexturize" );
remove_filter( "the_excerpt",  "wptexturize" );
remove_filter( "comment_text", "wptexturize" );
remove_filter( "the_title",    "wptexturize" );

//  More formatting crap.
add_action("init", function() {
    remove_filter( "the_content", "convert_smilies", 20 );
    foreach ( array( "the_content", "the_title", "wp_title", "document_title" ) as $filter ) {
        remove_filter( $filter, "capital_P_dangit", 11 );
    }
    remove_filter( "comment_text", "capital_P_dangit", 31 );    //  No idea why this is separate
    remove_filter( "the_content",  "do_blocks", 9 );
}, 11);

//  Remove Gutenberg Styles.
//  https://wordpress.org/support/topic/how-to-disable-inline-styling-style-idglobal-styles-inline-css/
remove_action( "wp_enqueue_scripts", "wp_enqueue_global_styles" );

//  Remove Gutenberg editing widgets.
//  From https://wordpress.org/plugins/classic-widgets/
//  Disables the block editor from managing widgets in the Gutenberg plugin.
add_filter( "gutenberg_use_widgets_block_editor", "__return_false" );
//  Disables the block editor from managing widgets.
add_filter( "use_widgets_block_editor", "__return_false" );

//  Remove Gutenberg Block Library CSS from loading on the frontend.
//  https://smartwp.com/remove-gutenberg-css/
function remove_wp_block_library_css() {
    wp_dequeue_style( "wp-block-library"       );
    wp_dequeue_style( "wp-block-library-theme" );
    wp_dequeue_style( "wp-components"          );
}
add_action( "wp_enqueue_scripts", "remove_wp_block_library_css", 100 );

//  Remove hovercards on comment links in admin area.
//  https://wordpress.org/support/topic/how-to-disable-mshots-service/#post-12946617
add_filter( "akismet_enable_mshots", "__return_false" );

//  Remove Unused Plugin code.
function remove_plugin_css_js() {
    wp_dequeue_style( "image-sizes" );
}
add_action( "wp_enqueue_scripts", "remove_plugin_css_js", 100 );

//  Remove WordPress forced image size
//  https://core.trac.wordpress.org/ticket/62413#comment:40
add_filter( "wp_img_tag_add_auto_sizes", "__return_false" );

//  Remove <img> enhancements
//  https://developer.wordpress.org/reference/functions/wp_filter_content_tags/
remove_filter( "the_content",  "wp_filter_content_tags", 12 );

//  Stop rewriting http:// URLs for the main domain.
//  https://developer.wordpress.org/reference/hooks/wp_should_replace_insecure_home_url/
remove_filter( "the_content", "wp_replace_insecure_home_url", 10 );

//  Remove the attachment stuff
//  https://developer.wordpress.org/news/2024/01/building-dynamic-block-based-attachment-templates-in-themes/
remove_filter( "the_content", "prepend_attachment" );

//  Remove the block filter
remove_filter( "the_content", "apply_block_hooks_to_content_from_post_object", 8 );

//  Remove browser check from Admin dashboard.
//  https://core.trac.wordpress.org/attachment/ticket/27626/disable-wp-check-browser-version.0.2.php
if ( !empty( $_SERVER["HTTP_USER_AGENT"] ) ) {
    add_filter( "pre_site_transient_browser_" . md5( $_SERVER["HTTP_USER_AGENT"] ), "__return_null" );
}

//  Remove shortlink.
//  https://stackoverflow.com/questions/42444063/disable-wordpress-short-links
remove_action( "wp_head", "wp_shortlink_wp_head" );

//  Remove RSD.
//  https://wpengineer.com/1438/wordpress-header/
remove_action( "wp_head", "rsd_link" );

//  Remove extra feed links.
//  https://developer.wordpress.org/reference/functions/feed_links/
add_filter( "feed_links_show_comments_feed", "__return_false" );
add_filter( "feed_links_show_posts_feed",    "__return_false" );

//  Remove api.w.org link.
//  https://wordpress.stackexchange.com/questions/211467/remove-json-api-links-in-header-html
remove_action( "wp_head", "rest_output_link_wp_head" );
//  https://wordpress.stackexchange.com/questions/211817/how-to-remove-rest-api-link-in-http-headers
//  https://developer.wordpress.org/reference/functions/rest_output_link_header/
remove_action( "template_redirect", "rest_output_link_header", 11, 0 );


You can find the latest version of my debloat script in my theme's repo.

If there are other things you find helpful to remove, or a better way to organise this file, please drop a comment in the box.



A Self-Hosted Favicon Proxy written in PHP
@edent — Tue, 28 Oct 2025 12:34:54 +0000
In theory, you should be able to get the base favicon of any domain by calling /favicon.ico - but the reality is somewhat more complex than that. Plenty of sites use a wide variety of semi-standardised images which are usually only discoverable from the site's HTML.

There are several services which allow you to get favicons based on a domain. But they all have their problems.


Google: exposes your users to Google's tracking; relies on redirects.
DuckDuckGo: not officially supported by DDG.
Favicon.is: no privacy policy whatsoever.
Icons.horse: paid service; only small size icons.
Favicone: no privacy policy; only small size icons.



I want to show favicons next to specific links, but I don't want to expose my visitors to unnecessary tracking. How can I proxy these images so they are stored and served locally?

There are a few existing services. Some use Cloudflare workers or other cloud services, and there are some local-first ones which are unmaintained. But nothing modern, self-hosted, and as easy to deploy as uploading a single PHP file.

So here's my attempt to make something which will preserve user privacy and serve moderately up-to-date icons, while remaining fast and efficient.

Table of Contents
Getting the domain
Getting the image
Getting the structure right
Preventing abuse
Putting it all together


Getting the domain

Assuming the request comes in to https://proxy.example.com/?domain=bbc.co.uk

PHP has a handy FILTER_VALIDATE_DOMAIN filter which will determine if the string is a domain.

filter_var( $domain, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME );
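
As a slightly fuller sketch, assuming the domain arrives in the query string as above. The helper name is hypothetical.

```php
<?php
// Hypothetical helper: return the validated domain, or null if it isn't one.
function get_valid_domain( array $query ): ?string {
    $domain = $query["domain"] ?? "";
    $valid  = filter_var( $domain, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME );
    return ( $valid === false ) ? null : $valid;
}

$domain = get_valid_domain( $_GET );
```

filter_var() hands back the string when it passes and false when it doesn't, so the helper turns that into a simple "domain or null".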


Dealing with IDNs

Some domains contain non-ASCII characters - for example https://莎士比亚.org/ - not all favicon services support International Domain Names.

Using the idn_to_ascii() function, it is possible to get the Punycode domain.

$domain = idn_to_ascii("莎士比亚.org");


Getting the image


Check if the icon has previously been downloaded.
Rotate randomly between a few different Favicon services.
Download the icon.
Save it somewhere.


Getting the structure right

I know from my work on OpenBenches that storing tens of thousands of files in a single directory can be problematic. So I'll store the retrieved favicon in: /tld/domain/subdomain/

That will make it quick to see if an icon exists. I'll save the file with a filename based on the current timestamp. That will allow me to check if an icon is out of date, and will prevent people downloading the icons directly from me.
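
A rough sketch of that layout. Both helpers are hypothetical, and they assume the domain has already been validated so it can't contain slashes or other path trickery.

```php
<?php
// Hypothetical helper: "news.example.com" becomes "com/example/news".
function cache_directory( string $domain ): string {
    $labels = array_reverse( explode( ".", strtolower( $domain ) ) );
    return implode( "/", $labels );
}

// Hypothetical helper: is a file named after a Unix timestamp still fresh?
function is_fresh( string $filename, int $maxAgeSeconds, int $now ): bool {
    $savedAt = (int) pathinfo( $filename, PATHINFO_FILENAME );
    return ( $now - $savedAt ) <= $maxAgeSeconds;
}

echo cache_directory( "news.example.com" );
```

Passing the current time in as a parameter, rather than calling time() inside, keeps the freshness check easy to test.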

Preventing abuse

I don't want anyone but visitors to my site to be able to use this service. So I'll add a (weak) check to see if the request came from my domain.

//  The header may be missing entirely, so fall back to an empty string.
$referer = parse_url( $_SERVER["HTTP_REFERER"] ?? "", PHP_URL_HOST );
if ( $referer === "shkspr.mobi" ) {
   …
}


Some browsers may not send referers for privacy reasons. So they won't see the favicons. But they probably wouldn't have seen the images loaded from a 3rd-party service. So I'll serve a default image.

Putting it all together

You can grab the code from my personal git service.



Stop using preg_* on HTML and start using \Dom\HTMLDocument instead
@edent — Fri, 09 May 2025 11:34:56 +0000
It is a truth universally acknowledged that a programmer in possession of some HTML will eventually try to parse it with a regular expression.

This makes many people very angry and is widely regarded as a bad move.

In the bad old days, it was somewhat understandable for a PHP coder to run a quick-and-dirty preg_replace() on a scrap of code. They probably could control the input and there wasn't a great way to manipulate an HTML5 DOM.

Rejoice sinners! PHP 8.4 is here to save your wicked souls. There's a new HTML5 Parser which makes everything better and stops you having to write brittle regexen.

Here are a few tips - mostly notes to myself - which I hope you'll find useful.

Sanitise HTML

This is the most basic example. This loads HTML into a DOM, tries to fix all the mistakes it finds, and then spits out the result.

$html = 'Hi
Test
';
$dom = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED , "UTF-8" );
echo $dom->saveHTML();


It uses LIBXML_HTML_NOIMPLIED because we don't want a full HTML document with a doctype, head, body, etc.

If you want Pretty Printing, you can use my library.

Get the plain text

OK, so you've got the DOM, how do you get the text of the body without any of the surrounding HTML?

$html = 'Hello World!';
$dom = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR , "UTF-8" );
echo $dom->body->textContent;


Note, this doesn't replace images with their alt text.
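
If you do want the alt text, here's one possible sketch. It assumes PHP 8.4 and swaps each image for its alt text before reading the body. The sample HTML is made up.

```php
<?php
$html = '<p>Hello <img src="world.png" alt="World">!</p>';
$dom  = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR, "UTF-8" );

//  Copy the matches into an array so the loop isn't disturbed by the changes.
$images = iterator_to_array( $dom->querySelectorAll( "img[alt]" ) );
foreach ( $images as $img ) {
    //  Swap the <img> for a text node holding its alt text.
    $img->replaceWith( $dom->createTextNode( $img->getAttribute( "alt" ) ) );
}

echo $dom->body->textContent;
```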

Get a single element

You can use the same querySelector() function as you do in JavaScript!

$element = $dom->querySelector( "h2" );


That returns a pointer to the element. Which means you can run:

$element->setAttribute( "id", "interesting" );
echo $dom->querySelector( "h2" )->attributes["id"]->value;


And you will see that the DOM has been manipulated!

Search for multiple elements

Suppose you have a bunch of headings and you want to get all of them. You can use the same querySelectorAll() function as you do in JavaScript!

To get all headings, in the order they appear:

$headings = $dom->querySelectorAll( "h1, h2, h3, h4, h5, h6" );
foreach ( $headings as $heading ) {
   // Do something
}


Advanced Search

Suppose you have a bunch of links and you want to find only those which point to "example.com/test/". Again, you can use the same attribute selectors as you would elsewhere:

$dom->querySelectorAll( "a[href^=https\:\/\/example\.com\/test\/]" );


Replacing content

Sadly, it isn't quite as simple as setting the innerHTML. Each search returns a node. That node may have children. Those children will also be nodes which, themselves, may have children, and so on.

Let's take a simple example:

$html = '<h2>Hello</h2>';
$dom = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );
$element = $dom->querySelector( "h2" );
$element->childNodes[0]->textContent = "Goodbye";
echo $dom->saveHTML();


That changes "Hello" to "Goodbye".

But what if the element has child nodes?

$html = '<h2>Hello <em>friend</em></h2>';
$dom = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );
$element = $dom->querySelector( "h2" );
$element->childNodes[0]->textContent = "Goodbye";
echo $dom->saveHTML();


That outputs <h2>Goodbye<em>friend</em></h2> - so think carefully about the structure of the DOM and what you want to replace.

Adding a new node

This one is tricky! Let's suppose you have this:

<div id="page">
   <main>
      <h2>Hello</h2>
   </main>
</div>


You want to add an <h1> before the <h2>. Here's how to do this.

First, you need to construct the DOM:

$html = '<div id="page"><main><h2>Hello</h2></main></div>';
$dom = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );


Next, you need to construct an entirely new DOM for your new node.

$newHTML = "<h1>Title</h1>";
$newDom = \Dom\HTMLDocument::createFromString( $newHTML, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );


Next, extract the new element from the new DOM, and import it into the original DOM:

$element = $dom->importNode( $newDom->firstChild, true ); 


The element now needs to be inserted somewhere in the original DOM. In this case, get the h2, tell its parent node to insert the new node before the h2:

$h2 = $dom->querySelector( "h2" );
$h2->parentNode->insertBefore( $element, $h2 );
echo $dom->saveHTML();


Out pops:

<div id="page">
   <main>
      <h1>Title</h1>
      <h2>Hello</h2>
   </main>
</div>


An alternative is to use the appendChild() method. Note that it appends it to the end of the children. For example:

$div = $dom->querySelector( "#page" );
$div->appendChild( $element );
echo $dom->saveHTML();


Produces:

<div id="page">
   <main>
      <h2>Hello</h2>
   </main>
   <h1>Title</h1>
</div>


And more?

I've only scratched the surface of what the new 8.4 HTML Parser can do. I've already rewritten lots of my yucky old preg_ code to something which (hopefully) is less likely to break in catastrophic ways.

If you have any other tips, please leave a comment.



Using Tempest Highlight with WordPress
@edent — Sat, 26 Apr 2025 11:34:19 +0000
I like to highlight bits of code on my blog. I was using GeSHi - but it has ceased to receive updates and the colours it uses aren't WCAG compliant.

After skimming through a few options, I found Tempest Highlight. It has nearly everything I want in a code highlighter:


     PHP with no 3rd party dependencies.
     Lots of common languages.
     Modern, with regular updates.
     Easy to use functions.
     Range of different style sheets.


But, on the downside:


     No WordPress plugin.
     Not all languages supported.
     CSS embedded in HTML.


I can live without some esoteric languages, but I don't really want to run composer install on my blog. I just want a quick WordPress plugin.  So, here's how I did it.

Table of Contents
Here Be Dragons
The Art of Loading without Loading
Testing
Draw The Rest of the Owl
ToDo
Get the code


Here Be Dragons

This is a quick prototype. It has an audience of one; me. It may break in unexpected ways. Use at your own risk.

The file layout is relatively simple:

WordPress Plugins
├── Highlight_Plugin
│   ├── src/
│   ├── autoload.php
│   ├── index.php
│   └── base.css


The src/ directory contains the src/ directory from Tempest Highlight.

The Art of Loading without Loading

Normally, to install a PHP package, the composer app creates an autoloader which will magically import everything you need into your project.  We can't do that here. Instead, we need to manually load the library.

Create a file in the plugin's directory called autoload.php - its job is to autoload everything in the src/ directory.
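
Something along these lines should do the job. It's a sketch which assumes the library follows PSR-4, with the Tempest\Highlight namespace mapped straight onto src/. The helper name is hypothetical.

```php
<?php
// Hypothetical helper: turn a class name into a path inside src/.
function tempest_class_to_path( string $class, string $baseDir ): ?string {
    $prefix = "Tempest\\Highlight\\";
    // Ignore classes from any other namespace.
    if ( strncmp( $class, $prefix, strlen( $prefix ) ) !== 0 ) {
        return null;
    }
    $relative = substr( $class, strlen( $prefix ) );
    return $baseDir . "/" . str_replace( "\\", "/", $relative ) . ".php";
}

// Load each class from src/ the first time it is used.
spl_autoload_register( function ( string $class ): void {
    $path = tempest_class_to_path( $class, __DIR__ . "/src" );
    if ( $path !== null && is_file( $path ) ) {
        require_once $path;
    }
} );
```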



I don't know if that's the easiest way to do it. But it works!

Testing

The index.php file can now be tested:

//  Load the Tempest Highlight library
require_once __DIR__ . "/autoload.php";

//  Set up the namespace
use Tempest\Highlight\Highlighter;

//  Define the theme.
$theme = new Tempest\Highlight\Themes\InlineTheme( __DIR__ . "/src/Themes/Css/light-plus.css");

//  Create the highlighter.
$highlighter = new Tempest\Highlight\Highlighter( $theme );

//  Print some formatted HTML
echo $highlighter->parse( "<em id='foo' class='bar'>test</em>", "html" );


All being well, that should produce this:

<em id='foo' class='bar'>test</em>


That has the CSS embedded. Not ideal, but certainly good enough.  I picked "light-plus" because it was the only theme which seemed to meet at least WCAG AA when on a white background.

OK, so how do we go from printing out a scrap of HTML to extracting all the code snippets from a WordPress blog?

Draw The Rest of the Owl

In theory the code is relatively straightforward.

Find code snippets

My Markdown plugin transforms this:

 ```javascript
 var a = 2.0;
 ``` 


Into this:

<pre><code class="language-javascript">var a = 2.0;
</code></pre>



No need to use a regex, the new PHP 8.4 HTMLDocument gives us direct programmatic access to the HTML.

//  Load the content into PHP 8.4's HTML DOM.
$dom = Dom\HTMLDocument::createFromString( $content, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );

//  Select the code snippets.
//  e.g. <pre><code class="language-javascript">
$codeSnippets = $dom->querySelectorAll( "pre>code[class^=language-]" );


Replace the snippets

From the above, I have the language and code, so it can "easily" be replaced.

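//  Iterate through each snippet.
foreach ( $codeSnippets as $code ) {
    //  Assumption: the class attribute is exactly "language-xxx".
    $language = str_replace( "language-", "", $code->getAttribute( "class" ) );
    //  Get the text from within the <code>.
    $originalCode = $code->textContent;
    //  Replace the contents of <code> with the highlighted HTML.
    $code->innerHTML = $highlighter->parse( $originalCode, $language );
}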


Replacing the code in that node manipulates the original DOM.  Which means, after looping through all the snippets, I can return the altered HTML like so:

return $dom->saveHTML();


And then…

Obviously, there's a bit more to it than that. It ignores RSS feeds, it adds a base CSS style to the head, some SVGs get embedded, semantic metadata is included, and it all gets a bit tangled and complicated.

ToDo

A few things need to happen to make this even better.


Encode comments as well as posts.
Add new languages.
Don't in-line the CSS into the HTML, but add it as a separate stylesheet.


But, for now, it is running on my blog and that's good enough for me!

Get the code

You can play about with the WordPress plugin. Bug reports, pull requests, and suggestions all warmly welcomed.



A small PHP update to GeSHi
@edent — Wed, 23 Apr 2025 11:34:53 +0000
The faithful old GeSHi Syntax Highlighter hasn't seen an update in many a long year. It's a tried and trusted way to do server-side code highlighting - turning a myriad of programming languages into beautiful HTML & CSS.

A few weeks ago, I noticed someone had proposed an update to its HTML rendering. The changes were mostly adding in new element names.

PHP has been updated several times since GeSHi was last updated, so I thought I'd do the same. Here's an update to the PHP highlighter.

Getting all the current PHP functions was fairly simple:

$functions = get_defined_functions();
$builtInFunctions = $functions['internal'];
sort($builtInFunctions);
foreach ( $builtInFunctions as $key => $value ) {
   echo "'{$value}', "; 
}


Now I'm wondering if there's a better code highlighter.  Here's what I'm looking for:


Server-side. I don't want to clutter the web with JavaScript.
PHP only. I don't want to add something more complicated to my tech stack.
WordPress for preference (but not blocks-only). Although I can build around a library.
Accessible colours. GeSHi's style-sheet doesn't always meet WCAG.
Actively maintained. If it hasn't been updated in 2 years, it's probably broken.
Somewhat hackable. I like to add a bit of semantic fluff around the output.


Any thoughts?



Introducing Pretty Print HTML for PHP 8.4
@edent — Sat, 19 Apr 2025 11:34:54 +0000
I'm delighted to announce the first release of my opinionated HTML Pretty Printer for new versions of PHP.


Grab the code from Packagist
Contribute on GitLab


There are several prettifiers on Packagist, but I think mine is the only one which works with the new Dom\HTMLDocument class.

Table of Contents
What
How
Limitations
Why
Next Steps


What

This takes hard-to-read HTML like:

Title
How exciting!


And pretty-prints it with some opinionated formatting:



    
        
    
    
        
            Title
            How exciting!
        
    



All elements are indented where possible. Attributes are sorted alphabetically. Attribute values are unquoted if possible. CSS and JS are unaltered. These options are configurable.

To get an idea of what it outputs, take a look at the source code of this page!

How

This is designed to be simple to use, but with enough options to be useful to as many people as possible.

//  HTML as a string:
$html = "This is  an example";
//  Or as a file:
$html = file_get_contents( "example.html" );

//  Turn the HTML into a Dom\HTMLDocument
$dom = \Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR, "UTF-8" );

//  Create the pretty printer
$formatter = new Edent\PrettyPrintHtml\PrettyPrintHtml();

//  Output the result
echo $formatter->serializeHtml( $dom );


Limitations

Whitespace is hard. There are many different types. Sometimes it is for display, sometimes it isn't. Adding extra newlines and tabs almost certainly will cause layout changes somewhere on your page.

You can either change your CSS to minimise this, add elements to the preserveElements list to stop them being altered, or re-write your original HTML.  The choice is yours.

Why

As was written long ago:

A computer language is not just a way of getting a computer to perform operations but rather … it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.

PHP's new Dom\HTMLDocument class produces syntactically valid HTML code. The code is very easy for a computer to parse. But because there is no indenting, the code is difficult for a human to parse.

Adding newlines and indents before every new element can introduce spacing errors when the HTML is rendered to screen. Some of these can be fixed with extra CSS, some cannot.

This pretty-printer attempts to make code readable for humans by striking a balance between legibility when rendered on screen or viewed as source code.

Why is human readability so important?

As Ana Rodrigues said:

Today's heavily optimized websites have largely killed the "view source" learning experience. The code is minified, bundled, and often incomprehensible to beginners trying to understand how things work. […] I want anyone, regardless of skill level, to inspect elements, understand the structure, and learn from readable code.

Using this pretty printer should give you and your users an excellent "view source" experience, without sacrificing the browser's ability to render the code.

Next Steps

I'm sure there are many bugs and oddities. I'd love you to report any problems on GitLab. Feel free to contribute test-cases and code.



An opinionated HTML Serializer for PHP 8.4
@edent — Wed, 02 Apr 2025 11:34:36 +0000
A few days ago, I wrote a shitty pretty-printer for PHP 8.4's new Dom\HTMLDocument class.

I've since re-written it to be faster and more stylistically correct.

It turns this:

TestTesting
Some HTML and an 
Text not in an elementList
Another list


Into this:



    
        Test
    
    
        Testing
        
            
                Some 
                HTML
                 and an 
                
            
            Text not in an element
            
                List
                Another list
            
        
    



I say it is "opinionated" because it does the following:


Attributes are unquoted unless necessary.
Every element is logically indented.
Text content of CSS and JS is unaltered. No pretty-printing, minification, or checking for correctness.
Text content of elements may have extra newlines and tabs. Browsers will tend to ignore multiple whitespaces unless the CSS tells them otherwise.


This fucks up  blocks which contain markup.



It is primarily designed to make the markup easy to read. Because according to the experts:

A computer language is not just a way of getting a computer to perform operations but rather … it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.

I'm fairly sure this all works properly. But feel free to argue in the comments or send me a pull request.

Here's how it works.

When is an element not an element? When it is a void!

Modern HTML has the concept of "Void Elements". Normally, something like  must eventually be followed by a closing .  But Void Elements don't need closing.

This keeps a list of elements which must not be explicitly closed.

$void_elements = [
    "area",
    "base",
    "br",
    "col",
    "embed",
    "hr",
    "img",
    "input",
    "link",
    "meta",
    "param",
    "source",
    "track",
    "wbr",
];
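
As a rough illustration of how a list like this gets used, here is a tiny sketch. The closing_tag() helper is hypothetical and exists only for this example, and the list is shortened to keep things brief.

```php
<?php
//  Shortened copy of the void element list, for illustration.
$void_elements = [ "br", "hr", "img", "input", "meta" ];

//  Hypothetical helper: returns the closing tag for an element,
//  or an empty string when the element is void.
function closing_tag( string $name, array $void_elements ): string
{
    return in_array( $name, $void_elements, true ) ? "" : "</{$name}>";
}

echo closing_tag( "p",  $void_elements ) . "\n";  //  </p>
echo closing_tag( "br", $void_elements ) . "\n";  //  (nothing but the newline)
```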


Tabs 🆚 Spaces

Tabs, obviously. Users can set their tab width to their personal preference and it won't get confused with semantically significant whitespace.

$indent_character = "\t";


Setting up the DOM

The new HTMLDocument should be broadly familiar to anyone who has used the previous one.

$html = 'TestTesting
Some HTML and an 
Text not in an elementList
Another list>';
$dom = Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR, "UTF-8" );


This automatically adds <head> and <body> elements. If you don't want that, use the LIBXML_HTML_NOIMPLIED flag:

$dom = Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );


To Quote or Not To Quote?

Traditionally, HTML attributes needed quotes:




Modern HTML allows those attributes to be unquoted as long as they don't contain ASCII Whitespace or certain other characters.

For example, the above becomes:




This function looks for the presence of those characters:

function value_unquoted( $haystack )
{
    //  Must not be null or empty
    if ( $haystack == null ) { return false; }

    //  Must not contain specific characters
    $needles = [ 
        //  https://infra.spec.whatwg.org/#ascii-whitespace
        "\t", "\n", "\f", "\r", " ", 
        //  https://html.spec.whatwg.org/multipage/syntax.html#unquoted 
        "\"", "'", "=", "<", ">", "`" ];
    foreach ( $needles as $needle )
    {
        if ( str_contains( $haystack, $needle ) )
        {
            return false;
        }
    }
    return true;
}
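
The same test can be squeezed into a single regular expression. This is only a sketch: the name can_be_unquoted() is made up for illustration, and it assumes the character list from the two spec links above (ASCII whitespace, both quote marks, =, <, > and the backtick).

```php
<?php
//  Hypothetical compact variant of the check above.
//  Returns true when an attribute value may be written without quotes.
function can_be_unquoted( ?string $value ): bool
{
    //  Empty or missing values must keep their quotes.
    if ( $value === null || $value === "" ) { return false; }

    //  ASCII whitespace, quotes, =, <, > and ` all force quoting.
    return preg_match( '/[\t\n\f\r "\'=<>`]/', $value ) === 0;
}

var_dump( can_be_unquoted( "logo.png" ) );   //  bool(true)
var_dump( can_be_unquoted( "two words" ) );  //  bool(false)
var_dump( can_be_unquoted( "a=b" ) );        //  bool(false)
```

Whether a loop or a regex is easier to read is a matter of taste. The loop makes it simpler to comment each group of characters.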


Re-re-re-recursion

I've tried to document this as best I can.

It traverses the DOM tree, printing out correctly indented opening elements and their attributes. If there's text content, that's printed. If an element needs closing, that's printed with the appropriate indentation.

function serializeHTML( $node, $treeIndex = 0, $output = "")
{
    global $indent_character, $preserve_internal_whitespace, $void_elements;

    //  Manually add the doctype to start.
    if ( $output == "" ) {
        $output .= "<!doctype html>\n";
    }

    if( property_exists( $node, "localName" ) ) {
        //  This is an Element.

        //  Get all the Attributes (id, class, src, &c.).
        $attributes = "";
        if ( property_exists($node, "attributes")) {
            foreach( $node->attributes as $attribute ) {
                $value = $attribute->nodeValue;
                //  Only add " if the value contains specific characters.
                $quote = value_unquoted( $value ) ? "" : "\"";

                $attributes .= " {$attribute->nodeName}={$quote}{$value}{$quote}";
            }
        }

        //  Print the opening element and all attributes.
        $output .= "<{$node->localName}{$attributes}>";

    } else if( property_exists( $node, "nodeName" ) &&  $node->nodeName == "#comment" ) {
        //  Comment
        $output .= "<!--{$node->textContent}-->";
    }

    //  Increase indent.
    $treeIndex++;
    $tabStart = "\n" . str_repeat( $indent_character, $treeIndex ); 
    $tabEnd   = "\n" . str_repeat( $indent_character, $treeIndex - 1);

    //  Does this node have children?
    if( property_exists( $node, "childElementCount" ) && $node->childElementCount > 0 ) {

        //  Loop through the children.
        $i=0;
        while( $childNode = $node->childNodes->item( $i++ ) ) {

            //  Is this a text node?
            if ($childNode->nodeType == 3 ) {
                //  Only print output if there's no HTML inside the content.
                //  Ignore Void Elements.
                if ( 
                      !str_contains( $childNode->textContent, "<" ) && 
                    property_exists( $childNode, "localName" ) && 
                          !in_array( $childNode->localName, $void_elements ) ) 
                {
                    $output .= $tabStart . $childNode->textContent;
                }
            } else {
                $output .= $tabStart;
            }

            //  Recursively indent all children.
            $output = serializeHTML( $childNode, $treeIndex, $output );
        };

        //  Suffix with a "\n" and a suitable number of "\t"s.
        $output .= "{$tabEnd}"; 

    } else if ( property_exists( $node, "childElementCount" ) && property_exists( $node, "innerHTML" ) ) {
        //  If there are no children and the node contains content, print the contents.
        $output .= $node->innerHTML;
    }

    //  Close the element, unless it is a void.
    if( property_exists( $node, "localName" ) && !in_array( $node->localName, $void_elements ) ) {
        $output .= "</{$node->localName}>";
    }

    //  Return a string of fully indented HTML.
    return $output;
}


Print it out

The serialized string hardcodes the doctype - which is probably fine.  The full HTML is shown with:

echo serializeHTML( $dom->documentElement );


Next Steps

Please raise any issues on GitLab or leave a comment.



Pretty Print HTML using PHP 8.4's new HTML DOM
@edent — Mon, 31 Mar 2025 11:34:54 +0000
Those whom the gods would send mad, they first teach recursion.

PHP 8.4 introduces a new Dom\HTMLDocument class. It is a modern HTML5 replacement for the ageing XHTML based DOMDocument.  You can read more about how it works - the short version is that it reads and correctly sanitises HTML and turns it into a nested object. Hurrah!

The one thing it doesn't do is pretty-printing.  When you call $dom->saveHTML() it will output something like:

TestTesting
Some HTML and an 
List
Another list


Perfect for a computer to read, but slightly tricky for humans.

As was written by the sages:

A computer language is not just a way of getting a computer to perform operations but rather … it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.

HTML is a programming language. Making markup easy to read for humans is a fine and noble goal.  The aim is to turn the single line above into something like:


    
        Test
    
    
        Testing
        
            Some HTML and an 
            
                List
                Another list
            
        
    



Cor! That's much better!

I've cobbled together a script which is broadly accurate. There are a million-and-one edge cases and about twice as many personal preferences. This aims to be quick, simple, and basically fine. I am indebted to this random Chinese script and to html-pretty-min.

Step By Step

I'm going to walk through how everything works. This is as much for my benefit as for yours! This is beta code. It sorta-kinda-works for me. Think of it as a first pass at an attempt to prove that something can be done. Please don't use it in production!

Setting up the DOM

The new HTMLDocument should be broadly familiar to anyone who has used the previous one.

$html = 'TestTestingSome HTML and an ListAnother list';
$dom = Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR, "UTF-8" );


This automatically adds <head> and <body> elements. If you don't want that, use the LIBXML_HTML_NOIMPLIED flag:

$dom = Dom\HTMLDocument::createFromString( $html, LIBXML_NOERROR | LIBXML_HTML_NOIMPLIED, "UTF-8" );


Where not to indent

There are certain elements whose contents shouldn't be pretty-printed because it might change the meaning or layout of the text. For example, in a paragraph:


    Some 
    
        HT
        M
        L
    



I've picked these elements from text-level semantics and a few others which I consider sensible. Feel free to edit this list if you want.

$preserve_internal_whitespace = [
    "a", 
    "em", "strong", "small", 
    "s", "cite", "q", 
    "dfn", "abbr", 
    "ruby", "rt", "rp", 
    "data", "time", 
    "pre", "code", "var", "samp", "kbd", 
    "sub", "sup", 
    "b", "i", "mark", "u",
    "bdi", "bdo", 
    "span",
    "h1", "h2", "h3", "h4", "h5", "h6",
    "p",
    "li",
    "button", "form", "input", "label", "select", "textarea",
];


The function has an option to force indenting every time it encounters an element.

Tabs 🆚 Spaces

Tabs, obviously. Users can set their tab width to their personal preference and it won't get confused with semantically significant whitespace.

$indent_character = "\t";


Recursive Function

This function reads through each node in the HTML tree. If the node should be indented, the function inserts a new node with the requisite number of tabs before the existing node. It also adds a suffix node to indent the next line appropriately. It then goes through the node's children and recursively repeats the process.

This modifies the existing Document.

function prettyPrintHTML( $node, $treeIndex = 0, $forceWhitespace = false )
{    
    global $indent_character, $preserve_internal_whitespace;

    //  If this node contains content which shouldn't be separately indented
    //  And if whitespace is not forced
    if ( property_exists( $node, "localName" ) && in_array( $node->localName, $preserve_internal_whitespace ) && !$forceWhitespace ) {
        return;
    }

    //  Does this node have children?
    if( property_exists( $node, "childElementCount" ) && $node->childElementCount > 0 ) {
        //  Move in a step
        $treeIndex++;
        $tabStart = "\n" . str_repeat( $indent_character, $treeIndex ); 
        $tabEnd   = "\n" . str_repeat( $indent_character, $treeIndex - 1);

        //  Remove any existing indenting at the start of the line
        $node->innerHTML = trim($node->innerHTML);

        //  Loop through the children
        $i=0;

        while( $childNode = $node->childNodes->item( $i++ ) ) {
            //  Was the *previous* sibling a text-only node?
            //  If so, don't add a previous newline
            if ( $i > 0 ) {
                $olderSibling = $node->childNodes->item( $i-1 );

                if ( $olderSibling->nodeType == XML_TEXT_NODE  && !$forceWhitespace ) {
                    $i++;
                    continue;
                }
                $node->insertBefore( $node->ownerDocument->createTextNode( $tabStart ), $childNode );
            }
            $i++; 
            //  Recursively indent all children
            prettyPrintHTML( $childNode, $treeIndex, $forceWhitespace );
        };

        //  Suffix with a node which has "\n" and a suitable number of "\t"
        $node->appendChild( $node->ownerDocument->createTextNode( $tabEnd ) ); 
    }
}


Printing it out

First, call the function.  This modifies the existing Document.

prettyPrintHTML( $dom->documentElement );


Then call the normal saveHTML() serialiser:

echo $dom->saveHTML();


Note - this does not print a <!doctype html> - you'll need to include that manually if you're intending to use the entire document.

Licence

I consider the above too trivial to licence - but you may treat it as MIT if that makes you happy.

Thoughts? Comments? Next steps?

I've not written any formal tests, nor have I measured its speed. There may be subtle bugs and catastrophic errors. I know it doesn't work well if the HTML is already indented. It mysteriously prints double newlines for some unfathomable reason.

I'd love to know if you find this useful. Please get involved on GitLab or drop a comment here.



Create a Table of Contents based on HTML Heading Elements
@edent — Wed, 26 Mar 2025 12:34:31 +0000
Some of my blog posts are long⁰. They have lots of HTML headings like <h2> and <h3>. Say, wouldn't it be super-awesome to have something magically generate a Table of Contents?  I've built a utility which runs server-side using PHP. Give it some HTML and it will construct a Table of Contents.

Let's dive in!

Table of Contents
Background
   Heading Example
   What is the purpose of a table of contents?
Code
   Load the HTML
      Using PHP 8.4
   Parse the HTML
      PHP 8.4 querySelectorAll
   Recursive looping
      Missing content
   Converting to HTML
Semantic Correctness
   ePub Example
   Split the difference with a menu
   Where should the heading go?
Conclusion


Background

HTML has six levels of headings¹ - <h1> is the main heading for content, <h2> is a sub-heading, <h3> is a sub-sub-heading, and so on.

Together, they form a hierarchy.

Heading Example

HTML headings are expected to be used a bit like this (I've nested this example so you can see the hierarchy):

The Theory of Everything
   Experiments
      First attempt
      Second attempt
   Equipment
      Broken equipment
         Repaired equipment
      Working Equipment
…


What is the purpose of a table of contents?

Wayfinding. On a long document, it is useful to be able to see an overview of the contents and then immediately navigate to the desired location.

The ToC has to provide a hierarchical view of all the headings and then link to them.

Code

I'm running this as part of a WordPress plugin. You may need to adapt it for your own use.

Load the HTML

This uses PHP's DOMDocument. I've manually added a UTF-8 header so that Unicode is preserved. If your HTML already has that, you can remove the addition from the code.

//  Load it into a DOM for manipulation
$dom = new DOMDocument();
//  Suppress warnings about HTML errors
libxml_use_internal_errors( true );
//  Force UTF-8 support by prepending an encoding declaration
$dom->loadHTML( '<?xml encoding="UTF-8">' . $content, LIBXML_NOERROR | LIBXML_NOWARNING );
libxml_clear_errors();


Using PHP 8.4

The latest version of PHP contains a better HTML-aware DOM. It can be used like this:

$dom = Dom\HTMLDocument::createFromString( $content, LIBXML_NOERROR , "UTF-8" );


Parse the HTML

It is not a good idea to use Regular Expressions to parse HTML - no matter how well-formed you think it is. Instead, use XPath to extract data from the DOM.

//  Parse with XPath
$xpath = new DOMXPath( $dom );

//  Look for all h* elements
$headings = $xpath->query( "//h1 | //h2 | //h3 | //h4 | //h5 | //h6" );


This produces a list (a DOMNodeList, which can be looped over like an array) with all the heading elements in the order they appear in the document.
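
Here is a small, self-contained sketch of that query in action. The sample $content string is invented for the example. The rest uses the same DOMDocument and DOMXPath calls as the snippets above.

```php
<?php
//  Hypothetical sample content; any HTML fragment will do.
$content = '<h1 id="top">Title</h1><p>Text</p>'
         . '<h2 id="intro">Intro</h2><h3 id="more">More</h3>';

$dom = new DOMDocument();
libxml_use_internal_errors( true );
$dom->loadHTML( $content, LIBXML_NOERROR | LIBXML_NOWARNING );
libxml_clear_errors();

$xpath    = new DOMXPath( $dom );
$headings = $xpath->query( "//h1 | //h2 | //h3 | //h4 | //h5 | //h6" );

//  Collect the level, id, and text of each heading in document order.
$found = [];
foreach ( $headings as $heading ) {
    $found[] = [
        "level" => (int) substr( $heading->nodeName, 1 ),
        "id"    => $heading->getAttribute( "id" ),
        "text"  => trim( $heading->textContent ),
    ];
}

print_r( $found );
```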

PHP 8.4 querySelectorAll

Rather than using XPath, modern versions of PHP can use querySelectorAll:

$headings = $dom->querySelectorAll( "h1, h2, h3, h4, h5, h6" );


Recursive looping

This is a bit knotty. It produces a nested array of the elements, their id attributes, and text.  The end result should be something like:

array (
  array (
    'text' => 'Table of Contents',
    'raw' => true,
  ),
  array (
    'text' => 'The Theory of Everything',
    'id' => 'the-theory-of-everything',
    'children' => 
    array (
      array (
        'text' => 'Experiments',
        'id' => 'experiments',
        'children' => 
        array (
          array (
            'text' => 'First attempt',
            'id' => 'first-attempt',
          ),
          array (
            'text' => 'Second attempt',
            'id' => 'second-attempt',


The code is moderately complex, but I've commented it as best as I can.

//  Start an array to hold all the headings in a hierarchy
$root = [];
//  Add an h2 with the title
$root[] = [
    "text"     => "Table of Contents", 
    "raw"      => true, 
    "children" => []
];

// Stack to track current hierarchy level
$stack = [&$root]; 

//  Loop through the headings
foreach ($headings as $heading) {

    //  Get the information
    //  Expecting Text
    $element = $heading->nodeName;  //  e.g. h2, h3, h4, etc
    $text    = trim( $heading->textContent );   
    $id      = $heading->getAttribute( "id" );

    //  h2 becomes 2, h3 becomes 3 etc
    $level = (int) substr($element, 1);

    //  Get data from element
    $node = array( 
        "text"     => $text, 
        "id"       => $id , 
        "children" => [] 
    );

    //  If this heading is at the same level or higher than the previous one, step back up the stack
    while ( count( $stack ) > $level ) {
        array_pop( $stack );
    }

    //  If a gap exists (e.g., h4 without an immediately preceding h3), create placeholders
    while ( count( $stack ) < $level ) {
        //  What's the last element in the stack?
        $stackSize = count( $stack );
        $lastIndex = count( $stack[ $stackSize - 1] ) - 1;
        if ($lastIndex < 0) {
            //  If there is no previous sibling, create a placeholder parent
            $stack[$stackSize - 1][] = [
                "text"     => "",   //  This could have some placeholder text to warn the user?
                "children" => []
            ];
            $stack[] = &$stack[count($stack) - 1][0]['children'];
        } else {
            $stack[] = &$stack[count($stack) - 1][$lastIndex]['children'];
        }
    }

    //  Add the node to the current level
    $stack[count($stack) - 1][] = $node;
    $stack[] = &$stack[count($stack) - 1][count($stack[count($stack) - 1]) - 1]['children'];
}


Missing content

The trickiest part of the above is dealing with missing elements in the hierarchy. If you're sure you never skip a heading level (for example, going from an h2 straight to an h4), you can get rid of some of the code dealing with that edge case.
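
To make the edge case concrete, here is a sketch of a different, purely recursive way to build the same sort of nested array. It is not the stack-based code above. The build_heading_tree() function and the flat input format (level, text, id) are assumptions invented for this example.

```php
<?php
//  Hypothetical helper, not the stack-based code above.
//  Turns a flat list of headings into a nested array, inserting
//  empty placeholder parents wherever a level has been skipped.
function build_heading_tree( array $headings, int &$index, int $level = 1 ): array
{
    $nodes = [];

    while ( $index < count( $headings ) && $headings[$index]["level"] >= $level ) {
        $heading = $headings[$index];

        if ( $heading["level"] > $level ) {
            //  A level was skipped: create an empty placeholder parent.
            //  Don't advance; the heading is consumed one level further down.
            $text = "";
            $id   = "";
        } else {
            $text = $heading["text"];
            $id   = $heading["id"];
            $index++;
        }

        $nodes[] = [
            "text"     => $text,
            "id"       => $id,
            "children" => build_heading_tree( $headings, $index, $level + 1 ),
        ];
    }

    return $nodes;
}

//  An h1 followed directly by an h3: the h2 level is missing.
$flat = [
    [ "level" => 1, "text" => "Title",   "id" => "title"   ],
    [ "level" => 3, "text" => "Details", "id" => "details" ],
    [ "level" => 1, "text" => "End",     "id" => "end"     ],
];

$index = 0;
$tree  = build_heading_tree( $flat, $index );
print_r( $tree );
```

In this run, "Details" ends up inside an empty placeholder which sits under "Title". That mirrors the placeholder-parent idea in the comments of the stack-based version.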

Converting to HTML

OK, there's a hierarchical array, how does it become HTML?

Again, a little bit of recursion:

function arrayToHTMLList( $array, $style = "ul" )
{
    $html = "";

    //  Loop through the array
    foreach( $array as $element ) {
        //  Get the data of this element
        $text     = $element["text"];
        $id       = $element["id"]       ?? "";
        $children = $element["children"] ?? [];
        $raw      = $element["raw"]      ?? false;

        if ( $raw ) {
            //  Add it to the HTML without adding an internal link
            $html .= "<li>{$text}";
        } else {
            //  Add it to the HTML
            $html .= "<li><a href=\"#{$id}\">{$text}</a>";
        }

        //  If the element has children
        if ( sizeof( $children ) > 0 ) {
            //  Recursively add it to the HTML
            $html .=  "<{$style}>" . arrayToHTMLList( $children, $style ) . "</{$style}>";
        } 

        //  Close the list item
        $html .= "</li>";
    }

    return $html;
}


Semantic Correctness

Finally, what should a table of contents look like in HTML?  There is no dedicated table-of-contents element, so what is most appropriate?

ePub Example

Modern eBooks use the ePub standard which is based on HTML. Here's how an ePub creates a ToC.


Table of Contents

  
    A simple link
  
  …




The modern(ish) <nav> element!

The nav element represents a section of a page that links to other pages or to parts within the page: a section with navigation links.
HTML Specification

But there's a slight wrinkle. The ePub example above uses <ol>, an ordered list. The HTML example in the spec uses <ul>, an unordered list.

Which is right? Well, that depends on whether you think the contents on your page should be referred to in order or not. There is, however, a secret third way.

Split the difference with a menu

I decided to use the <menu> element for my navigation. It is semantically the same as <ul> but just feels a bit closer to what I expect from navigation. Feel free to argue with me in the comments.

Where should the heading go?

I've put the title of the list into the list itself. That's valid HTML and, if my understanding is correct, should announce itself as the title of the navigation element to screen-readers and the like.

Conclusion

I've used slightly more headings in this post than I would usually, but hopefully the Table of Contents at the top demonstrates how this works.

If you want to reuse this code, I consider it too trivial to licence. But, if it makes you happy, you can treat it as MIT.

Thoughts? Comments? Feedback? Drop a note in the box.






Too long really, but who can be bothered to edit? ↩︎



Although Paul McCartney disagrees. ↩︎







Change the way dates are presented in WordPress's admin view
@edent — Wed, 26 Feb 2025 12:34:21 +0000
WordPress does not respect an admin's preferred date format.

Here's how the admin list of posts looks to me:



I don't want it to look like that. I want it in RFC3339 format.

I know what you're thinking, just change the default date display - but that only seems to work in some areas of WordPress. It doesn't change the column-date format.  Here's what mine is set to:



So that doesn't work.

Instead, you need to use the slightly obscure post_date_column_time filter.

Add this to your theme's functions.php:

//  Admin view - change date format
function rfc3339_post_date_time( $time, $post ) {
    //  Modify the default time format
    $rfc3339_time = date( "Y-m-d H:i", strtotime( $post->post_date ) );
    return $rfc3339_time;
}
add_filter( "post_date_column_time", "rfc3339_post_date_time", 10, 2 );


And, hey presto, your date column will look like this:


Obviously, you can change that code to whichever date format you prefer.
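
If you want to try a few formats before touching functions.php, here is a standalone sketch. The timestamp is made up so that the output is predictable, and nothing here depends on WordPress.

```php
<?php
//  A fixed, made-up timestamp so the output is predictable.
$when = new DateTimeImmutable( "2025-02-26 12:34:21", new DateTimeZone( "UTC" ) );

echo $when->format( "Y-m-d H:i" ) . "\n";   //  2025-02-26 12:34
echo $when->format( DATE_RFC3339 ) . "\n";  //  2025-02-26T12:34:21+00:00
echo $when->format( "D, d M Y" ) . "\n";    //  Wed, 26 Feb 2025
```

The format strings are the same ones that date() accepts, so whichever you like can be dropped straight into the filter.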



Graphing the connections between my blog posts
@edent — Thu, 09 Jan 2025 12:34:56 +0000
I love ripping off good ideas from other people's blogs.  I was reading Alvaro Graves-Fuenzalida's blog when I saw this nifty little force-directed graph:



When zoomed in, it shows the relation between posts and tags.



In this case, I can see that the posts about Small Gods and Pyramids both share the tags of Discworld, Fantasy, and Book Review. But only Small Gods has the tag of Religion.

Isn't that cool! It is a native feature of Quartz's GraphView. How can I build something like that for my WordPress blog?

Aim

Create an interactive graph which shows the relationship between a post, its links, and their tags.

It will end up looking something like this:



You can get the code or follow along to see how it works.

This is a multi-stage process. Let's begin!

What We Need

When on a single Post, we need the following:


The tags assigned to that Post.
Internal links back to that Post.
Internal links from that Post.
The tags assigned to links to and from that Post.


Tags assigned to that Post.

This is pretty easy!  Using the get_the_tag_list() function we can, unsurprisingly, get all the tags associated with a post.

$post_tags_text = get_the_tag_list( "", ",", "", $ID );
$post_tags_array = explode( "," , $post_tags_text );


That returns the tags as a comma-separated string of HTML links, which is then split into an array. If we want the tag names and IDs, we need to use the get_the_tags() function.

$post_tags = get_the_tags($ID);
$tags = array();
foreach($post_tags as $tag) {
    $tags[$tag->term_id] = $tag->name; 
}


Backlinks

Finding internal links back to the Post is slightly trickier. WordPress doesn't save relational information like that. Instead, we get the Post's URL and search for that in the database. Then we get the post IDs of all the posts which contain that string.

//  Get all the posts which link to this one, oldest first
$the_query = new WP_Query(
    array(
        's' => $search_url,
        'post_type' => 'post',
        "posts_per_page" => "-1",
        "order" => "ASC"
    )
);

//  Nothing to do if there are no inbound links
if ( !$the_query->have_posts() ) {
    return;
}


Backlinks' Tags

Once we have an array of posts which link back here, we can get their tags as above:

//  Loop through the posts
while ( $the_query->have_posts() ) {
    //  Set it up
    $the_query->the_post();
    $id                  = get_the_ID();
    $title               = esc_html( get_the_title() );
    $url                 = get_the_permalink();
    $backlink_tags_text  = get_the_tag_list( "", ",", "", $id );
    $backlink_tags_array = explode( "," , $backlink_tags_text );
}


Links from the Post

Again, WordPress's lack of relational links is a weakness. In order to get internal links, we need to:


Render the HTML using all the filters
Search for all <a> elements
Extract the ones which start with the blog's domain
Get those posts' IDs.


Rendering the content into HTML is done with:

$content = apply_filters( "the_content", get_the_content( null, false, $ID ) );


Searching for links is slightly more complex. The easiest way is to load the HTML into a DOMDocument, then extract all the anchors. All my blog posts start /blog/YYYY so I can avoid selecting links to tags, uploaded files, or other things. Your blog may be different.

$dom = new DOMDocument();
libxml_use_internal_errors( true ); //  Suppress warnings from malformed HTML
$dom->loadHTML( $content );
libxml_clear_errors();

$links = [];
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
    $href = $anchor->getAttribute( "href" );
    if (preg_match('/^https:\/\/shkspr\.mobi\/blog\/\d{4}/', $href)) {
        $links[] = $href;
    }
}
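
Here is a standalone sketch of the same idea which you can run without WordPress. The example.com address, the sample HTML, and the $blog_prefix variable are all assumptions for illustration. The pattern expects the prefix, a four digit year, and then a slash, so tag pages and external sites are skipped.

```php
<?php
//  Hypothetical blog address and sample content, for illustration only.
$blog_prefix = "https://example.com/blog/";
$content     = '<p><a href="https://example.com/blog/2024/05/a-post/">Internal</a> '
             . '<a href="https://example.com/blog/tag/php/">Tag</a> '
             . '<a href="https://elsewhere.example/blog/2024/">External</a></p>';

$dom = new DOMDocument();
libxml_use_internal_errors( true ); //  Suppress warnings from malformed HTML
$dom->loadHTML( $content );
libxml_clear_errors();

//  Match the prefix followed by a four digit year and a slash.
$pattern = '/^' . preg_quote( $blog_prefix, '/' ) . '\d{4}\//';

$links = [];
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
    $href = $anchor->getAttribute( "href" );
    if ( preg_match( $pattern, $href ) ) {
        $links[] = $href;
    }
}

//  Remove duplicates in case a post is linked more than once.
$links = array_values( array_unique( $links ) );
print_r( $links );
```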


The ID of each post can be found with the url_to_postid() function. That means we can re-use the earlier code to see what tags those posts have.

Building a graph

OK, so we have all our constituent parts. Let's build a graph!

Graphs consist of nodes (posts and tags) and edges (links between them). The exact format of the graph is going to depend on the graph library we use.

I've decided to use D3.js's Force Graph as it is relatively simple and produces a reasonably good looking interactive SVG.

Imagine there are two blog posts and two hashtags.

const nodes = [
    { id: 1, label: "Blog Post 1",    url: "https://example.com/post/1", group: "post" },
    { id: 2, label: "Blog Post 2",    url: "https://example.com/post/2", group: "post" },
    { id: 3, label: "hashtag",        url: "https://example.com/tag/3",  group: "tag"  },
    { id: 4, label: "anotherHashtag", url: "https://example.com/tag/4",  group: "tag"  },
];



Blog Post 1 links to Blog Post 2.
Blog Post 1 has a #hashtag.
Both 1 & 2 share #anotherHashtag.


const links = [
    { source: 1, target: 2 },
    { source: 3, target: 1 },
    { source: 4, target: 1 },
    { source: 4, target: 2 },
];
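
On the PHP side, those two structures are just arrays which get encoded as JSON. Here is a minimal sketch, assuming the same made-up posts and tags. How the JSON is then handed to the graph script is up to you.

```php
<?php
//  The same made-up posts and tags as the example above.
$nodes = [
    [ "id" => 1, "label" => "Blog Post 1",    "url" => "https://example.com/post/1", "group" => "post" ],
    [ "id" => 2, "label" => "Blog Post 2",    "url" => "https://example.com/post/2", "group" => "post" ],
    [ "id" => 3, "label" => "hashtag",        "url" => "https://example.com/tag/3",  "group" => "tag"  ],
    [ "id" => 4, "label" => "anotherHashtag", "url" => "https://example.com/tag/4",  "group" => "tag"  ],
];

$links = [
    [ "source" => 1, "target" => 2 ],
    [ "source" => 3, "target" => 1 ],
    [ "source" => 4, "target" => 1 ],
    [ "source" => 4, "target" => 2 ],
];

//  Encode both lists so they can be passed to the JavaScript.
$json = json_encode(
    [ "nodes" => $nodes, "links" => $links ],
    JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES
);

echo $json;
```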


Here's how to create a list of nodes and their links.  You will need to edit it for your own blog's peculiarities.

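//  Function to add new nodes
function add_item_to_nodes( &$nodes, $id, $label, $url, $group ) {
    $nodes[] = [
        "id"    => $id, 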
        "label" => $label, 
        "url"   => $url, 
        "group" => $group
    ];

}

//  Function to add new relationships
function add_relationship( &$links, $source, $target ) {
    $links[] = [
        "source" => $source,
        "target" => $target
    ];
}

//  Add Post to the nodes
add_item_to_nodes( $nodes, $main_post_id, $main_post_title, $main_post_url, "post" );

//  Get the tags of the Post
$main_post_tags = get_the_tags( $main_post_id );

//  Add the tags as nodes, and create links to main Post
foreach( $main_post_tags as $tag ) {
    $id   = $tag->term_id;
    $name = $tag->name;

    //  Add the node
    add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
    //  Add the relationship
    add_relationship( $links, $id, $main_post_id );
}

//  Get all the posts which link to this one, oldest first
$the_query = new WP_Query(
    array(
        's'              => $main_post_url,
        'post_type'      => 'post',
        "posts_per_page" => "-1",
        "order"          => "ASC"
    )
);

//  Nothing to do if there are no inbound links
if ( $the_query->have_posts() ) {
    //  Loop through the posts
    while ( $the_query->have_posts() ) {
        //  Set up the query
        $the_query->the_post();
        $post_id = get_the_ID();
        $title = esc_html( get_the_title() );
        $url   = get_the_permalink();

        //  Add the node
        add_item_to_nodes( $nodes, $post_id, $title, $url, "post" );
        //  Add the relationship
        add_relationship( $links, $post_id, $main_post_id );

        //  Get the tags of the Post
        $post_tags = get_the_tags( $post_id );

        //  Add the tags as nodes, and create links to main Post
        foreach($post_tags as $tag) {

            $id   = $tag->term_id;
            $name = $tag->name;

            //  Add the node
            add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
            //  Add the relationship
            add_relationship( $links, $id, $post_id );
        }

    }
}

//  Get all the internal links from this post
//  Render the post as HTML
$content = apply_filters( "the_content", get_the_content( null, false, $main_post_id ) );

//  Load it into HTML
$dom = new DOMDocument();
libxml_use_internal_errors( true );
$dom->loadHTML( $content );
libxml_clear_errors();

//  Get any internal links from the anchor elements
$internal_links = array();
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
    $href = $anchor->getAttribute( "href" );
    //  Post URLs start with a four digit year, e.g. /blog/2024/12/post-name/
    if ( preg_match( '/^https:\/\/shkspr\.mobi\/blog\/\d{4}\//', $href ) ) {
        $internal_links[] = $href;
    }
}

//  Loop through the internal links, get their hashtags
foreach ( $internal_links as $url ) {
    $post_id = url_to_postid( $url );
    //  Get the Post's details
    $post_title = get_the_title( $post_id );

    //  Add the node
    add_item_to_nodes( $nodes, $post_id, $post_title, $url, "post" );
    //  Add the relationship
    add_relationship($links, $main_post_id, $post_id );

    //  Get the tags of the Post
    $post_tags = get_the_tags( $post_id );

    //  Add the tags as nodes, and create links to main Post
    foreach( $post_tags as $tag ) {
        $id   = $tag->term_id;
        $name = $tag->name;

        //  Add the node
        add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
        //  Add the relationship
        add_relationship( $links, $id, $post_id );
    }
}

//  Deduplicate the nodes and links
$nodes_unique = array_unique( $nodes, SORT_REGULAR );
$links_unique = array_unique( $links, SORT_REGULAR );

//  Put them in the keyless format that D3 expects
$nodes_output = array_values( $nodes_unique );
$links_output = array_values( $links_unique );

//  Return the JSON
echo json_encode( $nodes_output, JSON_PRETTY_PRINT );
echo "\n";
echo json_encode( $links_output, JSON_PRETTY_PRINT );
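
The re-indexing step matters because array_unique() keeps the original keys of the items which survive. If that leaves a gap in the keys, json_encode() writes a JSON object instead of a list. A minimal sketch of the effect, using made-up node data:

```php
<?php
//  Example nodes with one duplicate (made-up data)
$nodes = [
    [ "id" => 4, "label" => "anotherHashtag", "group" => "tag"  ],
    [ "id" => 4, "label" => "anotherHashtag", "group" => "tag"  ],
    [ "id" => 2, "label" => "Blog Post 2",    "group" => "post" ],
];

//  Remove the duplicate. The surviving items keep their original keys (0 and 2)
$nodes_unique = array_unique( $nodes, SORT_REGULAR );

//  With a gap in the keys, json_encode() produces a JSON object rather than a list
$as_object = json_encode( $nodes_unique );

//  Re-indexing from zero gives a plain JSON list
$as_list = json_encode( array_values( $nodes_unique ) );

//  Prints: { [
echo $as_object[0], " ", $as_list[0], "\n";
```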


Creating a Force Directed SVG

Once the data are spat out, you can include them in a web-page. Here's a basic example:



    
        
        
        Force Directed Graph
        
    
    
        
        
    



Next Steps

It needs a bit of cleaning up if I want to turn it into a WordPress plugin. It might be nice to make it a static SVG rather than relying on JavaScript. And the general æsthetic needs a bit of work.

Perhaps I could make it 3D like my MSc Dissertation?

But I'm pretty happy with that for an afternoon hack!

You can get the code if you want to play.



Order WordPress Posts by Most Comments
@edent — Thu, 12 Dec 2024 12:34:42 +0000
I take great delight in seeing people reply to my blog posts.  I use WebMentions to collect replies from social media and other sites. But which of my posts has the most comments? Here's a snippet to stick in your functions.php file. It allows you to add ?comment-order to any WordPress URL and have the posts with the most comments on top.

//  Add ordering by comments
add_action( 'pre_get_posts', 'pre_get_posts_by_comments' );
function pre_get_posts_by_comments( $query ) {
    //  Do nothing if the "comment-order" parameter is not in the URL
    if ( ! isset( $_GET['comment-order'] ) ) {
        return;
    }

    $query->set( "orderby", "comment_count" );  //  Default: date
    $query->set( "order", "DESC" ); //  Biggest first
}


This makes use of the pre_get_posts hook to rewrite the posts query. That means it works on most WordPress pages.

For example:


My homepage https://shkspr.mobi/blog/?comment-order
Posts with a specific tag https://shkspr.mobi/blog/tag/blockchain/?comment-order
Dates https://shkspr.mobi/blog/2012/?comment-order
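
Those URLs work with a bare ?comment-order because the snippet only checks that the parameter is present, not what its value is. A minimal sketch of that behaviour, using parse_str() as a stand-in for the way PHP fills $_GET (the variable names are made up):

```php
<?php
//  Parse two example query strings
parse_str( "comment-order", $with_flag );
parse_str( "paged=2", $without_flag );

//  A parameter with no value is stored as an empty string, so isset() is still true
var_dump( isset( $with_flag["comment-order"] ) );     //  bool(true)
var_dump( isset( $without_flag["comment-order"] ) );  //  bool(false)
```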


Did you find this post useful? Please leave a comment here!



Change WordPress Fragment Links in RSS Feeds to be Permalinks
@edent — Fri, 06 Dec 2024 12:34:14 +0000
Here's a knotty problem. Lots of my posts use URL fragments. Those are links which start with #. They allow me to write links which let readers go straight to the relevant section.  For example, they might want to skip straight to how to fix it.

Isn't that clever?

Where is this a problem?

This works great when someone is on my website. They're on the page, and a fragment links straight to the correct section of that page.

But some people view this blog in RSS & Atom feeds - and those feeds also power my newsletter.

When those people see a fragment, it is devoid of its original context. So they end up going to some random location, or my homepage.

How to fix it?

Stick this into your WordPress theme's functions.php file:

//  In the RSS feed, change href="#whatever" to href="https://example.com/post/#whatever"
function rewrite_fragment_links_in_rss($content) {
    global $post;

    //  Ensure this is a feed
    if ( is_feed() && $post instanceof WP_Post ) {
        //  Get the permalink
        $base_url = get_permalink( $post );

        //  Regex to get href="#fragment" links
        $content = preg_replace_callback(
            '/href=["\']#([^"\']+)["\']/',
            function ( $matches ) use ( $base_url ) {
                return 'href="' . esc_url( $base_url . '#' . $matches[1] ) . '"';
            },
            $content
        );
    }

    return $content;
}

//  Hook into feed filters for both excerpts and full content
add_filter( "the_excerpt_rss",  "rewrite_fragment_links_in_rss" );
add_filter( "the_content_feed", "rewrite_fragment_links_in_rss" );


That listens out for the RSS feed being generated and replaces #whatever with https://shkspr.mobi/blog/2024/12/change-wordpress-fragment-links-in-rss-feeds-to-be-permalinks#whatever
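
To see what the regex does without a WordPress install, here is a stand-alone sketch. The function name rewrite_fragments and the example.com permalink are assumptions for illustration, and htmlspecialchars() stands in for WordPress's esc_url().

```php
<?php
//  Stand-alone sketch: rewrite fragment-only links so they include the full permalink
function rewrite_fragments( string $content, string $base_url ): string {
    return preg_replace_callback(
        '/href=["\']#([^"\']+)["\']/',
        function ( array $matches ) use ( $base_url ): string {
            //  htmlspecialchars() is a stand-in for esc_url() here
            return 'href="' . htmlspecialchars( $base_url . '#' . $matches[1], ENT_QUOTES ) . '"';
        },
        $content
    );
}

//  One fragment link and one absolute link
$html      = '<a href="#how-to-fix-it">Skip</a> and <a href="https://example.com/#top">elsewhere</a>';
$rewritten = rewrite_fragments( $html, "https://example.com/post/" );

echo $rewritten, "\n";
```

Only the first link is changed. The second already has a full URL before its #, so the pattern doesn't match it.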

Nifty!

Hopefully, if you click on the links in my emails and feeds, it should take you to the right place now.



A simple and free way to post RSS feeds to Threads
@edent — Wed, 06 Nov 2024 12:30:00 +0000
Threads is Meta's attempt to disrupt the social media landscape. Whether you care for it or not, there are a lot of users there. And, sometimes, you have to go where the audience is.

Here's how I built a really simple PHP tool to post to Threads using their official API.  This allows you to send a single status update programmatically, or regularly send new items from your RSS feed to an account.

You can see the bot in action at https://www.threads.net/@openbenches_org

Get the code

The code is available as Open Source. It should be fairly self explanatory for a moderately competent programmer - but feel free to open an issue if you think it is confusing.

Get it working


Create an account on Threads (duh!) - this involves signing up to Instagram.
Create a Facebook Developer account.
Create an app which requests the Threads posting API.


You do not need to publish this app if you're only using it yourself.

Create a User Token using the "User Token Generator"
Get your Threads account's User ID with:


curl -s -X GET "https://graph.threads.net/v1.0/me?ields=id,username,name,threads_profile_picture_url,threads_biography&access_token=TOKEN"
(Yes, ields. If you use fields you get something else!)

Clone the RSS2Threads repo and stick it on a webserver somewhere.
Rename config.sample.php to config.php and add your feeds' details, along with your ID and Token.
Run php rss2threads.php


And that's it!

The service will download your RSS feed, check if it has posted the entries to Threads and, if not, post them.

How I built it

Shoulders of giants, and all that! I have been using Thomas Nesges's RSS2BSky for auto-posting to BlueSky. I also used Jesse Chen's Python Threads example code.

Posting is a two stage process.


POST the URL-encoded text to:


https://graph.threads.net/USER_ID/threads?text=My%20post&access_token=TOKEN&media_type=TEXT
If successful, the API will return a Creation ID.

POST the Creation ID to:


https://graph.threads.net/USER_ID/threads_publish?creation_id=CREATION_ID&access_token=TOKEN
If successful, the API will return a Post ID.
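
The two URLs above can be assembled with http_build_query(). This is only a sketch: the helper names are made up, it is not taken from the RSS2Threads code, and it builds the URLs without sending anything.

```php
<?php
//  Hypothetical helpers which only build the request URLs. No network calls are made.
function threads_create_url( string $user_id, string $token, string $text ): string {
    $query = http_build_query(
        [ "text" => $text, "access_token" => $token, "media_type" => "TEXT" ],
        "",
        "&",
        PHP_QUERY_RFC3986   //  Encode spaces as %20 rather than +
    );
    return "https://graph.threads.net/{$user_id}/threads?{$query}";
}

function threads_publish_url( string $user_id, string $token, string $creation_id ): string {
    $query = http_build_query(
        [ "creation_id" => $creation_id, "access_token" => $token ],
        "",
        "&",
        PHP_QUERY_RFC3986
    );
    return "https://graph.threads.net/{$user_id}/threads_publish?{$query}";
}

//  Stage 1: the URL to POST the text to
echo threads_create_url( "USER_ID", "TOKEN", "My post" ), "\n";
//  Stage 2: the URL to POST the Creation ID to
echo threads_publish_url( "USER_ID", "TOKEN", "CREATION_ID" ), "\n";
```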



Successful RSS posts are stored in a simple SQLite database. If an RSS entry was posted successfully, it won't be reposted.

Caveats


There are no unit tests, fuzzing, or exception handling. It's assumed you're running this on well-formed RSS that you trust.
The Threads API is slow! It takes ages for a post to be sent to it.
Getting a Threads API token is difficult and the margin is too small for me to explain it here.


Feedback

Please leave a comment here or on the code repository.



Using phpList for a blog's newsletter
@edent — Thu, 31 Oct 2024 12:34:36 +0000
Some people like to receive this blog via email. I previously used JetPack to send out subscriber messages - but it became increasingly clear that Automattic isn't a good steward of such things.  I couldn't find any services which would let me send a few thousand subscribers a few emails per week, at zero cost.

So, redecentralise!

I installed phpList which is an open source email campaign tool.  My webhost - Krystal - had a one-click install option. But, phpList isn't quite one-click for sending out a regular blog newsletter.  I found the set-up to be quite confusing, so here are the steps I took to turn an RSS feed into an Email Newsletter for free.

Install the plugins


Navigate to Config → Manage plugins
Enable "CommonPlugin"
Add the RSS Feed Plugin using the Plugin package URL https://github.com/bramley/phplist-plugin-rssfeed/archive/master.zip


Configure the RSS Feed Plugin


Navigate to Config → Settings
Scroll down to the RSS Settings
Set both Minimum and Maximum number of items to 1

That will ensure you only send the latest RSS item as your newsletter.
Set "Use the item summary content (the description or summary element) instead of the content element" to "No". This will allow the full text of the RSS item to be sent.


Edit config.php

For some reason, you need to manually edit this file in a text editor, rather than a GUI.


Set define('USE_REPETITION', 1); - this allows the newsletter to be sent whenever there is a new RSS item.
Set define('CLICKTRACK', 0); - this removes tracking links from your emails. I don't care who opens my emails or what they click on.


Add The Campaign


Go to  Campaigns → Send a campaign.
Start a new campaign.


Tab 1


Campaign subject should be [RSSITEM:TITLE] - that will make the subject line the same as your post's title
Compose message should be [RSS] - that will ensure the contents come from your RSS feed.


Tab 2


Add your RSS feed's URL
Order items "Newest" first - to get the most recent item.
Add a custom HTML template. I used one from https://emailframe.work/



  [TITLE]
  
      
          
              [CONTENT]
          
      
  



Tab 3


Send as HTML


Tab 4


"Stop sending after" - choose the furthest date in the future possible.
"Repeat campaign every" - I chose "hour". That should check the RSS feed each hour.


Tab 5


"Lists" - pick the email list you want to send from.


Tab 6


You should be finished! It will tell you if there are any errors.
Place the campaign in the queue for processing.


WordPress Sign Up Form

You can either redirect users to your phpList subscription page, or put a form directly on your site.


    Email address:
    
    
    
    
    
        
    
    



Adjust the hidden parameters based on your list.

If in doubt, go to Config →  Subscribe pages, and generate a new subscribe page. Then copy the form from that.

Cron Jobs

You need two cron jobs set up.

Update the RSS feed

I run this every hour:

/usr/bin/php /path/to/YourSubscribePage/admin/index.php -p get -m RssFeedPlugin -c /path/to/YourSubscribePage/config/config.php

Process the Queue

I run this a few minutes after the RSS feed is updated

/usr/bin/php -q /path/to/YourSubscribePage/admin/index.php -p processqueue -c /path/to/YourSubscribePage/config/config.php >/dev/null

And then...

That should be it.  There are lots of options which you can fiddle around with. But the above should be enough to get your first newsletter out.

Huge thanks to Duncan Cameron for graciously answering my noddy questions and helping me out with the config.



WordPress - Display hook action priority in the dashboard
@edent — Sat, 31 Aug 2024 11:34:14 +0000
If your WordPress site has lots of plugins, it's sometimes difficult to keep track of what is manipulating your content. Ever wondered what priority all your various actions and filters have? This is a widget which will show you which actions are registered to your blog's hooks, and their priority order.

It looks like this:



Stick this code in your theme's functions.php or in its own plugin.

function edent_priority_dashboard_widget_contents() {
    global $wp_filter; 
    //  Change this to the hook you're interested in
    $hook_name = "the_content";
    if ( isset( $wp_filter[$hook_name] ) ) {

        //  Display the hook name in the widget
        echo "<h3>{$hook_name}</h3>";

        //  Start a list
        echo "<ul>";

        //  Loop through the callbacks in priority order
        foreach ( $wp_filter[$hook_name]->callbacks as $priority => $callbacks ) {
            echo "<li>Priority: {$priority}<ul>";

            foreach ( $callbacks as $callback ) {
                //  Some callbacks are arrays
                if ( is_array( $callback["function"] ) ) {
                    if ( is_object( $callback["function"][0] ) ) {
                        $callback_info = get_class( $callback["function"][0] ) . '::' . $callback["function"][1];
                    } else {
                        $callback_info = $callback["function"][0] . '::' . $callback["function"][1];
                    }
                } elseif ( $callback["function"] instanceof Closure ) {
                    //  Anonymous functions can't be converted to a string
                    $callback_info = "Closure";
                } else {
                    $callback_info = $callback["function"];
                }
                //  Show the information
                echo "<li>Callback: {$callback_info}</li>";
            }
            echo "</ul></li>";
        }
        echo "</ul>";

    } 
    else {
        echo "No filters found for hook: {$hook_name}";
    }

    //  Scrap of CSS to ensure list items display properly on the dashboard
    $priority_css_code = "#edent_dashboard_widget ul { list-style: circle; padding: 1em; }";
    //  Inline the CSS
    echo "<style>{$priority_css_code}</style>";

}

//  Register the widget with the admin dashboard
function edent_register_dashboard_widget() {
    wp_add_dashboard_widget(
        "edent_dashboard_widget",   //  ID of the widget
        "Priorities",   //  Title of the widget
        "edent_priority_dashboard_widget_contents"  //  Function to run
    );
}
add_action( "wp_dashboard_setup", "edent_register_dashboard_widget" );
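
The fiddly part of the widget is turning each callback into something printable. Here is that logic pulled out into a stand-alone sketch which runs without WordPress. The function name describe_callback is made up, and the sample callbacks are just built-in PHP classes and functions.

```php
<?php
//  Hypothetical helper: describe a callback as a readable string
function describe_callback( $function ): string {
    //  Anonymous functions have no name to show
    if ( $function instanceof Closure ) {
        return "Closure";
    }
    //  Arrays are [ object or class name, method name ]
    if ( is_array( $function ) ) {
        $owner = is_object( $function[0] ) ? get_class( $function[0] ) : $function[0];
        return $owner . "::" . $function[1];
    }
    //  Other objects used as callbacks are run through their __invoke() method
    if ( is_object( $function ) ) {
        return get_class( $function ) . "::__invoke";
    }
    //  Anything else is a plain function name
    return (string) $function;
}

echo describe_callback( "strtolower" ), "\n";
echo describe_callback( [ new ArrayObject(), "count" ] ), "\n";
echo describe_callback( [ "DateTime", "createFromFormat" ] ), "\n";
echo describe_callback( function () {} ), "\n";
```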


Why?

WordPress lets you add actions and filters to hooks.  For example, whenever your blog wants to show some content, a hook of the_content is run.

You can add a filter to run a function when that happens. For example, if you want to make all the text in your blog posts lowercase, you could add this to your theme or plugin:

function lower_case_everything( $content ) {
   return strtolower( $content );
}
add_filter( 'the_content', 'lower_case_everything', 99 );


The add_filter says "When the hook called the_content is fired, run the function lower_case_everything, with a priority of 99".  The lower the number, the sooner the function is run.
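
Here is a toy model of how priorities decide the running order. It is not WordPress's implementation, just a sketch with made-up function names which sorts the callbacks by priority number before running them.

```php
<?php
//  Toy model of prioritised filters. This is NOT how WordPress is written internally.
$filters = [];

//  Store each callback under its priority number
function toy_add_filter( array &$filters, callable $callback, int $priority = 10 ): void {
    $filters[$priority][] = $callback;
}

//  Run every callback, lowest priority number first
function toy_apply_filters( array $filters, string $value ): string {
    ksort( $filters );
    foreach ( $filters as $callbacks ) {
        foreach ( $callbacks as $callback ) {
            $value = $callback( $value );
        }
    }
    return $value;
}

//  Registered first, but runs last because 99 is a bigger number than 10
toy_add_filter( $filters, "strtolower", 99 );
toy_add_filter( $filters, fn( $text ) => $text . " WORLD", 10 );

//  Prints: hello world
echo toy_apply_filters( $filters, "HELLO" ), "\n";
```

The priority 10 callback adds " WORLD" first, then the priority 99 callback lowercases the whole string.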
