Working around an old and buggy HTML Tidy in PHP
Dan Q very kindly shared his script to make WordPress do good HTML. But I couldn't get it working.
Looking at the HTML it was spitting out, the meta generator said it was HTML Tidy version 5.6.0. That's quite old! I confirmed this by running:
echo tidy_get_release();
Which spat out 2017/11/25. Aha!
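If you'd rather have the code decide whether the workarounds are needed, the release date can be compared against a cutoff. This is only a rough sketch: `tidy_release_is_older_than()` is a made-up helper, not part of the Tidy extension, and the cutoff date is an arbitrary example rather than a known "fixed in" date.

```php
<?php
// Hypothetical helper: not part of the Tidy extension.
// Returns true if $release is earlier than $cutoff, false if it is not,
// or null when either string is not in YYYY/MM/DD form (e.g. "unknown").
function tidy_release_is_older_than(string $release, string $cutoff): ?bool
{
    // The leading "!" resets the time fields, so only the dates are compared.
    $released = DateTimeImmutable::createFromFormat('!Y/m/d', $release);
    $limit    = DateTimeImmutable::createFromFormat('!Y/m/d', $cutoff);
    if ($released === false || $limit === false) {
        return null;
    }
    return $released < $limit;
}

// Only call into Tidy when the extension is actually loaded.
if (function_exists('tidy_get_release')) {
    // Assumption: the cutoff is an example value, pick your own.
    $needs_workarounds = tidy_release_is_older_than(tidy_get_release(), '2021/01/01');
    var_dump($needs_workarounds);
}
```

A `null` result means the release string couldn't be parsed, in which case applying the workarounds anyway is probably the safer choice.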
There are a few bugs in this version of HTML Tidy, some of which are fixed in later versions.
Here's how to fix them.
Auto indent doesn't work. This is fixed by manually specifying "indent" => 2.
Indenting with tabs doesn't work, so I told it to indent with 8 spaces using "indent-spaces" => 8.
Then I used a regex (naughty!) to replace each run of 8 spaces with a tab.
$tidy = preg_replace( '/ {8}/', "\t", $tidy );
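One caveat with a blanket replace is that it also touches runs of 8 spaces inside text content. A slightly more cautious variant, which is not what the post uses, only converts indentation at the start of each line. `spaces_to_tabs()` is a hypothetical helper name, and the sketch assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Hypothetical helper: convert leading indentation from spaces to tabs.
function spaces_to_tabs(string $html, int $width = 8): string
{
    // With the m flag, ^ anchors at the start of every line, so only
    // leading runs of $width spaces are matched.
    $pattern = '/^(?: {' . $width . '})+/m';

    $result = preg_replace_callback(
        $pattern,
        // One tab for each full group of $width spaces in the match.
        fn (array $m): string => str_repeat("\t", intdiv(strlen($m[0]), $width)),
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

echo spaces_to_tabs("        <p>\n                a        b");
```

Spaces in the middle of a line are left alone, but a line inside a `<pre>` block that starts with 8 spaces would still be converted, so this is a narrower regex hack rather than a real fix.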
Older versions of Tidy don't support newer HTML elements like <search>. This can be fixed with "new-blocklevel-tags" => "search".
The <summary> element isn't closed properly. This was an annoying one. I had to manually rewrite my HTML to remove an <h2> element from inside the summary.
Although not really a bug, I like to have HTML comments on a new line.
$tidy = preg_replace( '/><!--/', ">\n<!--", $tidy );
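Put together, the workarounds might look something like the sketch below. The three options and the two regexes come from the list above. Everything else is an assumption for illustration: the sample HTML, the `post_process()` helper name, and the choice of the object-style `tidy` API (Dan Q's original script may be wired up quite differently).

```php
<?php
// Options named in the post; every other Tidy option is left at its default.
$config = [
    'indent'              => 2,        // the post's workaround for auto indent
    'indent-spaces'       => 8,        // stand-in for tabs, converted below
    'new-blocklevel-tags' => 'search', // teach old Tidy about <search>
];

// Hypothetical helper wrapping the two regex fixes from the post.
function post_process(string $html): string
{
    // Replace each run of 8 spaces with a tab.
    $html = preg_replace('/ {8}/', "\t", $html) ?? $html;
    // Put any comment that directly follows a tag on its own line.
    return preg_replace('/><!--/', ">\n<!--", $html) ?? $html;
}

// Only run Tidy when the extension is available.
if (class_exists('tidy')) {
    $tidy = new tidy();
    // Sample input, made up for this sketch.
    $tidy->parseString('<search><p>Hi</p></search><!-- end -->', $config, 'utf8');
    $tidy->cleanRepair();
    echo post_process(tidy_get_output($tidy));
}
```

Keeping the regex fixes in their own function means they can be dropped one at a time if a newer Tidy ever makes them unnecessary.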
Sadly, the last release of HTML Tidy was back in 2021. While some of the above bugs are fixed, there are more piling up.
So I'll continue with these workarounds for now. Hit "view source" and tell me what you think!