I have a trick I use to tidy my WordPress-generated HTML to exactly the specification I want. It's really, really hacky, which is a big part of why I haven't blogged about it, but it works (probably won't work on an FSE theme, though, if you're using one of those)!
Here's the super-skinny:
In my header.php, before any HTML is generated, I start an output buffer with a callback function, e.g. ob_start('tidy_up_entire_page');
This buffer wraps all the others, so its callback only runs when the buffer is flushed, after the last bit of content has been generated.
That function does a few safety checks (to allow me to bypass it) before running the buffer contents through HTMLTidy with my custom ruleset, including e.g. 'output-html' => true to force HTML rather than XHTML output. I use ->cleanRepair() so it also attempts some smart fixing of any accidentally-broken HTML I've produced.
Make sure the tidying happens before Super Cache (or whatever caching plugin you use) stores the page, so the cached copy is already tidy and HTMLTidy doesn't have to run on the 99% of requests that just get served from cache.
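For anybody who wants to try it, the steps above might look roughly like the sketch below. It assumes PHP's tidy extension is installed. The callback name comes from the ob_start() example, but the notidy bypass parameter, the command-line check, and every ruleset entry other than 'output-html' are made up for illustration; the real safety checks and ruleset aren't shown here.

```php
<?php
// Output-buffer callback: receives the whole page, returns the tidied version.
function tidy_up_entire_page(string $html): string
{
    // Safety checks (hypothetical examples): bail out and return the page
    // untouched if a bypass is requested via ?notidy, if the tidy extension
    // isn't loaded, or if there's nothing in the buffer.
    if (isset($_GET['notidy']) || !class_exists('tidy') || trim($html) === '') {
        return $html;
    }

    // Illustrative ruleset only; apart from 'output-html' these are assumptions.
    $config = [
        'output-html' => true, // write HTML rather than XHTML
        'indent'      => true, // pretty-print nested elements
        'wrap'        => 0,    // don't hard-wrap long lines
    ];

    $tidy = new tidy();
    if (!$tidy->parseString($html, $config, 'utf8')) {
        return $html; // couldn't parse: serve the original buffer
    }
    $tidy->cleanRepair(); // attempt to fix broken markup

    $clean = tidy_get_output($tidy);
    return $clean !== '' ? $clean : $html;
}

// In header.php, before any HTML is emitted. Skipped on the command line
// (e.g. WP-CLI), where there's no page to tidy.
if (PHP_SAPI !== 'cli') {
    ob_start('tidy_up_entire_page');
}
```

Every failure path returns the original buffer, so the worst case is an untidied page rather than a blank one.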
It's really ugly. But I'll tell you what's not ugly: my pretty-printed source! Take a look and see what you think!