link rel="alternate" type="text/plain"
Hot on the heels of yesterday's post, I've now made all of this blog available in text-only mode.
Simply append .txt to the URL of any page and you'll get back the contents in plain UTF-8 text. No formatting, no images (although you can see the alt text), no nothing!
- Front page https://shkspr.mobi/blog/.txt
- This blog post https://shkspr.mobi/blog/2024/05/link-relalternate-typetext-plain/.txt
- A tag https://shkspr.mobi/blog/tag/solar.txt
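As a rough sketch of how a page could advertise its text-only twin, the snippet below builds the .txt URL and a matching link element like the one in this post's title. The function names `text_alternate_url` and `text_alternate_link` are hypothetical, and keeping the query string after the .txt is an assumption rather than something this blog is known to do.

```php
<?php
// Hypothetical helper: append ".txt" to the path of a URL,
// leaving any query string or fragment where it was.
function text_alternate_url( string $url ): string {
    // Length of the part before the first "?" or "#".
    $split = strcspn( $url, "?#" );
    $base  = substr( $url, 0, $split );
    $rest  = (string) substr( $url, $split );
    return $base . ".txt" . $rest;
}

// Hypothetical helper: the <link> element pointing at the text version.
function text_alternate_link( string $url ): string {
    return sprintf(
        '<link rel="alternate" type="text/plain" href="%s">',
        htmlspecialchars( text_alternate_url( $url ), ENT_QUOTES )
    );
}

echo text_alternate_link( "https://shkspr.mobi/blog/tag/solar" ), "\n";
```

Appending to the path, rather than to the whole string, means a URL with tracking parameters would still end up pointing at a path that ends in .txt.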
This was slightly tricky to get right! While there might be an easier way to do it, here's how I got it to work.
Firstly, when someone requests /whatever.txt, WordPress is going to 404 - because that page doesn't exist. So my theme's functions.php detects any URLs which end in .txt and redirects them to a different template.
// Theme Switcher
add_filter( "template_include", "custom_theme_switch" );
function custom_theme_switch( $template ) {
    // What was requested?
    $requested_url = $_SERVER["REQUEST_URI"];
    // Check if the URL ends with .txt
    if ( substr( $requested_url, -4 ) === ".txt" ) {
        // Get the path to the custom template
        $custom_template = get_template_directory() . "/templates/txt-template.php";
        // Check if the custom template exists
        if ( file_exists( $custom_template ) ) {
            return $custom_template;
        }
    }
    // Return the default template
    return $template;
}
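The check above looks at the raw request URI, so a request with a query string (or an uppercase .TXT) won't be treated as a text request. A slightly more forgiving test might look like the sketch below. The helper `is_txt_request` is hypothetical and not part of the theme. It only inspects the path component, and compares case-insensitively.

```php
<?php
// Hypothetical helper, not part of the original theme: decide whether a
// request URI is asking for the text-only version of a page.
function is_txt_request( string $request_uri ): bool {
    // Look only at the path, so "/post.txt?utm=1" still counts.
    $path = parse_url( $request_uri, PHP_URL_PATH );
    if ( ! is_string( $path ) ) {
        return false;
    }
    // Compare case-insensitively, so "/post.TXT" is accepted too.
    return strtolower( substr( $path, -4 ) ) === ".txt";
}

var_dump( is_txt_request( "/blog/tag/solar.txt?page=2" ) );
```

If something like this replaced the `substr()` test in the filter, the template would also need to strip the extension from the path rather than from the end of the whole URI.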
The txt-template.php file is more complex. It takes the requested URL, strips off the .txt, matches it against the WordPress rewrite rules, and then constructs the WP_Query which would have been run if the .txt wasn't there.
// Run the query for the URL requested
$requested_url = $_SERVER['REQUEST_URI']; // This will be /whatever.txt
$blog_details = wp_parse_url( home_url() ); // Get the blog's domain to construct a full URL
$query = get_query_for_url(
    // Strip the trailing .txt before building the full URL
    $blog_details["scheme"] . "://" . $blog_details["host"] . substr( $requested_url, 0, -4 )
);
function get_query_for_url( $url ) {
    // Get all the rewrite rules
    global $wp_rewrite;
    // Get the WordPress site URL path
    $site_path = parse_url( get_site_url(), PHP_URL_PATH ) . "/";
    // Parse the requested URL
    $url_parts = parse_url( $url );
    // Remove the domain and site path from the URL
    // For example, change `https://example.com/blog/2024/04/test` to just `2024/04/test`
    $url_path = isset( $url_parts['path'] ) ? str_replace( $site_path, '', $url_parts['path'] ) : '';
    // Match the URL against WordPress rewrite rules
    $rewrite_rules = $wp_rewrite->wp_rewrite_rules();
    $matched_rule = false;
    $matches = array();
    foreach ( $rewrite_rules as $pattern => $rule_query ) {
        if ( preg_match( "#^$pattern#", $url_path, $matches ) ) {
            $matched_rule = $rule_query;
            break;
        }
    }
    // If no rewrite rule matched, return an empty query
    if ( false === $matched_rule ) {
        return new WP_Query();
    }
    // Replace each occurrence of $matches[N] with the corresponding value
    foreach ( $matches as $key => $value ) {
        $matched_rule = str_replace( "\$matches[{$key}]", $value, $matched_rule );
    }
    // Turn the query string into a WordPress query
    $query_params = array();
    parse_str(
        (string) parse_url( $matched_rule, PHP_URL_QUERY ),
        $query_params
    );
    // Construct a new WP_Query object using the extracted query parameters
    $query = new WP_Query( $query_params );
    // Return the result of the query
    return $query;
}
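To see the matching step in isolation, here is a minimal sketch that runs outside WordPress. The two rules in `$example_rules` are hand-written stand-ins in the same pattern => query-template shape, not the blog's real rewrite rules, and `resolve_query_vars` is a hypothetical name.

```php
<?php
// Hypothetical, hand-written rules in the shape WordPress uses:
// regex pattern => query string template with $matches[N] placeholders.
$example_rules = array(
    'tag/([^/]+)/?$' => 'index.php?tag=$matches[1]',
    '([0-9]{4})/([0-9]{1,2})/([^/]+)/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&name=$matches[3]',
);

function resolve_query_vars( string $path, array $rules ): array {
    foreach ( $rules as $pattern => $template ) {
        if ( ! preg_match( "#^$pattern#", $path, $matches ) ) {
            continue;
        }
        // Fill in each $matches[N] placeholder with the captured text.
        foreach ( $matches as $key => $value ) {
            $template = str_replace( "\$matches[{$key}]", $value, $template );
        }
        // Turn "index.php?a=1&b=2" into array( "a" => "1", "b" => "2" ).
        $query_vars = array();
        parse_str( (string) parse_url( $template, PHP_URL_QUERY ), $query_vars );
        return $query_vars;
    }
    // No rule matched.
    return array();
}

print_r( resolve_query_vars( "2024/05/link-relalternate-typetext-plain/", $example_rules ) );
```

With these example rules, the path for this post resolves to a `year`, `monthnum` and `name`, which is the kind of array that then gets handed to `WP_Query`.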
From there, it's a case of iterating over the posts returned by the query. You can see the full code on my GitLab.
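The real template lives in the repository mentioned above. Purely as an illustration of the "no images, but you can see the alt text" idea, a rough regex-based sketch might look like this. The helper name `html_to_plain_text` and the `[Image: ...]` format are assumptions, not what the blog actually outputs.

```php
<?php
// Hypothetical helper: reduce a post's HTML to plain text, keeping image alt text.
function html_to_plain_text( string $html ): string {
    // Swap each <img> for its alt text (or nothing if it has none).
    $html = preg_replace_callback(
        '/<img\b[^>]*>/i',
        function ( array $m ): string {
            return preg_match( '/\balt="([^"]*)"/i', $m[0], $alt ) ? "[Image: {$alt[1]}]" : "";
        },
        $html
    );
    // Follow block-level closers and <br> with a newline so paragraphs stay separate.
    $html = preg_replace( '#</(p|div|li|h[1-6])>|<br\s*/?>#i', "$0\n", $html );
    // Drop the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode( strip_tags( $html ), ENT_QUOTES | ENT_HTML5, "UTF-8" );
    // Collapse runs of blank lines and trim the ends.
    return trim( preg_replace( "/\n{3,}/", "\n\n", $text ) );
}

echo html_to_plain_text( '<p>Hello &amp; <b>welcome</b></p><p><img src="a.png" alt="A cat"></p>' ), "\n";
```

In a WordPress template, something like this would presumably be applied to each post's content inside the loop, after sending a `Content-Type: text/plain; charset=utf-8` header. A DOM parser would be more robust than regular expressions for messy markup.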
Caroline Jarrett says:
I love the text version. It's a quick way to get an idea of what the blog post looks like for someone who is using a screen reader - or when a bot crawls the blog for the purposes of (ahem) stealing it to populate a large language model (ahem), or for good old indexing in a search engine.
Dan Kendall says:
I didnt realise I had CAPS LOCK on until I typed /.TXT andit 404d FYI
@edent says:
PEBCAK 😆