link rel="alternate" type="text/plain"
Hot on the heels of yesterday's post, I've now made all of this blog available in text-only mode.
Simply append .txt to the URL of any page and you'll get back the contents in plain UTF-8 text. No formatting, no images (although you can see the alt text), no nothing!
- Front page https://shkspr.mobi/blog/.txt
- This blog post https://shkspr.mobi/blog/2024/05/link-relalternate-typetext-plain/.txt
- A tag https://shkspr.mobi/blog/tag/solar.txt
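The title of this post refers to the link element which advertises a plain-text alternate. As a rough sketch, the .txt URL and a matching tag could be built like this. The helper names text_alternate_url and text_alternate_link are hypothetical and are not taken from the theme.

```php
<?php
// Hypothetical helper: the post says to simply append ".txt" to any page URL.
function text_alternate_url( string $page_url ): string {
    return $page_url . ".txt";
}

// Hypothetical helper: build a <link> tag pointing at the text version.
function text_alternate_link( string $page_url ): string {
    // Escape the URL so it is safe inside an HTML attribute.
    $href = htmlspecialchars( text_alternate_url( $page_url ), ENT_QUOTES );
    return '<link rel="alternate" type="text/plain" href="' . $href . '">';
}

echo text_alternate_link( "https://shkspr.mobi/blog/" ), "\n";
```

Whether a tag like this is emitted on every page is an assumption here; the post itself only promises that the .txt URLs work.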
This was slightly tricky to get right! While there might be an easier way to do it, here's how I got it to work.
Firstly, when someone requests /whatever.txt, WordPress is going to 404 - because that page doesn't exist. So my theme's functions.php detects any URLs which end in .txt and hands them to a different template.
PHP
// Theme Switcher
add_filter( "template_include", "custom_theme_switch" );

function custom_theme_switch( $template ) {
    // What was requested?
    $requested_url = $_SERVER["REQUEST_URI"];

    // Check if the URL ends with .txt
    if ( substr( $requested_url, -4 ) === ".txt" ) {
        // Get the path to the custom template
        $custom_template = get_template_directory() . "/templates/txt-template.php";

        // Check if the custom template exists
        if ( file_exists( $custom_template ) ) {
            return $custom_template;
        }
    }

    // Return the default template
    return $template;
}
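One caveat with testing REQUEST_URI directly: it also carries any query string, so a request for /whatever.txt?ref=feed would not end in .txt. A minimal sketch of a stricter check compares only the path component. It assumes a hypothetical helper called is_txt_request, which is not part of the theme code shown here.

```php
<?php
// Hypothetical helper: does the path part of the request end in ".txt"?
function is_txt_request( string $request_uri ): bool {
    // Keep only the path, dropping any ?query or #fragment.
    $path = parse_url( $request_uri, PHP_URL_PATH );
    if ( ! is_string( $path ) ) {
        return false;
    }
    // Case-sensitive on purpose, matching the original substr() check.
    return substr( $path, -4 ) === ".txt";
}

var_dump( is_txt_request( "/blog/tag/solar.txt" ) );
var_dump( is_txt_request( "/blog/tag/solar.txt?ref=feed" ) );
var_dump( is_txt_request( "/blog/tag/solar/" ) );
```

The comparison stays case-sensitive, so /.TXT still falls through to the normal 404.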
The txt-template.php file is more complex. It takes the requested URL, strips off the .txt, matches it against the WordPress rewrite rules, and then constructs the WP_Query which would have been run if the .txt wasn't there.
PHP
// Run the query for the URL requested
$requested_url = $_SERVER['REQUEST_URI']; // This will be /whatever.txt
$blog_details  = wp_parse_url( home_url() ); // Get the blog's domain to construct a full URL
$query = get_query_for_url(
    $blog_details["scheme"] . "://" . $blog_details["host"] . substr( $requested_url, 0, -4 )
);

function get_query_for_url( $url ) {
    // Get all the rewrite rules
    global $wp_rewrite;

    // Get the WordPress site URL path
    $site_path = parse_url( get_site_url(), PHP_URL_PATH ) . "/";

    // Parse the requested URL
    $url_parts = parse_url( $url );

    // Remove the domain and site path from the URL
    // For example, change `https://example.com/blog/2024/04/test` to just `2024/04/test`
    $url_path = isset( $url_parts['path'] ) ? str_replace( $site_path, '', $url_parts['path'] ) : '';

    // Match the URL against WordPress rewrite rules
    $rewrite_rules = $wp_rewrite->wp_rewrite_rules();
    $matched_rule  = false;
    $matches       = array(); // Stays empty if there are no rules to test
    foreach ( $rewrite_rules as $pattern => $query ) {
        if ( preg_match( "#^$pattern#", $url_path, $matches ) ) {
            $matched_rule = $query;
            break;
        }
    }

    // Replace each occurrence of $matches[N] with the corresponding value
    foreach ( $matches as $key => $value ) {
        $matched_rule = str_replace( "\$matches[{$key}]", $value, $matched_rule );
    }

    // Turn the query string into a WordPress query
    $query_params = array();
    parse_str( parse_url( $matched_rule, PHP_URL_QUERY ), $query_params );

    // Construct a new WP_Query object using the extracted query parameters
    $query = new WP_Query( $query_params );

    // Return the result of the query
    return $query;
}
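To make the matching step concrete, here is a standalone sketch which runs outside WordPress. The two entries in $rules are simplified stand-ins in the same "pattern => query" shape as the rewrite rules; they are illustrative assumptions, not this blog's real rule set, and resolve_query_vars is a hypothetical name.

```php
<?php
// Hypothetical helper: map a URL path to query variables using rewrite-style rules.
function resolve_query_vars( string $url_path, array $rules ): array {
    foreach ( $rules as $pattern => $query ) {
        if ( preg_match( "#^$pattern#", $url_path, $matches ) ) {
            // Substitute each $matches[N] placeholder with the captured value.
            foreach ( $matches as $key => $value ) {
                $query = str_replace( "\$matches[{$key}]", $value, $query );
            }
            // Turn "index.php?a=b&c=d" into array( "a" => "b", "c" => "d" ).
            $query_params = array();
            parse_str( (string) parse_url( $query, PHP_URL_QUERY ), $query_params );
            return $query_params;
        }
    }
    // No rule matched.
    return array();
}

// Illustrative rules only (assumed, not copied from a real site).
$rules = array(
    'tag/([^/]+)/?$'                     => 'index.php?tag=$matches[1]',
    '([0-9]{4})/([0-9]{1,2})/([^/]+)/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&name=$matches[3]',
);

print_r( resolve_query_vars( "tag/solar", $rules ) );
print_r( resolve_query_vars( "2024/05/link-relalternate-typetext-plain/", $rules ) );
```

The resulting array is what would be passed to WP_Query in the real template. Returning an empty array when nothing matches is a choice made for this sketch.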
From there, it's a case of iterating over the posts returned by the query. You can see the full code on my GitLab.
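The rendering step isn't shown in this post. As a loose illustration of "no formatting, no images, but you can see the alt text", here is one way a post's HTML could be flattened to text. This is an assumption about the general approach, not the code in my repository, and html_to_plain_text is a hypothetical name.

```php
<?php
// Hypothetical helper: flatten a fragment of post HTML into plain text.
function html_to_plain_text( string $html ): string {
    // Replace each <img> with its alt text, or drop it if it has none.
    $text = preg_replace_callback(
        '/<img\b[^>]*>/i',
        function ( array $m ): string {
            if ( preg_match( '/\balt\s*=\s*"([^"]*)"/i', $m[0], $alt ) ) {
                return "[Image: " . $alt[1] . "]";
            }
            return "";
        },
        $html
    );

    // End block-level elements with a newline so paragraphs stay separate.
    $text = preg_replace( '#</(p|h[1-6]|li|div)>|<br\s*/?>#i', "\n", $text );

    // Drop the remaining tags and decode entities such as &amp;.
    $text = html_entity_decode( strip_tags( $text ), ENT_QUOTES | ENT_HTML5, "UTF-8" );

    return trim( $text );
}

echo html_to_plain_text(
    '<p>Solar <strong>panels</strong> &amp; batteries</p><p><img src="roof.jpg" alt="Panels on a roof"></p>'
), "\n";
```

Regex-based stripping like this is only good enough for a sketch; alt text containing a ">" character, for instance, would trip it up.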
I love the text version. It's a quick way to get an idea of what the blog post looks like for someone who is using a screen reader - or when a bot crawls the blog for the purposes of (ahem) stealing it to populate a large language model (ahem), or for good old indexing in a search engine.
I didn't realise I had CAPS LOCK on until I typed /.TXT and it 404'd, FYI
@edent says:
PEBCAK 😆
More comments on Mastodon.