link rel="alternate" type="text/plain"


Hot on the heels of yesterday's post, I've now made all of this blog available in text-only mode.

Simply append .txt to the URl of any page and you'll get back the contents in plain UTF-8 text. No formatting, no images (although you can see the alt text), no nothing!

This was slightly tricky to get right! While there might be an easier way to do it, here's how I got it to work.

Firstly, when someone requests /whatever.txt, WordPress is going to 404 - because that page doesn't exist. So, my theme's functions.php, detects any URls which end in .txt and redirects it to a different template.

 PHP//  Theme Switcher
add_filter( "template_include", "custom_theme_switch" );
function custom_theme_switch( $template ) {

    //  What was requested?
    $requested_url = $_SERVER["REQUEST_URI"];

    //  Check if the URL ends with .txt
    if ( substr( $requested_url, -4 ) === ".txt")  {  
        //  Get the path to the custom template
        $custom_template = get_template_directory() . "/templates/txt-template.php";
        //  Check if the custom template exists
        if ( file_exists( $custom_template ) ) {
            return $custom_template;
        }
    }

    //  Return the default template
    return $template;
}

The txt-template.php file is more complex. It takes the requested URl, strips off the .txt, matches it against the WordPress rewrite rules, and then constructs the WP_Query which would have been run if the .txt wasn't there.

 PHP//  Run the query for the URl requested
$requested_url = $_SERVER['REQUEST_URI'];    // This will be /whatever
$blog_details = wp_parse_url( home_url() );  // Get the blog's domain to construct a full URl
$query = get_query_for_url(
    $blog_details["scheme"] . "://" . $blog_details["host"] . substr( $requested_url, 0, -4 )
);

function get_query_for_url( $url ) {
    //  Get all the rewrite rules
    global $wp_rewrite;

    //  Get the WordPress site URL path
    $site_path = parse_url( get_site_url(), PHP_URL_PATH ) . "/";

    //  Parse the requested URL
    $url_parts = parse_url( $url );

    //  Remove the domain and site path from the URL
    //  For example, change `https://example.com/blog/2024/04/test` to just `2024/04/test`
    $url_path = isset( $url_parts['path'] ) ? str_replace( $site_path, '', $url_parts['path'] ) : '';

    //  Match the URL against WordPress rewrite rules
    $rewrite_rules = $wp_rewrite->wp_rewrite_rules();
    $matched_rule = false;

    foreach ( $rewrite_rules as $pattern => $query ) {
        if ( preg_match( "#^$pattern#", $url_path, $matches ) ) {
            $matched_rule = $query;
            break;
        }
    }

    //  Replace each occurrence of $matches[N] with the corresponding value
    foreach ( $matches as $key => $value ) {
        $matched_rule = str_replace( "\$matches[{$key}]", $value, $matched_rule );
    }

    //  Turn the query string into a WordPress query
    $query_params = array();
    parse_str(
        parse_url( $matched_rule, PHP_URL_QUERY),
        $query_params
    );

    //  Construct a new WP_Query object using the extracted query parameters
    $query = new WP_Query($query_params);

    //  Return the result of the query
    return $query;
}

From there, it's a case of iterating over the posts returned by the query. You can see the full code on my GitLab.


Share this post on…

  • Mastodon
  • Facebook
  • LinkedIn
  • BlueSky
  • Threads
  • Reddit
  • HackerNews
  • Lobsters
  • WhatsApp
  • Telegram

3 thoughts on “link rel="alternate" type="text/plain"”

  1. I love the text version. It's a quick way to get an idea of what the blog post looks like for someone who is using a screen reader - or when a bot crawls the blog for the purposes of (ahem) stealing it to populate a large language model (ahem), or for good old indexing in a search engine.
    Reply

What links here from around this blog?

  1. The Logo for WordPress. link rel="alternate" type="text/plain"

What are your reckons?

All comments are moderated and may not be published immediately. Your email address will not be published.

Allowed HTML: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <p> <pre> <br> <img src="" alt="" title="" srcset="">