link rel="alternate" type="text/plain"


Hot on the heels of yesterday's post, I've now made all of this blog available in text-only mode.

Simply append .txt to the URL of any page and you'll get back the contents in plain UTF-8 text. No formatting, no images (although you can see the alt text), no nothing!

This was slightly tricky to get right! While there might be an easier way to do it, here's how I got it to work.

Firstly, when someone requests /whatever.txt, WordPress is going to 404 - because that page doesn't exist. So my theme's functions.php detects any URLs which end in .txt and switches them to a different template.

//  Theme Switcher
add_filter( "template_include", "custom_theme_switch" );
function custom_theme_switch( $template ) {

    //  What was requested?
    $requested_url = $_SERVER["REQUEST_URI"];

    //  Check if the URL ends with .txt
    if ( substr( $requested_url, -4 ) === ".txt" ) {
        //  Get the path to the custom template
        $custom_template = get_template_directory() . "/templates/txt-template.php";
        //  Check if the custom template exists
        if ( file_exists( $custom_template ) ) {
            return $custom_template;
        }
    }

    //  Return the default template
    return $template;
}
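
One caveat with that check: `$_SERVER["REQUEST_URI"]` includes any query string, so a request like /whatever.txt?utm_source=feed would not end in .txt and would fall through to the default template. A minimal sketch of a stricter test, assuming a hypothetical helper called `is_txt_request()` (it is not part of WordPress or of the theme code above):

```php
<?php
//  Hypothetical helper: returns true if the path part of a request URI ends in .txt
function is_txt_request( string $request_uri ): bool {
    //  Keep only the path, dropping any query string or fragment
    $path = parse_url( $request_uri, PHP_URL_PATH );

    //  parse_url() gives null when there is no path and false for a malformed URI
    if ( ! is_string( $path ) ) {
        return false;
    }

    return substr( $path, -4 ) === ".txt";
}

$plain      = is_txt_request( "/whatever.txt" );                  //  true
$with_query = is_txt_request( "/whatever.txt?utm_source=feed" );  //  true
$normal     = is_txt_request( "/whatever" );                      //  false
```

If something like this were used in the filter, the later `substr( $requested_url, 0, -4 )` step would need the same path-only treatment.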

The txt-template.php file is more complex. It takes the requested URL, strips off the .txt, matches it against the WordPress rewrite rules, and then constructs the WP_Query which would have been run if the .txt wasn't there.

//  Run the query for the URL requested
$requested_url = $_SERVER['REQUEST_URI'];    // This will be /whatever.txt
$blog_details = wp_parse_url( home_url() );  // Get the blog's scheme and domain to construct a full URL
$query = get_query_for_url(
    $blog_details["scheme"] . "://" . $blog_details["host"] . substr( $requested_url, 0, -4 )
);

function get_query_for_url( $url ) {
    //  Get all the rewrite rules
    global $wp_rewrite;

    //  Get the WordPress site URL path
    $site_path = parse_url( get_site_url(), PHP_URL_PATH ) . "/";

    //  Parse the requested URL
    $url_parts = parse_url( $url );

    //  Remove the domain and site path from the URL
    //  For example, change `https://example.com/blog/2024/04/test` to just `2024/04/test`
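    //  Only strip the site path from the start, so a root install (site path `/`) keeps its other slashes
    $url_path = isset( $url_parts['path'] ) ? $url_parts['path'] : '';
    if ( strpos( $url_path, $site_path ) === 0 ) {
        $url_path = substr( $url_path, strlen( $site_path ) );
    }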

    //  Match the URL against WordPress rewrite rules
    $rewrite_rules = $wp_rewrite->wp_rewrite_rules();
    $matched_rule = false;

    foreach ( $rewrite_rules as $pattern => $query ) {
        if ( preg_match( "#^$pattern#", $url_path, $matches ) ) {
            $matched_rule = $query;
            break;
        }
    }

    //  Replace each occurrence of $matches[N] with the corresponding value
    foreach ( $matches as $key => $value ) {
        $matched_rule = str_replace( "\$matches[{$key}]", $value, $matched_rule );
    }

    //  Turn the query string into a WordPress query
    $query_params = array();
    parse_str(
        parse_url( $matched_rule, PHP_URL_QUERY),
        $query_params
    );

    //  Construct a new WP_Query object using the extracted query parameters
    $query = new WP_Query($query_params);

    //  Return the result of the query
    return $query;
}
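
To see what the matching and placeholder substitution actually produce, here's a standalone sketch which runs without WordPress. The rewrite rule is a simplified, made-up example in the same "regex => query template" shape that WordPress uses, and `resolve_query_vars()` is a hypothetical name rather than anything from my template:

```php
<?php
//  Illustrative rule only, not copied from a real site's rewrite table
$rewrite_rules = array(
    "([0-9]{4})/([0-9]{1,2})/([^/]+)/?$" => 'index.php?year=$matches[1]&monthnum=$matches[2]&name=$matches[3]',
);

//  Hypothetical helper: find the first matching rule and return its query variables
function resolve_query_vars( array $rewrite_rules, string $url_path ): array {
    foreach ( $rewrite_rules as $pattern => $query ) {
        if ( preg_match( "#^$pattern#", $url_path, $matches ) ) {
            //  Swap each $matches[N] placeholder for the captured value
            foreach ( $matches as $key => $value ) {
                $query = str_replace( "\$matches[{$key}]", $value, $query );
            }

            //  Keep only the part after the "?" and parse it into an array
            $query_params = array();
            parse_str( (string) parse_url( $query, PHP_URL_QUERY ), $query_params );
            return $query_params;
        }
    }

    //  No rule matched
    return array();
}

$query_vars = resolve_query_vars( $rewrite_rules, "2024/04/test" );
//  $query_vars is array( "year" => "2024", "monthnum" => "04", "name" => "test" )
```

That array is the sort of thing which gets handed to `new WP_Query()`. Returning an empty array when nothing matches is a choice made for this sketch.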

From there, it's a case of iterating over the posts returned by the query. You can see the full code on my GitLab.
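
Each post's HTML still has to be flattened into plain text while keeping the image alt text. The following is only a rough sketch of one way that could be done, not the code from the repository. The helper name `html_to_plain_text()` and the `[Image: …]` format are assumptions made for illustration:

```php
<?php
//  Hypothetical helper: turn a fragment of post HTML into plain UTF-8 text
function html_to_plain_text( string $html ): string {
    //  Replace each image with its alt text so the description survives
    $text = preg_replace( '/<img\b[^>]*\balt="([^"]*)"[^>]*>/i', '[Image: $1]', $html );

    //  Drop every remaining tag (images without alt text disappear here)
    $text = strip_tags( $text );

    //  Decode entities such as &amp; into real characters
    $text = html_entity_decode( $text, ENT_QUOTES | ENT_HTML5, "UTF-8" );

    return trim( $text );
}

$plain_text = html_to_plain_text( '<p>Hello &amp; welcome <img src="cat.jpg" alt="A cat"></p>' );
//  $plain_text is "Hello & welcome [Image: A cat]"
```

This ignores paragraph breaks, lists, and links, all of which a real text-only template would need to handle.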



3 thoughts on “link rel="alternate" type="text/plain"”

  1. I love the text version. It's a quick way to get an idea of what the blog post looks like for someone who is using a screen reader - or when a bot crawls the blog for the purposes of (ahem) stealing it to populate a large language model (ahem), or for good old indexing in a search engine.
