Graphing the connections between my blog posts


I love ripping off good ideas from other people's blogs. I was reading Alvaro Graves-Fuenzalida's blog when I saw this nifty little force-directed graph:

A graph of interconnected nodes.

When zoomed in, it shows the relation between posts and tags.

Text labels on the nodes show that two of the posts share a common tag.

In this case, I can see that the posts about Small Gods and Pyramids both share the tags of Discworld, Fantasy, and Book Review. But only Small Gods has the tag of Religion.

Isn't that cool! It is a native feature of Quartz's GraphView. How can I build something like that for my WordPress blog?

Aim

Create an interactive graph which shows the relationship between a post, its links, and their tags.

It will end up looking something like this:

A force directed graph showing how four different posts link to each other and how their hashtags relate.

You can get the code or follow along to see how it works.

This is a multi-stage process. Let's begin!

What We Need

When on a single Post, we need the following:

  • The tags assigned to that Post.
  • Internal links back to that Post.
  • Internal links from that Post.
  • The tags assigned to links to and from that Post.

Tags assigned to that Post.

This is pretty easy! Using the get_the_tag_list() function we can, unsurprisingly, get all the tags associated with a post.

//  The arguments are: before, separator, after, post ID
$post_tags_text  = get_the_tag_list( "", ",", "", $ID );
$post_tags_array = explode( ",", $post_tags_text );

That gets each tag as an HTML link to its archive page. If we want the tag names and IDs as data, we need to use the get_the_tags() function.

$post_tags = get_the_tags( $ID );
$tags = array();
//  get_the_tags() returns false when the post has no tags
if ( is_array( $post_tags ) ) {
    foreach ( $post_tags as $tag ) {
        $tags[$tag->term_id] = $tag->name;
    }
}

Finding internal links back to the Post is slightly trickier. WordPress doesn't save relational information like that. Instead, we get the Post's URL and search for that in the database. Then we get the post IDs of all the posts which contain that string.

//  $search_url is the URL of the current Post
//  Get all the posts which link to this one, oldest first
$the_query = new WP_Query(
    array(
        's' => $search_url,
        'post_type' => 'post',
        "posts_per_page" => "-1",
        "order" => "ASC"
    )
);

//  Nothing to do if there are no inbound links
if ( !$the_query->have_posts() ) {
    return;
}
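
One thing to keep in mind: this is a plain text search, so it can over-match. Here's a minimal JavaScript sketch of the idea, using a made-up in-memory list of posts instead of the database. The findBacklinks() helper, the example.com URLs, and the post objects are all hypothetical.

```javascript
// Hypothetical stand-in for the database search.
// `posts` is assumed to be an array of { id, content } objects.
function findBacklinks(posts, targetUrl, targetId) {
    return posts
        .filter(post => post.id !== targetId)              // ignore the post itself
        .filter(post => post.content.includes(targetUrl))  // plain substring match
        .map(post => post.id);
}

const posts = [
    { id: 1, content: '<a href="https://example.com/blog/2024/01/foo">Foo</a>' },
    { id: 2, content: '<a href="https://example.com/blog/2024/01/foo-bar">Foo Bar</a>' },
    { id: 3, content: "No links here." },
];

// Post 2 links to a *different* URL which happens to start with the same text
const backlinks = findBacklinks(posts, "https://example.com/blog/2024/01/foo", 99);
console.log(backlinks);
```

Post 2 is picked up even though it links somewhere else. If your permalinks end with a slash, searching for the URL including that slash should avoid most of these false positives.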

Once we have an array of posts which link back here, we can get their tags as above:

//  Loop through the posts
while ( $the_query->have_posts() ) {
    //  Set it up
    $the_query->the_post();
    $id                  = get_the_ID();
    $title               = esc_html( get_the_title() );
    $url                 = get_the_permalink();
    //  Use the ID of the linking post, not the original Post
    $backlink_tags_text  = get_the_tag_list( "", ",", "", $id );
    $backlink_tags_array = explode( ",", $backlink_tags_text );
}
//  Restore the global post data after a custom loop
wp_reset_postdata();

Links from the Post

Again, WordPress's lack of relational links is a weakness. In order to get internal links, we need to:

  1. Render the HTML using all the filters
  2. Search for all <a href="…">
  3. Extract the ones which start with the blog's domain
  4. Get those posts' IDs.

Rendering the content into HTML is done with:

$content = apply_filters( "the_content", get_the_content( null, false, $ID ) );

Searching for links is slightly more complex. The easiest way is to load the HTML into a DOMDocument, then extract all the anchors. All my blog posts start /blog/YYYY so I can avoid selecting links to tags, uploaded files, or other things. Your blog may be different.

$dom = new DOMDocument();
libxml_use_internal_errors( true ); //  Suppress warnings from malformed HTML
$dom->loadHTML( $content );
libxml_clear_errors();

$links = [];
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
    $href = $anchor->getAttribute( "href" );
    //  No trailing $ anchor: the URL carries on after the four digit year
    if ( preg_match( '/^https:\/\/shkspr\.mobi\/blog\/\d{4}/', $href ) ) {
        $links[] = $href;
    }
}
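
To see what that filter lets through, here's a rough JavaScript sketch of the same check. The example.com domain, the sample URLs, and the isInternalPostLink() name are placeholders of mine. The pattern only anchors the start of the URL, because a real post URL carries on after the year.

```javascript
// Assumed pattern: post URLs start with /blog/ followed by a four digit year.
// example.com is a placeholder domain.
const POST_URL = /^https:\/\/example\.com\/blog\/\d{4}\//;

function isInternalPostLink(href) {
    return POST_URL.test(href);
}

const hrefs = [
    "https://example.com/blog/2024/12/some-post/",                   // a post
    "https://example.com/blog/tag/discworld/",                       // a tag page
    "https://example.com/blog/wp-content/uploads/2024/12/image.png", // an upload
    "https://elsewhere.example/blog/2024/12/other/",                 // another site
];

// Only the first URL should survive the filter
const internal = hrefs.filter(isInternalPostLink);
console.log(internal);
```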

The ID of each post can be found with the url_to_postid() function. That means we can re-use the earlier code to see what tags those posts have.

Building a graph

OK, so we have all our constituent parts. Let's build a graph!

Graphs consist of nodes (posts and tags) and edges (links between them). The exact format of the graph is going to depend on the graph library we use.

I've decided to use D3.js's Force Graph as it is relatively simple and produces a reasonably good looking interactive SVG.

Imagine there are two blog posts and two hashtags.

const nodes = [
    { id: 1, label: "Blog Post 1",    url: "https://example.com/post/1", group: "post" },
    { id: 2, label: "Blog Post 2",    url: "https://example.com/post/2", group: "post" },
    { id: 3, label: "hashtag",        url: "https://example.com/tag/3",  group: "tag"  },
    { id: 4, label: "anotherHashtag", url: "https://example.com/tag/4",  group: "tag"  },
];

  • Blog Post 1 links to Blog Post 2.
  • Blog Post 1 has a #hashtag.
  • Both 1 & 2 share #anotherHashtag.

const links = [
    { source: 1, target: 2 },
    { source: 3, target: 1 },
    { source: 4, target: 1 },
    { source: 4, target: 2 },
];
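
Every source and target has to match the id of a node. As far as I know, D3's link force throws a "node not found" error if one doesn't, so it can be worth checking the data first. This is only a sketch: findDanglingLinks() is a name I've made up, and the sample data repeats the example above with one deliberately broken link added.

```javascript
// Hypothetical helper: return the links whose source or target
// doesn't match the id of any node.
function findDanglingLinks(nodes, links) {
    const ids = new Set(nodes.map(node => node.id));
    return links.filter(link => !ids.has(link.source) || !ids.has(link.target));
}

const sampleNodes = [
    { id: 1, label: "Blog Post 1",    group: "post" },
    { id: 2, label: "Blog Post 2",    group: "post" },
    { id: 3, label: "hashtag",        group: "tag"  },
    { id: 4, label: "anotherHashtag", group: "tag"  },
];

const sampleLinks = [
    { source: 1, target: 2 },
    { source: 3, target: 1 },
    { source: 4, target: 1 },
    { source: 4, target: 2 },
    { source: 5, target: 1 }, // deliberately broken: there is no node 5
];

const dangling = findDanglingLinks(sampleNodes, sampleLinks);
console.log(dangling);
```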

Here's how to create a list of nodes and their links. You will need to edit it for your own blog's peculiarities.

<?php
// Load WordPress environment
require_once( "wp-load.php" );

//  Set up arrays for nodes and links
$nodes = array();
$links = array();

//  ID of the Post
$main_post_id = 12345;

//  Get the Post's details
$main_post_url   = get_permalink( $main_post_id );
$main_post_title = get_the_title( $main_post_id );

//  Function to add new nodes
function add_item_to_nodes( &$nodes, $id, $label, $url, $group ) {
    $nodes[] = [
        "id"    => $id,
        "label" => $label,
        "url"   => $url,
        "group" => $group
    ];

}

//  Function to add new relationships
function add_relationship( &$links, $source, $target ) {
    $links[] = [
        "source" => $source,
        "target" => $target
    ];
}

//  Add Post to the nodes
add_item_to_nodes( $nodes, $main_post_id, $main_post_title, $main_post_url, "post" );

//  Get the tags of the Post (get_the_tags() returns false if there are none)
$main_post_tags = get_the_tags( $main_post_id ) ?: array();

//  Add the tags as nodes, and create links to main Post
foreach( $main_post_tags as $tag ) {
    $id   = $tag->term_id;
    $name = $tag->name;

    //  Add the node
    add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
    //  Add the relationship
    add_relationship( $links, $id, $main_post_id );
}

//  Get all the posts which link to this one, oldest first
$the_query = new WP_Query(
    array(
        's'              => $main_post_url,
        'post_type'      => 'post',
        "posts_per_page" => "-1",
        "order"          => "ASC"
    )
);

//  Only loop if there are inbound links
if ( $the_query->have_posts() ) {
    //  Loop through the posts
    while ( $the_query->have_posts() ) {
        //  Set up the query
        $the_query->the_post();
        $post_id = get_the_ID();
        $title = esc_html( get_the_title() );
        $url   = get_the_permalink();

        //  Add the node
        add_item_to_nodes( $nodes, $post_id, $title, $url, "post" );
        //  Add the relationship
        add_relationship( $links, $post_id, $main_post_id );

        //  Get the tags of the Post (get_the_tags() returns false if there are none)
        $post_tags = get_the_tags( $post_id ) ?: array();

        //  Add the tags as nodes, and create links to main Post
        foreach($post_tags as $tag) {

            $id   = $tag->term_id;
            $name = $tag->name;

            //  Add the node
            add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
            //  Add the relationship
            add_relationship( $links, $id, $post_id );
        }

    }
    //  Restore the global post data after the custom loop
    wp_reset_postdata();
}

//  Get all the internal links from this post
//  Render the post as HTML
$content = apply_filters( "the_content", get_the_content( null, false, $main_post_id ) );

//  Load it into HTML
$dom = new DOMDocument();
libxml_use_internal_errors( true );
$dom->loadHTML( $content );
libxml_clear_errors();

//  Get any <a href="…" which starts with https://shkspr.mobi/blog/
$internal_links = [];
foreach ( $dom->getElementsByTagName( "a" ) as $anchor ) {
    $href = $anchor->getAttribute( "href" );
    //  No trailing $ anchor: the URL carries on after the four digit year
    if ( preg_match( '/^https:\/\/shkspr\.mobi\/blog\/\d{4}/', $href ) ) {
        $internal_links[] = $href;
    }
}

//  Loop through the internal links, get their hashtags
foreach ( $internal_links as $url ) {
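    $post_id = url_to_postid( $url );
    //  url_to_postid() returns 0 if the URL doesn't belong to a post
    if ( 0 === $post_id ) {
        continue;
    }
    //  Get the Post's details
    $post_title = get_the_title( $post_id );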

    //  Add the node
    add_item_to_nodes( $nodes, $post_id, $post_title, $url, "post" );
    //  Add the relationship
    add_relationship($links, $main_post_id, $post_id );

    //  Get the tags of the Post (get_the_tags() returns false if there are none)
    $post_tags = get_the_tags( $post_id ) ?: array();

    //  Add the tags as nodes, and create links to main Post
    foreach( $post_tags as $tag ) {
        $id   = $tag->term_id;
        $name = $tag->name;

        //  Add the node
        add_item_to_nodes( $nodes, $id, $name, "https://shkspr.mobi/blog/tag/" . $name, "tag" );
        //  Add the relationship
        add_relationship( $links, $id, $post_id );
    }
}

//  Deduplicate the nodes and links
$nodes_unique = array_unique( $nodes, SORT_REGULAR );
$links_unique = array_unique( $links, SORT_REGULAR );

//  Put them in the keyless format that D3 expects
$nodes_output = array();
$links_output = array();

foreach ( $nodes_unique as $node ) {
    $nodes_output[] = $node;
}

foreach ( $links_unique as $link ) {
    $links_output[] = $link;
}

//  Return the JSON
echo json_encode( $nodes_output, JSON_PRETTY_PRINT );
echo "\n";
echo json_encode( $links_output, JSON_PRETTY_PRINT );

Creating a Force Directed SVG

Once the data are spat out, you can include them in a web-page. Here's a basic example:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Force Directed Graph</title>
        <script src="https://d3js.org/d3.v7.min.js"></script>
    </head>
    <body>
        <svg width="800" height="600">
            <defs>
                <marker id="arrowhead" markerWidth="10" markerHeight="7" refX="10" refY="3.5" orient="auto" fill="#999">
                <path d="M0,0 L10,3.5 L0,7 Z"></path>
                </marker>
            </defs>
        </svg>
        <script>
            //  Paste the JSON from the PHP script into these two arrays
            const nodes = [];
            const links = [];

            const width  = 800;
            const height = 600;

            const svg = d3.select("svg")
                .attr( "width",  width  )
                .attr( "height", height );

            const simulation = d3.forceSimulation( nodes )
                .force( "link",   d3.forceLink( links ).id( d => d.id ).distance( 100 ) )
                .force( "charge", d3.forceManyBody().strength( -300 ) )
                .force( "center", d3.forceCenter( width / 2, height / 2 ) );

            //  Run simulation with simple animation
            simulation.on("tick", () => {
                link
                    .attr("x1", d => d.source.x)
                    .attr("y1", d => d.source.y)
                    .attr("x2", d => d.target.x)
                    .attr("y2", d => d.target.y);
                node
                    .attr("transform", d => `translate(${d.x},${d.y})`);
            });

            // Draw links
            const link = svg.selectAll( ".link" )
                .data(links)
                .enter().append("line")
                .attr( "stroke", "#999" )
                .attr( "stroke-width", 2 )
                .attr( "x1", d => d.source.x )
                .attr( "y1", d => d.source.y )
                .attr( "x2", d => d.target.x )
                .attr( "y2", d => d.target.y )
                .attr( "marker-end", "url(#arrowhead)" );

            //  Draw nodes
            const node = svg.selectAll( ".node" )
                .data( nodes )
                .enter().append( "g" )
                .attr( "class", "node" )
                .attr( "transform", d => `translate(${d.x},${d.y})` )
                .call(d3.drag() //  Make nodes draggable
                    .on( "start", dragStarted )
                    .on( "drag",  dragged )
                    .on( "end",   dragEnded )
                );

            //  Add hyperlink
            node.append("a")
            .attr( "xlink:href", d => d.url ) //    Link to the node's URL
            .attr( "target", "_blank" ) //  Open in a new tab
            .each(function (d) {
                const a = d3.select(this);
                //  Different shapes for posts and tags
                if ( d.group === "post" ) {
                    a.append("circle")
                        .attr("r", 10)
                        .attr("fill", "blue");
                } else if ( d.group === "tag" ) {
                    //  White background rectangle
                    a.append("rect")
                            .attr("width", 20)
                            .attr("height", 20)
                            .attr("x", -10)
                            .attr("y", -10)
                            .attr("fill", "white");
                    // Red octothorpe
                    a.append("path")
                            .attr("d", "M-10,-5 H10 M-10,5 H10 M-5,-10 V10 M5,-10 V10")
                            .attr("stroke", "red")
                            .attr("stroke-width", 2)
                            .attr("fill", "none");
                }
                //  Text label
                a.append( "text")
                    .attr( "dy", 4 )
                    .attr( "x", d => ( d.group === "post" ? 12 : 14 ) )
                    .attr( "fill", "black" )
                    .style("font-size", "12px" )
                    .text( d.label );
            });

            //  Standard helper functions to make nodes draggable
            function dragStarted( event, d ) {
                if ( !event.active ) simulation.alphaTarget(0.3).restart();
                d.fx = d.x;
                d.fy = d.y;
            }
            function dragged( event, d ) {
                d.fx = event.x;
                d.fy = event.y;
            }
            function dragEnded( event, d ) {
                if (!event.active) simulation.alphaTarget(0);
                d.fx = null;
                d.fy = null;
            }
        </script>
    </body>
</html>

Next Steps

It needs a bit of cleaning up if I want to turn it into a WordPress plugin. It might be nice to make it a static SVG rather than relying on JavaScript. And the general æsthetic needs a bit of work.

Perhaps I could make it 3D like my MSc Dissertation?

But I'm pretty happy with that for an afternoon hack!

You can get the code if you want to play.



One thought on “Graphing the connections between my blog posts”

  1. said on mastodon.xyz:

    @Edent Awesome. I'm been thinking to expand my "thinking in network" about things by adding some sort of graph view of relations between posts on my digital garden https://julianoe.eu.org/tags/pens%c3%a9e-en-r%c3%a9seau/

    I tried to replicate something similar to what @elly did ellyloel.com/garden/

    Not had the time to implement it for now.

    Your work will help me greatly I think!

    The local software @zettlr has a graph view feature included and I find it so cool

    Carnet - Julianoë

