Import JetPack Statistics into Koko
I've quit JetPack stats. I've moved to Koko Analytics. All the stats code is self hosted, it is privacy preserving, and the codebase is small and simple.
But I am vain. I want all my old JetPack stats to appear in Koko so I can look back on the glory days of blogging.
Koko has two main tables. The first is a summary table called wpbp_koko_analytics_site_stats
:
date | visitors | pageviews |
---|---|---|
2009-10-30 | 25 | 47 |
2009-10-31 | 70 | 86 |
2009-11-01 | 61 | 72 |
That's the total number of visitors and page views for each date.
Then there's a more detailed breakdown at wpbp_koko_analytics_post_stats
date | id | visitors | pageviews |
---|---|---|---|
2009-10-30 | 123 | 2 | 2 |
2009-10-30 | 456 | 5 | 6 |
2009-10-30 | 789 | 1 | 1 |
That shows every individual post's number of views and visitors per day.
WordPress's database is MySQL. It can handle CSV imports. So, given these tables are pretty simple, it is possible to get the old WordPress stats, convert them to CSV, and then import them.
Let's go!
As described in a previous post, the JetPack stats API is fairly basic. It doesn't differentiate between visitors and pageviews. So, for this import, we'll pretend they're the same.
This will be a 4 step process.
1. Get all your stats in JSON format
This code loops through your stats and downloads a JSON file for each one.
You will need:
- Your API key - find it at https://apikey.wordpress.com/
- Your blog's web address - I assume you know this
- The earliest date you have for JetPack - you will need to find this yourself
Python 3import requests
import datetime
import os
import json
# Directory to save the JSON files
save_dir = "jetpack_stats"
os.makedirs(save_dir, exist_ok=True)
# URL of the API
base_url = "https://stats.wordpress.com/csv.php?api_key=123456789012"+\
"&blog_uri=https://example.com/"+\
"&table=postviews"+\
"&days=1"+\
"&format=json"+\
"&limit=-1"+\
"&end="
# Make API call and save the response
def fetch_and_save_json(date):
# Format the date as ISO8601 (YYYY-MM-DD)
formatted_date = date.isoformat()
# Make the API call
url = f"{base_url}{formatted_date}"
response = requests.get(url)
if response.status_code == 200:
data = response.json()
file_name = f"{formatted_date}.json"
file_path = os.path.join(save_dir, file_name)
with open(file_path, "w") as f:
json.dump(data, f, indent=4)
print(f"Saved {formatted_date}")
else:
print(f"Failed! {formatted_date} status code: {response.status_code}")
# Iterate over a date range
start_date = datetime.date(2020, 1 , 1)
end_date = datetime.date(2024, 10, 30)
# Loop through all dates
current_date = start_date
while current_date <= end_date:
fetch_and_save_json(current_date)
current_date += datetime.timedelta(days=1)
2. Generate The Individual Posts' Stats
This takes those JSON files and creates a single CSV ready to upload to wpbp_koko_analytics_post_stats
.
Python 3import os
import json
import csv
# Directory where the JSON files are stored
json_dir = "jetpack_stats"
# List to hold the loaded data
all_data = []
json_files = sorted( [ file for file in os.listdir( json_dir ) if file.endswith( ".json" ) ] )
# Loop through all files in the directory
for file_name in json_files :
if file_name.endswith( ".json" ) :
file_path = os.path.join(json_dir, file_name)
with open(file_path, "r") as json_file :
data = json.load(json_file)
all_data.append(data) # Add the data to the list
print(f"Total files loaded: {len(all_data)}")
with open( "wpbp_koko_analytics_post_stats.csv", "w", newline="") as csvfile:
csvwriter = csv.writer( csvfile, delimiter="," )
for stat in all_data:
for views in stat[0]["postviews"] :
csvwriter.writerow( [ stat[0]["date"], views["post_id"], views["views"], views["views"] ] )
3. Generate the total site views
It is possible to get this separately from the JetPack API using &table=views
- but that's a lot more API calls. So we're just going to sum it up instead 😄
This, again, inserts a dummy value for visitors.
Python 3import pandas as pd
input_csv = "wpbp_koko_analytics_post_stats.csv"
output_csv = "wpbp_koko_analytics_site_stats.csv"
column_names = ['Date', 'Post ID', 'Visitors', 'Page Views']
df = pd.read_csv( input_csv, names=column_names )
# Group by Date and sum the Page Views
df_grouped = df.groupby( "Date" )["Page Views"].sum().reset_index()
# Add a new column with a copy of the Page Views
df_grouped['Visitors'] = df_grouped['Page Views']
# Rename the Page Views column to Total Page Views
df_grouped = df_grouped.rename(columns={"Page Views": "Total Page Views"})
# Write the dataframe to the output CSV file
df_grouped.to_csv( output_csv, index=False, header=False )
4. Upload to MySQL
You're on your own here, Sparky. If you have something like PHPMyAdmin, you should be able to load the file in directly. Anything else… good luck!
Once done, your stats dashboard should be filled with historic data.
Huge thanks to Koko Analytics for providing such a great tool and answering my questions.
Danny van Kooten said on toot.re:
@Edent So cool that that worked! Great write-up, I'm definitely going to be referring to it once I get to turning this into a feature.
Jan van den Berg says:
Great! I did not know about Koko before, will definitely look into it. See how accurate it is.
I've been using 3 plugins: Jetpack, Post Views Counter and WordPress Popular Posts side by side for a while now, and they all show different stats (which is strange and I can't really explain).
quantixed said on fosstodon.org:
@Edent Thank you for writing this. I just did the same for my site. Worked like a treat*. Bye bye Jetpack.
* used blog_id rather than blog_uri (which I couldn't get to work) for API call
Jan van den Berg says:
I followed your scripts, and they worked like a charm! Many thanks. Follow-up question: would it also be possible to get the referrers history from the api?
@edent says:
Change the
table
frompostviews
to one of eitherreferrers
orreferrers_grouped
That should work.
Neil Turner said on neilturner.me.uk:
Back in March, I stopped using the Jetpack WordPress plugin, and replaced it with Toolbelt, which replicates many of Jetpack’s features. I’ve been concerned about…
More comments on Mastodon.