Getting Started with Mastodon's Conversations API
The social network service "Mastodon" allows people to publish posts. People can reply to those posts. Other people can reply to those replies - and so on. What does that look like in the API? Here's a quick guide to the concepts you need to know - and some code to help you visualise conversations.
When you scroll through the website, you normally see a list of replies. It looks like this:
Because it acts as a one-dimensional list, there's no easy way to figure out which post someone is replying to.
The data structure underlying the conversation is quite different. It actually looks like this:
Concepts
In Mastodon's API, a post is called a status.
Every status on Mastodon has an ID. This is usually a Snowflake ID which is represented as a number.
When someone replies to a status on Mastodon, they create a new status which has a field called in_reply_to_id. As its name suggests, it holds the ID of the status they are replying to.
Let's imagine this simple conversation:
- Ada: "How are you?"
- Bob: "I'm fine. And you?"
- Ada: "Quite well, thank you!"
Message 2 is in reply to message 1. Message 3 is in reply to message 2.
In Mastodon's jargon, message 1 is the ancestor of message 2. Similarly, message 3 is the descendant of message 2.
    → Descendants →
1--------2-------3
    ← Ancestors ←
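The ancestor relationship can be reproduced locally by following in_reply_to_id from one status to the next. Here is a minimal sketch that uses plain dictionaries in place of real API responses. The `statuses` data and the `ancestors_of` helper are made up for illustration and are not part of Mastodon's API.

```python
# Hypothetical stand-ins for the three statuses in the conversation above.
statuses = {
    1: {"id": 1, "in_reply_to_id": None, "content": "How are you?"},
    2: {"id": 2, "in_reply_to_id": 1, "content": "I'm fine. And you?"},
    3: {"id": 3, "in_reply_to_id": 2, "content": "Quite well, thank you!"},
}


def ancestors_of(status_id, statuses):
    """Follow in_reply_to_id upwards and return ancestor IDs, oldest first."""
    chain = []
    parent_id = statuses[status_id]["in_reply_to_id"]
    while parent_id is not None:
        chain.append(parent_id)
        parent_id = statuses[parent_id]["in_reply_to_id"]
    return list(reversed(chain))


print(ancestors_of(3, statuses))  # [1, 2]
print(ancestors_of(1, statuses))  # []
```

The first status has no in_reply_to_id, which is what stops the loop.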
Branches
Now, let's imagine a more complicated conversation - one with branches!
1. Alice: What's your favourite pizza topping?
├── 2. Bette: Pineapple
│   ├── 4. Chuck: You make me sick!
│   └── 7. Dave: Yeah, I love pineapple too
└── 3. Chuck: Mushrooms are the best
    ├── 5. Alice: Really?
    │   └── 6. Dave: Button mushrooms are best!
    └── 8. Elle: I like them too!
As you can see, people reply in threads. In this example, 2 is on a different "branch" of the conversation than 3.
It looks a bit more complicated with hundreds of replies, but that's it! That's all you need to know!
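To make the idea of a branch concrete, here is a rough sketch that groups the pizza replies by the status they reply to and then lists everything under a given status. The `replies` tuples and the `branch` helper are illustrative inventions; real statuses are much richer objects.

```python
from collections import defaultdict

# Hypothetical stand-ins for the pizza conversation: (id, in_reply_to_id, author).
replies = [
    (1, None, "Alice"), (2, 1, "Bette"), (3, 1, "Chuck"), (4, 2, "Chuck"),
    (5, 3, "Alice"), (6, 5, "Dave"), (7, 2, "Dave"), (8, 3, "Elle"),
]

# Map each status ID to the IDs of its direct replies.
children = defaultdict(list)
for status_id, parent_id, _author in replies:
    children[parent_id].append(status_id)


def branch(status_id):
    """Return status_id followed by everything below it, depth first."""
    result = [status_id]
    for child_id in children[status_id]:
        result.extend(branch(child_id))
    return result


print(branch(2))  # [2, 4, 7]
print(branch(3))  # [3, 5, 6, 8]
```

The two branches share nothing except their common parent, status 1.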
API
If you want to download a single status with an ID of 1234, the API call is /api/v1/statuses/1234.
If you want to download a conversation, it is a little bit more complicated. Mastodon's API calls a conversation a context.
Let's take the above simple example - Ada and Bob speaking. Ada's first status has an ID of 1. To get the conversation, the API call is /api/v1/statuses/1/context.
That returns two things:
- A list of ancestors. This is empty because 1 is the first status in this conversation.
- A list of descendants. This contains statuses 2 and 3.
You will note, the context does not return the status 1 itself.
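Because the status you asked about is left out, you have to put it back yourself if you want the whole exchange in one list. A minimal sketch, assuming a linear conversation like this one and using trimmed, hand-written dictionaries rather than real responses (the `whole_thread` helper is hypothetical):

```python
# Hypothetical, heavily trimmed statuses; real ones carry many more fields.
status = {"id": 1, "content": "How are you?"}
context = {
    "ancestors": [],
    "descendants": [
        {"id": 2, "content": "I'm fine. And you?"},
        {"id": 3, "content": "Quite well, thank you!"},
    ],
}


def whole_thread(status, context):
    """Put the requested status back between its ancestors and descendants."""
    return context["ancestors"] + [status] + context["descendants"]


print([s["id"] for s in whole_thread(status, context)])  # [1, 2, 3]
```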
Let's suppose that, instead of asking for the context of status 1, we asked for 2. This would return:
- A list of ancestors. This contains status 1.
- A list of descendants. This contains status 3.
What about if we asked for 3? This would return:
- A list of ancestors. This contains statuses 1 and 2.
- A list of descendants. This is empty because 3 is the last message in this conversation.
Branches
When it comes to complex threads - like the pizza example - things become a bit more difficult. Let's see the example again:
1. Alice: What's your favourite pizza topping?
├── 2. Bette: Pineapple
│   ├── 4. Chuck: You make me sick!
│   └── 7. Dave: Yeah, I love pineapple too
└── 3. Chuck: Mushrooms are the best
    ├── 5. Alice: Really?
    │   └── 6. Dave: Button mushrooms are best!
    └── 8. Elle: I like them too!
Suppose we ask for the context of the message with ID 5. This would return:
- A list of ancestors. This contains statuses 1 and 3.
- A list of descendants. This contains status 6.
That's it!?!? Where are the rest? They are part of a different conversation branch. Even status 8 isn't returned because it's a reply to 3, not 5.
In order to get the full conversation, we need to be sneaky!
The list of ancestors contains the first message in the conversation. So we can grab that, and then call context again for its ID.
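Before going anywhere near a server, the trick can be rehearsed on local data. The sketch below imitates the shape of a context response for the pizza thread. `parents` and `fake_context` are hypothetical stand-ins, not Mastodon's implementation, and a real server may order descendants differently.

```python
# Hypothetical local copy of the pizza thread: id -> in_reply_to_id.
parents = {1: None, 2: 1, 3: 1, 4: 2, 5: 3, 6: 5, 7: 2, 8: 3}


def fake_context(status_id):
    """Return ancestors (oldest first) and descendants of one status."""
    ancestors = []
    current = parents[status_id]
    while current is not None:
        ancestors.insert(0, current)
        current = parents[current]

    # Collect direct replies, then replies to those replies, and so on.
    descendants = []
    queue = [status_id]
    while queue:
        node = queue.pop(0)
        replies = [child for child, parent in parents.items() if parent == node]
        descendants.extend(replies)
        queue.extend(replies)
    return {"ancestors": ancestors, "descendants": descendants}


context = fake_context(5)
print(context)  # {'ancestors': [1, 3], 'descendants': [6]}

# The trick: jump to the first ancestor, then ask for its context instead.
root_id = context["ancestors"][0] if context["ancestors"] else 5
full = fake_context(root_id)
print(root_id, sorted(full["descendants"]))  # 1 [2, 3, 4, 5, 6, 7, 8]
```

Asking from the root picks up statuses 2, 4, 7 and 8, which the first call missed.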
Let's dive into some Python code to see how it works.
Code
This uses the Mastodon.py library for calling the Mastodon API and the Python treelib to create a conversation tree data structure.
This code connects to Mastodon and retrieves the status for a single ID.
from mastodon import Mastodon
from treelib import Node, Tree

mastodon = Mastodon(
    api_base_url="https://mastodon.example",
    access_token="Your personal access token from your instance",
)

status_id = 109348943537057532
status = mastodon.status(status_id)
Getting the conversation means calling the context API:
conversation = mastodon.status_context(status_id)
⚠ Note: Calling the context on a large thread may take a long time. The longer the conversation, the longer you'll have to wait.
If there are ancestors, that means we are only on a single branch. The 0th ancestor is the top of the conversation tree. So let's get the context for that top status:
if len(conversation["ancestors"]) > 0:
    status = conversation["ancestors"][0]
    status_id = status["id"]
    conversation = mastodon.status_context(status_id)
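The two steps so far can be wrapped into one helper. This is only a sketch: `fetch_whole_conversation` and `FakeClient` are hypothetical names, and the fake client exists purely so the example runs offline. With a real Mastodon.py client you would pass the `mastodon` object instead, since the helper only relies on its status() and status_context() methods.

```python
def fetch_whole_conversation(client, status_id):
    """Return (root_status, context_of_root) for the thread containing status_id.

    `client` is assumed to offer status() and status_context() like Mastodon.py.
    """
    status = client.status(status_id)
    conversation = client.status_context(status_id)
    if conversation["ancestors"]:
        # We started mid-thread, so restart from the top of the tree.
        status = conversation["ancestors"][0]
        conversation = client.status_context(status["id"])
    return status, conversation


class FakeClient:
    """Hypothetical offline stand-in for a Mastodon client (Ada and Bob thread)."""

    _statuses = [{"id": 1}, {"id": 2}, {"id": 3}]

    def status(self, status_id):
        return self._statuses[status_id - 1]

    def status_context(self, status_id):
        return {
            "ancestors": self._statuses[: status_id - 1],
            "descendants": self._statuses[status_id:],
        }


root, conversation = fetch_whole_conversation(FakeClient(), 3)
print(root["id"], [s["id"] for s in conversation["descendants"]])  # 1 [2, 3]
```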
Next, we need to create a data structure to hold the conversation. We'll start by adding to it the first status in the conversation:
tree = Tree()
tree.create_node(status["uri"], status["id"])
Finally, we add any replies which are in the descendants. It is possible that some earlier statuses have been deleted, so we won't add any statuses which are replies to deleted statuses:
for status in conversation["descendants"]:
    try:
        tree.create_node(status["uri"], status["id"], parent=status["in_reply_to_id"])
    except Exception:
        # The parent node is missing from the tree (for example, it was deleted)
        print("Problem adding node to the tree")
That's it! Let's show the tree:
tree.show()
Here's what it should look like:
2022-11-14 20:02 Edent: Today I was meant to be flying in to San Francisco to attend Twitter's Developer Conference - Chirp.Twitter had paid for my flights and hotel, because I was one of their developer insiders. I planned to spend the week meeting friends old and new.Instead, Alan the Hyperprat canceled the conference. So I'm staying in the UK.So I'm going to spend the week hacking on Mastdon's #API and building cool shit. That'll show him!You can see what I'm working on at https://shkspr.mobi/blog/2022/11/building-an-on-this-day-service-for-mastodon/ https://mastodon.social/users/Edent/statuses/109343943300929632
├── 2022-11-14 20:10 Edent: Oh! And I was meant to be attending a Belle & Sebastian gig tonight. I canceled those tickets for I could fly to SF.So far, I reckon Alan's acquisition of Twitter has cost me close to £190.Wonder if he's good for the money? https://mastodon.social/users/Edent/statuses/109343972435801664
│ ├── 2022-11-14 20:14 thehodge: @Edent reminds me of the time I was booked to speak at a conference in Munich and I excitedly booked a behind the scenes tour of the worlds largest miniature city!Then the company went under!Gutted. https://mastodon.social/users/thehodge/statuses/109343989481494630
│ ├── 2022-11-14 21:16 Janiqueka: @Edent the way my bill for him keeps increasing https://mastodon.online/users/Janiqueka/statuses/109344233355230523
│ ├── 2022-11-14 21:19 henry: @Edent I was due to be at B&S tomorrow but it’s been postponed again.. not sure if that makes it better or worse for you! https://social.lc/users/henry/statuses/109344244402822729
│ │ └── 2022-11-15 04:53 Edent: @henry again!? Ah well!Hope you get to see them soon. https://mastodon.social/users/Edent/statuses/109346031194446940
│ ├── 2022-11-15 09:18 Amandafclark: @Edent send him an invoice :) https://mastodon.social/users/Amandafclark/statuses/109347071811426672
│ └── 2022-11-15 11:29 Edent: One of the #MastodonAPI projects I'm working on is a better way to view long & complex threads.You may have seen me build something similar for the other site a while ago - demo at https://shkspr.mobi/blog/2021/09/augmented-reality-twitter-conversations/ - so I'm hoping I can do something similarly interesting.Main limitation is getting *all* of the conversation threads. It looks like the context API isn't paginated. But I might be being thick. https://mastodon.social/users/Edent/statuses/109347587353822637
│ ├── 2022-11-15 11:36 bensb: @Edent Excellent project. You might have seen, but there's also this feature request for better 🧵 handling: https://github.com/mastodon/mastodon/issues/8615 https://genomic.social/users/bensb/statuses/109347612990393791
│ ├── 2022-11-15 11:39 Edent: Cor! That @katebevan is good for engagement! Look at all those conversations she's kicked off! https://mastodon.social/users/Edent/statuses/109347627634008550
│ │ ├── 2022-11-15 11:58 Edent: Indeed, how could they be?That means that ID of a reply is different depending on where you see it.So the ID of this post is:mastodon. social /@ edent/ 123456But when you see it on your server, it might appear as:your. server /@ edent/ 987654The #MastodonAPI copes with this really well. But it is a mite confusing to get one's head around. https://mastodon.social/users/Edent/statuses/109347703064222520
│ │ │ ├── 2022-11-15 12:02 erincandescent: @Edent the numeric IDs are not part of the protocol - it's all URL based. Pleroma uses UUIDs for example https://queer.af/users/erincandescent/statuses/109347716173491502
│ │ │ │ └── 2022-11-15 12:06 Edent: @erincandescent oh! That's interesting. Thanks. https://mastodon.social/users/Edent/statuses/109347734283971306
Once you have a tree, you can format the contents however you like.
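As one example of custom formatting, here is a sketch that renders a thread as indented text without treelib, working directly from a flat list of statuses. The `render` helper and the "author" and "text" keys are simplifications invented for this example; real statuses store that information in other fields.

```python
def render(statuses, root_id):
    """Return the thread as indented text lines, one status per line."""
    # Group statuses by the ID they reply to.
    children = {}
    for status in statuses:
        children.setdefault(status["in_reply_to_id"], []).append(status)
    by_id = {status["id"]: status for status in statuses}

    lines = []

    def walk(status, depth):
        lines.append("    " * depth + f'{status["author"]}: {status["text"]}')
        for reply in children.get(status["id"], []):
            walk(reply, depth + 1)

    walk(by_id[root_id], 0)
    return lines


# Hypothetical, trimmed version of the pizza thread.
statuses = [
    {"id": 1, "in_reply_to_id": None, "author": "Alice", "text": "What's your favourite pizza topping?"},
    {"id": 2, "in_reply_to_id": 1, "author": "Bette", "text": "Pineapple"},
    {"id": 3, "in_reply_to_id": 1, "author": "Chuck", "text": "Mushrooms are the best"},
    {"id": 4, "in_reply_to_id": 2, "author": "Chuck", "text": "You make me sick!"},
]

print("\n".join(render(statuses, 1)))
```

Replies whose parent is missing from the list are never reached by the walk, so they are silently left out, much like the skipped nodes in the treelib version.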
Grab the code
You can download the code for my Mastodon API tools from CodeBerg. Enjoy!
James Ashford says:
Great work. I really like what you've done here. Just having a look at your code, shouldn't the "if len(conversation["ancestors"]) > 0 :" be "while len(conversation["ancestors"]) > 0 :"?
This way you can ensure that you're traversing all the way back up rather than just the first layer.