Twitter API - pagination and IDs
Looking for some Twitter API help. Bit of a geeky post, this...
Pagination is the act of splitting data into logical pages. Suppose I had a list of items, numbered 0 - 99. If I want 20 items per page, it's trivial to see that pagination looks like:
p1 = 0-19 p2 = 20-39 p3 = 40-59 p4 = 60-79 p5 = 80-99
If I wanted to start at, say, item 55 - pagination would look like:
p1 = 55-74 p2 = 75-94 p3 = 95-99
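For the geeks, here is a minimal sketch of that arithmetic. The function name paginate and its parameters are my own invention for illustration; nothing here comes from Twitter.

```python
def paginate(first, last, per_page, start=None):
    """Split the inclusive range first..last into (start, end) pages."""
    start = first if start is None else start
    pages = []
    while start <= last:
        # Each page holds per_page items, except possibly the last one.
        end = min(start + per_page - 1, last)
        pages.append((start, end))
        start = end + 1
    return pages


print(paginate(0, 99, 20))
print(paginate(0, 99, 20, start=55))
```

Every page covers exactly 20 items until the list runs out, which is why each range ends 19 above where it starts.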
Easy, right? So why am I telling you this?
Twitter Timeline
Imagine that those items are Twitter status IDs. Each one represents a tweet in your timeline.
Twitter will allow us to "page" back and forth through our timeline. If we say status ID 80 is the most recent post in our timeline, and we want to see 20 tweets at a time, pagination would look like this.
p1 = 80-60 p2 = 60-40 ... etc.
Normally, that would be fine.
The only issue is that friends are posting all the time. Imagine we start with tweets 80-60. We go to page 2, but in the meantime, 5 new tweets have been made.
p1 = 80-60 p2 = 65-45
The user sees 5 tweets she has already read. Not desirable.
If 20 new tweets had arrived before she clicked the "next" button, page 2 would simply repeat page 1:
p1 = 80-60 p2 = 80-60
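The drift is easy to reproduce with a toy timeline. This is only a simulation: fetch_page is a hypothetical stand-in for a page-numbered API call, and the IDs are made-up small integers.

```python
def fetch_page(timeline, page, per_page=20):
    """Return one page of a newest-first timeline (pages start at 1)."""
    start = (page - 1) * per_page
    return timeline[start:start + per_page]


# Tweets 80 down to 1, newest first.
timeline = list(range(80, 0, -1))
page1 = fetch_page(timeline, 1)

# Five new tweets (85 down to 81) arrive before the user clicks "next".
timeline = list(range(85, 80, -1)) + timeline
page2 = fetch_page(timeline, 2)

# Tweets the user has already seen on page 1.
repeated = [t for t in page2 if t in page1]
print(repeated)
```

Page 2 is counted from the top of the timeline, so every new tweet pushes one already-read tweet down onto it.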
Max_ID To The Rescue (AKA, the easy bit)
Luckily, Twitter allows us to use a max_id parameter in our API calls. This says "Get the tweets with this ID or older."
http://api.twitter.com/1/statuses/home_timeline.json?max_id=123465789
So, using max_id we can ensure that the user never has to read the same tweet twice. Instead of dumbly using pages, we call the specific tweets we want.
p1 max_id=80 = 80-60
p2 max_id=60 = 60-40
Easy! We just use the oldest tweet on the page as the max_id parameter when we call the next page.
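Here is a rough sketch of that paging, run against a simulated timeline rather than the real API. fake_home_timeline and next_page_url are hypothetical helpers of mine. I am assuming max_id is inclusive (which, as far as I know, is how Twitter documents it), so the sketch asks for the oldest ID minus one to avoid repeating the boundary tweet. The count parameter sets the page size.

```python
from urllib.parse import urlencode

BASE = "http://api.twitter.com/1/statuses/home_timeline.json"


def next_page_url(oldest_id, count=20):
    """Build the URL for the page of tweets older than oldest_id."""
    return BASE + "?" + urlencode({"max_id": oldest_id - 1, "count": count})


def fake_home_timeline(timeline, max_id=None, count=20):
    """Hypothetical stand-in for the API: newest-first tweets with ID <= max_id."""
    if max_id is not None:
        timeline = [t for t in timeline if t <= max_id]
    return timeline[:count]


timeline = list(range(80, 0, -1))
page1 = fake_home_timeline(timeline)

# Five new tweets arrive; with max_id they no longer shift page 2.
timeline = list(range(85, 80, -1)) + timeline
page2 = fake_home_timeline(timeline, max_id=page1[-1] - 1)

print(page1[0], page1[-1])
print(page2[0], page2[-1])
print(next_page_url(page1[-1]))
```

Because the request is pinned to a tweet ID instead of a position, new tweets at the top of the timeline cannot shift the page.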
Looking To The Future (AKA, where it all goes horribly wrong)
So far, we've looked at stepping back in time. Seeing older tweets. Suppose we want to see newer tweets?
Twitter provides us with a since_id parameter for API calls. This says "Get the tweets newer than this number."
Unfortunately, it doesn't work. Well, it works but not the way I expected it to!
Suppose our user is deep down in her tweets. This is how I would expect since_id to work:
max_id=60 = 60-40
(So, let's show any more recent tweets)
since_id=60 = 80-60
We see the 20 tweets that occurred since the since_id. Right? Wrong! If the newest tweet in the timeline is now 100, this is what happens:
max_id=60 = 60-40
(So, let's show any more recent tweets)
since_id=60 = 100-80
What?
An Explanation
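A since_id call retrieves tweets starting with the most recent and works backwards. It stops when it reaches the since_id or when it has filled the page, whichever comes first. With more than 20 newer tweets, I get the newest 20 rather than the 20 just above the since_id.
To get 80-60 with max_id instead, I would need to know the ID of the tweet 20 places above the since_id. I don't know that max_id, so I can't call that.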
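The behaviour is easy to see in a toy model. fake_home_timeline is again a hypothetical stand-in for the API call, not Twitter's code. It assumes results come back newest first and that since_id is exclusive, which matches what I observed.

```python
def fake_home_timeline(timeline, since_id=None, count=20):
    """Hypothetical stand-in for the API: newest-first tweets with ID > since_id."""
    if since_id is not None:
        timeline = [t for t in timeline if t > since_id]
    # The page is filled from the newest end, not from since_id upwards.
    return timeline[:count]


# The newest tweet is now 100; the user last saw tweet 60.
timeline = list(range(100, 0, -1))
newer = fake_home_timeline(timeline, since_id=60)
print(newer[0], newer[-1])
```

The 20 tweets returned are the top of the timeline, so tweets 61 to 80 are skipped entirely.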
I could call the most recent 200 tweets and look for the 20 I need. That's wasteful in terms of bandwidth and processing - there's also no guarantee that the since_id will be in there.
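For what it's worth, the wasteful approach could be sketched like this. newer_page is a hypothetical helper of mine that takes a batch already fetched from the API (newest first) and picks out the page just above the since_id. It returns None when the batch does not reach back as far as the since_id, which is the "no guarantee" case.

```python
def newer_page(recent, since_id, per_page=20):
    """Pick the per_page tweets just newer than since_id from a newest-first batch.

    Returns None if the batch never reaches since_id, because there may be
    a gap of unfetched tweets between the batch and since_id.
    """
    newer = [t for t in recent if t > since_id]
    if len(newer) == len(recent):
        # Every tweet in the batch is newer than since_id (or the batch is empty).
        return None
    # The tweets closest to since_id sit at the end of the newest-first list.
    return newer[-per_page:]


recent = list(range(100, 0, -1))[:200]   # pretend this is the 200-tweet call
page = newer_page(recent, since_id=60)
print(page[0], page[-1])

busy = list(range(400, 200, -1))         # 200 tweets, all newer than 60
print(newer_page(busy, since_id=60))
```

It works when the timeline is quiet, but on a busy timeline the batch never reaches the since_id and the helper has nothing useful to return.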
So, I have a problem. The "Older" link in my Twitter application will work. The "Newer" link won't.
Any suggestions?