Which Twitter User Receives The Most Citations on Wikipedia?

A few days ago, I was somewhat surprised to find that one of my Tweets had been used as a citation in Wikipedia!

I began to wonder - how often are Tweets used in citations?

It's possible to search for your own Tweets using this (somewhat obscure) link:


Just edit the end of it to see if you, or your friends, have been cited. Note - the username is case sensitive, so "Edent" isn't the same as "edent".

For example, we can see where Cory Doctorow is cited:


Aha! The page on the New Zealand Internet Blackout references:

Ok, so which Twitter user has been cited the most? TO THE API, ROBIN!

Wikipedia's own help pages are a little lacking, so I went to the help pages of the software which runs Wikipedia - MediaWiki.

We want to search for external URLs which point to Twitter and have a namespace of 0 (that means they're articles, not talk pages). We can grab a maximum of 500 results at a time, using JSON, and we want to include "www.twitter.com" and "twitter.com". Here's what we use.


Run it yourself to see the results.
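If you'd rather see the query broken into its parts, here's a minimal sketch that assembles the same request from individual parameters. The parameter values are the ones described above; the `build_query_url` helper is just a name I've made up for illustration, not something MediaWiki provides.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

def build_query_url(offset=0):
    """Assemble the exturlusage query for a given result offset (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "exturlusage",
        # The wildcard covers both "twitter.com" and "www.twitter.com"
        "euquery": "*.twitter.com",
        # Namespace 0 means articles, not talk pages
        "eunamespace": 0,
        # 500 is the most results we can grab in one request
        "eulimit": 500,
        "euoffset": offset,
        "format": "json",
    }
    # safe="*" keeps the wildcard readable instead of encoding it as %2A
    return API_ENDPOINT + "?" + urlencode(params, safe="*")

print(build_query_url(500))
```

Changing the `offset` argument in steps of 500 walks through successive pages of results.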


Crappy Python!

import json
import urllib.error
import urllib.request
from collections import Counter

api = "https://en.wikipedia.org/w/api.php?action=query&eunamespace=0&eulimit=500&format=json&list=exturlusage&euquery=*.twitter.com"

euoffset = 0

#   Every username we find, one entry per citation
words = []

while euoffset < 17500:
    try:
        site_data = json.load(urllib.request.urlopen(api + "&euoffset=" + str(euoffset)))
        #   Iterate through the results
        for element in site_data['query']['exturlusage']:
            #   Remove HashBangs #!
            #   Lowercase everything
            twitterURL = element['url'].replace("/#!","").lower()
            #   Treat www.twitter.com the same as twitter.com
            twitterURL = twitterURL.replace("://www.","://")
            twitterUser = twitterURL.replace("http://twitter.com/","")
            twitterUser = twitterUser.replace("https://twitter.com/","")
            twitterUser = twitterUser.replace("@","")
            #   Keep only the part before the first slash - the username
            slash = twitterUser.find("/")
            if slash > 0:
                twitterUser = twitterUser[:slash]
            print(twitterURL)
            # print(twitterUser)
            words.append(twitterUser)
    except urllib.error.URLError:
        print("Unable to retrieve data")
    euoffset += 500

#   Most cited user
word_counts = Counter(words)
print(word_counts)
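The chain of string replacements above does the job, but it's easy to miss a URL shape. As an alternative sketch (my own, and not the code that produced the results below), the username can be pulled out with `urllib.parse` instead. The `extract_username` helper and the sample URLs are invented for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

def extract_username(url):
    """Return the lowercased Twitter username from a URL, or None if there isn't one."""
    # Drop the old-style hashbang so "/#!/user" becomes "/user"
    parsed = urlparse(url.replace("/#!", ""))
    host = parsed.netloc.lower()
    # Accept twitter.com and any subdomain (www, mobile, ...)
    if host != "twitter.com" and not host.endswith(".twitter.com"):
        return None
    # The first non-empty path segment is taken to be the username.
    # Caveat: non-user paths such as "/search" would be counted too.
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        return None
    return segments[0].lstrip("@").lower() or None

# Made-up sample URLs, not real API output
sample_urls = [
    "http://twitter.com/#!/Example_User/status/123",
    "https://www.twitter.com/@example_user",
    "//twitter.com/other_user",
    "https://twitter.com/",
    "https://example.com/not_twitter",
]

counts = Counter(u for u in map(extract_username, sample_urls) if u)
print(counts.most_common(10))
```

`Counter.most_common(10)` returns the ten highest tallies in descending order, which is all a top ten list needs.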

The Results...

The most cited Twitter users are...

  1. LaLiga (a Spanish football competition) - 105
  2. Lea Michele (an American singer / TV actor) - 54
  3. Guldbaggen (the Swedish equivalent of the Oscars) - 35
  4. Kevin Tancharoen (an American movie director) - 12
  5. PRESTO card (Ottawa's public transit smartcard) - 10
  6. ICE T (an American rapper) - 10
  7. Northern Pride RLFC (a British rugby team) - 10
  8. 穴井勇輝(勇吹輝) (a Japanese actor) - 9
  9. NICKI MINAJ (an American singer) - 8
  10. Counting Crows (an American band) - 8

And the most cited individual Tweet?


Linked to from lots of Lost Girl pages.
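The same tallying trick works for individual Tweets: count status IDs instead of usernames. Here's a rough sketch, assuming Tweet URLs look like `/username/status/ID` (or the older `/statuses/ID`). The `tweet_id` helper, the pattern, and the sample URLs are all my own inventions for illustration, not real data.

```python
import re
from collections import Counter

# Matches ".../user/status/123" and the older ".../user/statuses/123",
# with or without the "#!/" hashbang after the domain
STATUS_PATTERN = re.compile(
    r"twitter\.com/(?:#!/)?[^/]+/status(?:es)?/(\d+)", re.IGNORECASE
)

def tweet_id(url):
    """Return the numeric status ID from a Tweet URL, or None for non-Tweet URLs."""
    match = STATUS_PATTERN.search(url)
    return match.group(1) if match else None

# Made-up sample URLs, not real API output
urls = [
    "https://twitter.com/example_user/status/111",
    "http://twitter.com/#!/example_user/status/111",
    "https://twitter.com/other_user/statuses/222",
    "https://twitter.com/example_user",
]

# Profile links have no status ID, so filter out the None values
counts = Counter(filter(None, map(tweet_id, urls)))
most_cited = counts.most_common(1)
print(most_cited)
```

Keying on the ID rather than the full URL means the hashbang and non-hashbang versions of the same Tweet are counted together.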

What Have We Learned Today?

Wikipedia does have a large amount of pop-culture content (do we really need hundreds of words on My Little Pony characters?).

Twitter, unsurprisingly, has limited utility as an encyclopædic source - it's great for breaking news and ephemeral events, but it's fragile and lacks depth. There are very few occasions where Twitter would be the sole, and canonical, source of information [Citation needed].
