Which Twitter User Receives The Most Citations on Wikipedia?


A few days ago, I was somewhat surprised to find that one of my Tweets had been used as a citation in Wikipedia!

I began to wonder - how often are Tweets used in citations?

It's possible to search for your own Tweets using this (somewhat obscure) link:

https://en.wikipedia.org/w/index.php?title=Special%3ALinkSearch&target=twitter.com%2Fedent

Just edit the end of it to see if you, or your friends, have been cited. Note - the username is case sensitive, so "Edent" isn't the same as "edent".
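
If you'd rather not hand-edit the link, here's a minimal sketch (Python 3) that builds it for any username. The helper name `link_search_url` is my own invention for illustration; the URL pattern is just the one shown above.

```python
from urllib.parse import urlencode

def link_search_url(username):
    """Build a Special:LinkSearch URL for a Twitter username.

    Hypothetical helper: the name is illustrative, the URL pattern is
    copied from the link in the post. The username is case sensitive.
    """
    params = {
        "title": "Special:LinkSearch",
        "target": "twitter.com/" + username,
    }
    # urlencode percent-encodes ":" and "/" just like the original link
    return "https://en.wikipedia.org/w/index.php?" + urlencode(params)

print(link_search_url("edent"))
print(link_search_url("doctorow"))
```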

For example, we can see where Cory Doctorow is cited:

https://en.wikipedia.org/w/index.php?title=Special%3ALinkSearch&target=twitter.com%2Fdoctorow

Aha! The page on the New Zealand Internet Blackout references one of his Tweets.

Ok, so which Twitter user has been cited the most? TO THE API, ROBIN!

Wikipedia's own help pages are a little lacking, so I went to the help pages of the software which runs Wikipedia - MediaWiki.

We want to search for external URLs which point to Twitter and have a namespace of 0 (that means they're articles, not talk pages). We can grab a maximum of 500 results at a time, using JSON, and we want to include "www.twitter.com" and "twitter.com". Here's what we use.

https://en.wikipedia.org/w/api.php?
action=query&
list=exturlusage&
eunamespace=0&
eulimit=500&
format=json&
euquery=*.twitter.com

Run it yourself to see the results.
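
The same request can be assembled from its parameters rather than typed by hand. This is a rough Python 3 sketch: `build_query` is a made-up helper name, and the `euoffset` paging parameter is an assumption carried over from the script later in this post. Newer MediaWiki releases may page differently, so check the API help before relying on it.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def build_query(offset=0, limit=500):
    """Return the exturlusage request URL for one page of results.

    Hypothetical helper. The parameters mirror the ones listed in the
    post; euoffset is assumed to select the page, as in the script.
    """
    params = {
        "action": "query",
        "list": "exturlusage",
        "eunamespace": 0,          # articles only, not talk pages
        "eulimit": limit,          # 500 is the maximum per request
        "format": "json",
        "euquery": "*.twitter.com",
        "euoffset": offset,
    }
    return API + "?" + urlencode(params)

# Print the URLs for the first three pages of results
for offset in range(0, 1500, 500):
    print(build_query(offset))
```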

Limitations

Crappy Python!

import urllib2
import json
import sys
from collections import Counter

#euoffset=17800
api = "https://en.wikipedia.org/w/api.php?action=query&eunamespace=0&eulimit=500&format=json&list=exturlusage&euquery=*.twitter.com"

euoffset = 0

words = []

while euoffset < 17500:
    try:
        site_data = json.load(urllib2.urlopen(api + "&euoffset=" + str(euoffset)))
        #   Iterate through the results
        for element in site_data['query']['exturlusage']:
            #   Remove HashBangs #!
            #   Lowercase everything
            twitterURL = element['url'].replace("/#!","").lower()
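            twitterUser = twitterURL.replace("http://twitter.com/","")
            twitterUser = twitterUser.replace("https://twitter.com/","")
            #   Also strip the "www." variants, which the query is meant to include
            twitterUser = twitterUser.replace("http://www.twitter.com/","")
            twitterUser = twitterUser.replace("https://www.twitter.com/","")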
            twitterUser = twitterUser.replace("@","")
            slash = twitterUser.find("/")
            if slash > 0:
                twitterUser = twitterUser[:slash]
            print twitterURL
            # print twitterUser
            words.append(twitterUser)
    except urllib2.URLError:
        print "Unable to retrieve data"
        sys.exit()
    euoffset += 500

#   Most cited user
word_counts = Counter(words)
print word_counts
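
The username clean-up in that loop is the fiddly bit, so here is a slightly tidier sketch of the same idea in Python 3. It is not the code I actually ran: `twitter_user` is a hypothetical helper, and every sample URL below except the SyfyPR one is made up purely for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

def twitter_user(url):
    """Return the lowercased username from a Twitter URL, or None.

    Hypothetical helper: strips hashbangs, ignores the scheme and any
    "www." prefix, and drops a leading "@".
    """
    cleaned = url.replace("/#!", "").lower()
    # Only the path matters, so the host (with or without www) is ignored
    parts = [p for p in urlparse(cleaned).path.split("/") if p]
    if not parts:
        return None
    return parts[0].lstrip("@") or None

# Sample URLs: only the SyfyPR link comes from the post, the rest are invented
sample_urls = [
    "https://twitter.com/SyfyPR/status/313791237121507329",
    "http://twitter.com/#!/Edent",
    "https://www.twitter.com/edent/status/1",
    "https://twitter.com/doctorow",
    "https://twitter.com/",
]

counts = Counter(u for u in map(twitter_user, sample_urls) if u)

# most_common() sorts by count, highest first
for user, n in counts.most_common(10):
    print(user, n)
```

Using `most_common()` instead of printing the raw `Counter` gives a ranked list directly.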

The Results...

The most cited Twitter users are...

  1. LaLiga: a Spanish football competition - 105
  2. Lea Michele: an American singer / TV actor - 54
  3. Guldbaggen: the Swedish equivalent of the Oscars - 35
  4. Kevin Tancharoen: an American movie director - 12
  5. PRESTO card: Ottawa's public transit smartcard - 10
  6. ICE T: an American rapper - 10
  7. Northern Pride RLFC: a British rugby team - 10
  8. 穴井勇輝(勇吹輝): a Japanese actor - 9
  9. NICKI MINAJ: an American singer - 8
  10. Counting Crows: an American band - 8

And the most cited individual Tweet?

https://twitter.com/SyfyPR/status/313791237121507329

Linked to from lots of Lost Girl pages.
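
Counting individual Tweets works much like counting users. The sketch below (Python 3) shows one way it could be done, assuming a Tweet is identified by the number after `/status/` in its URL. The regular expression and the `tweet_id` helper are my own illustration, and the sample list is invented apart from the SyfyPR link.

```python
import re
from collections import Counter

# Assumed URL shape: twitter.com/<user>/status/<digits>, with an optional
# hashbang and the older "statuses" spelling also accepted
STATUS_RE = re.compile(
    r"twitter\.com/(?:#!/)?[^/]+/status(?:es)?/(\d+)", re.IGNORECASE
)

def tweet_id(url):
    """Return the status ID from a Tweet URL as a string, or None."""
    match = STATUS_RE.search(url)
    return match.group(1) if match else None

# Invented sample: the same Tweet linked twice, plus a profile link
sample_urls = [
    "https://twitter.com/SyfyPR/status/313791237121507329",
    "http://twitter.com/#!/SyfyPR/status/313791237121507329",
    "https://twitter.com/edent",
]

tweet_counts = Counter(t for t in map(tweet_id, sample_urls) if t)
print(tweet_counts.most_common(1))
```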

What Have We Learned Today?

Wikipedia does have a large amount of pop-culture content (do we really need hundreds of words on My Little Pony characters?).

Twitter, unsurprisingly, has limited utility as an encyclopædic source - it's great for breaking news and ephemeral events, but it's fragile and lacks depth. There are very few occasions where Twitter would be the sole, and canonical, source of information [Citation needed].



5 thoughts on “Which Twitter User Receives The Most Citations on Wikipedia?”

  1. Elon Musk, in his wisdom [SIC] has decided to delete inactive Twitter accounts. Some reports say that's defined by having no activity in as few as 30 days.

    Hopefully, all the tweets cited in Wikipedia (like nearly all other web pages cited) will be in the Internet Archive's Wayback Machine.
