Collecting Historical Twitter Data For A Topic – TopicTweetHistory.pl

Follow @znbeta To Sign up for Social Media Analytics Research Toolkit Private Beta!

I am releasing TopicTweetHistory.pl, the Perl script I run to collect historical Twitter data for a topic, as an open source project under the Artistic License, one of the licenses Perl itself is distributed under. I’ve been using this script or variations of it since before the Haiti earthquake, so it’s pretty well tested.

The master repository is on GitHub at http://github.com/znmeb/TopicTweetHistory. As far as I know, TopicTweetHistory.pl will run on any modern version of Perl, but I’ve only tested it with Perl 5.10 on openSUSE Linux 11.2 and with ActiveState ActivePerl on Windows. If you have any trouble running it, please feel free to send me a tweet @znmeb.

How does it work? The first thing TopicTweetHistory.pl does is open a browser window to Advanced Twitter Search. Build your Twitter Search query there and, once it returns the results you want, copy the query string and paste it back as input to the script. The script then performs that search back in time and delivers all tweets that match the query. You get the results in a comma-separated values (CSV) file, which you can open in a spreadsheet. There are more details on running the script at http://github.com/znmeb/TopicTweetHistory/blob/master/README.
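To make that flow concrete, here is a minimal sketch in Python. It is not the Perl script's actual code, and it does not call Twitter or Net::Twitter at all. The function names (`parse_query`, `collect_history`, `make_fake_fetch`), the `max_id` paging scheme, the CSV column names, and the sample tweets are all assumptions made for illustration. A small in-memory fake stands in for the search service so the example runs on its own.

```python
import csv
import io
from urllib.parse import parse_qs, urlsplit

# Hypothetical sample data standing in for search results.
SAMPLE_TWEETS = [
    {"id": 1, "created_at": "day 1", "from_user": "alice", "text": "Haiti relief update"},
    {"id": 2, "created_at": "day 2", "from_user": "bob", "text": "Lunch was great"},
    {"id": 3, "created_at": "day 3", "from_user": "carol", "text": "Donate to Haiti, please"},
    {"id": 4, "created_at": "day 4", "from_user": "dave", "text": "Unrelated tweet"},
    {"id": 5, "created_at": "day 5", "from_user": "erin", "text": "News from Haiti"},
]


def parse_query(pasted):
    """Turn a pasted search URL or bare query string into a dict of parameters."""
    query = urlsplit(pasted).query if "?" in pasted else pasted
    return {key: values[0] for key, values in parse_qs(query).items()}


def make_fake_fetch(tweets, page_size=2):
    """Build a stand-in for one search request (assumed interface, not a real API)."""
    def fetch_page(params, max_id):
        words = params.get("q", "").lower().split()
        matches = [
            t for t in tweets
            if all(w in t["text"].lower() for w in words)
            and (max_id is None or t["id"] <= max_id)
        ]
        # Newest first, one page at a time, like a typical search result list.
        matches.sort(key=lambda t: t["id"], reverse=True)
        return matches[:page_size]
    return fetch_page


def collect_history(params, fetch_page, out):
    """Page backward in time until no results remain, writing one CSV row per tweet."""
    writer = csv.writer(out)
    writer.writerow(["id", "created_at", "from_user", "text"])
    max_id = None
    count = 0
    while True:
        tweets = fetch_page(params, max_id)
        if not tweets:
            break
        for t in tweets:
            writer.writerow([t["id"], t["created_at"], t["from_user"], t["text"]])
        count += len(tweets)
        # Ask the next request for tweets strictly older than the oldest one seen.
        max_id = min(t["id"] for t in tweets) - 1
    return count


if __name__ == "__main__":
    params = parse_query("http://example.com/search?q=haiti&lang=en")
    buffer = io.StringIO()
    total = collect_history(params, make_fake_fetch(SAMPLE_TWEETS), buffer)
    print(total, "tweets written")
    print(buffer.getvalue())
```

Passing the fetch function in as a parameter keeps the paging loop separate from whatever actually talks to the search service, so the same loop could be pointed at a real client later.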

Again, please feel free to send me a tweet if you need help getting this running. And special thanks to Marc Mims (@semifor), who has developed the Net::Twitter Perl module that interfaces with the Twitter API!




3 Responses


  1. Sewthomewhodo says

    Hi
    I am a newbie here.
    Glad to find this forum…as what I am looking for

Continuing the Discussion

  1. Ole linked to this post on 2010/03/07

    Borasky Research Journal: Collecting Historical Twitter Data For A Topic – TopicTweetHistory.pl http://bit.ly/dCfxo1

  2. Social Media Analytics Research Toolkit (SMART@znmeb) Is Moving Into Private Beta « Borasky Research Journal linked to this post on 2010/03/31

    [...] Advanced search query historical search to CSV file [...]


