About SMART@znmeb

What Is SMART@znmeb?

  • SMART@znmeb is a Social Media Analytics Research
    Toolkit
  • An appliance – a Virtual Machine Image
  • The platform is an open source project on GitHub:
    http://github.com/znmeb/twitter-appliance
  • Application software is not open source!

What’s In The Machine?

  • A complete Linux® desktop
    • openSUSE® 11.1 operating system
    • Gnome® 2.24 desktop
    • Mozilla Firefox® 3.0 browser
    • Evolution® 2.24.1.1 email / contact
      / calendar management package
    • OpenOffice® 3.0 office suite
    • Pidgin® 2.5.1 instant messaging / IRC
      client
    • Games, multimedia, imaging

And?

  • PostgreSQL® 8.3.7 relational database
    management system, including PgAdmin3
  • R 2.9.2 language and environment for statistical and graphical computing
  • GGobi 2.1.8 data visualization system, and
  • Perl modules: Net::Twitter::Lite (Twitter API) and WWW::Mechanize
    (“web scraper”)

And?

  • Ruby 1.8.7
  • Python 2.6
  • Tcl/Tk 8.5
  • But the primary implementation language will be Perl (5.10)
    • CPAN is your friend!

What’s Not In The Machine?

  • Gumballs, soda, peanuts, meat by-products,
  • Lua, Rails, Django,
  • KDE or XFCE desktops, or
  • Software with non-free-as-in-freedom licenses

Licensing

What Can You Do With It?

  • Anything you can do with openSUSE 11.1 Gnome desktop, and …
  • Interact automagically with the social web
  • Manage data
  • Analyze data

Interact Automagically With The Social Web

  • Collect data, social media or otherwise
  • Create Twitter bots
  • Build monitors, alerts and dashboards

Manage Data

  • Evolution Data Server,
  • OpenOffice Base,
  • PostgreSQL,
  • SQLite,
  • YAML, JSON, and CSV
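
A minimal sketch of how these formats can work together: JSON records are loaded into SQLite and exported again as CSV. Python is used only for illustration (the toolkit's own code is Perl), and the table name `tweets`, its columns, and the sample records are all made up for this example.

```python
import csv
import io
import json
import sqlite3

# Made-up sample records standing in for collected data (not real tweets).
SAMPLE_JSON = """[
  {"id": 1, "user": "alice", "text": "hello #pdx"},
  {"id": 2, "user": "bob", "text": "RT @alice: hello #pdx"}
]"""


def load_records(conn, records):
    """Create a small table (hypothetical schema) and insert one row per record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id INTEGER PRIMARY KEY, user TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO tweets (id, user, text) VALUES (:id, :user, :text)",
        records,
    )
    conn.commit()


def export_csv(conn):
    """Return the whole table as CSV text, header row first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "user", "text"])
    writer.writerows(conn.execute("SELECT id, user, text FROM tweets ORDER BY id"))
    return out.getvalue()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
load_records(conn, json.loads(SAMPLE_JSON))
csv_text = export_csv(conn)
print(csv_text)
```

The same pattern should carry over to PostgreSQL with a different database driver; only SQLite is shown because it needs no server.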

Analyze Data

  • Perl analysis libraries are available, but not currently installed
  • R & GGobi
    • CRAN library packages and task views are available
    • Currently has Natural Language Processing and some others
  • Create static and animated visualizations of your data, and
    presentation-quality graphics

Intended Analysis Domains

  • Natural language processing
  • Machine learning
  • Exploratory data analysis
  • Geospatial analysis
  • Data visualization

Why An Appliance?

  • Can be run as a guest inside a desktop / laptop virtualizer on
    Windows, Mac OS X or Linux
  • Can be backed up, duplicated, distributed and interchanged as
    a whole, complete with data & documents
  • Can be deployed as a server in the cloud

Why Are Twitter Text Analytics Different?

  • Multiple human languages
  • Links
  • Hashtags
  • @replies
  • “RT”
  • Tweets are an emerging / evolving language!
    • Example: people changed location and timezone to Tehran in response
      to #iranelection
  • That’s why my focus is on sampling the public timeline and exploratory
    data analysis
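
The Twitter-specific features listed above can be separated from ordinary words before any text analysis. The sketch below is a rough illustration in Python, not the toolkit's own Perl code. The regular expressions are simplified assumptions, the function name `parse_tweet` is hypothetical, and the sample tweet is made up.

```python
import re

# Simplified patterns; real tweets have more edge cases than these cover.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
RETWEET_RE = re.compile(r"^\s*RT\b")


def parse_tweet(text):
    """Split a tweet into links, hashtags, mentions, an RT flag and plain words."""
    links = URL_RE.findall(text)
    # Remove links first so a URL fragment such as "#top" is not read as a hashtag.
    stripped = URL_RE.sub(" ", text)
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(stripped)]
    mentions = [name.lower() for name in MENTION_RE.findall(stripped)]
    is_retweet = bool(RETWEET_RE.search(stripped))
    # What is left after removing the markers is ordinary text.
    rest = RETWEET_RE.sub(" ", stripped)
    rest = MENTION_RE.sub(" ", rest)
    rest = HASHTAG_RE.sub(" ", rest)
    words = re.findall(r"\w+", rest)
    return {
        "links": links,
        "hashtags": hashtags,
        "mentions": mentions,
        "is_retweet": is_retweet,
        "words": words,
    }


# Made-up example tweet.
parsed = parse_tweet(
    "RT @znmeb: Sampling the public timeline #iranelection http://example.com/a#top"
)
print(parsed)
```

Because `\w` matches Unicode letters in Python 3, the same patterns also pick up hashtags and words in many non-English tweets, although languages written without spaces would need a proper tokenizer.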

Status

  • VMware image available now
    • Runs with a Linux or Windows host; status on Mac OS X is unknown
    • Also runs with VirtualBox®
  • Data collection focused on Twitter
    • “spritzer”, “track” and “follow” streaming API feeds
      work now
    • Collectors for friends and followers work now
    • Twitter search collector works now
  • Analysis focused on text analytics
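
As a rough illustration of what a collector for a streaming feed has to do, the sketch below reads a feed one line at a time. It assumes the feed delivers one JSON object per line, with occasional blank lines and non-status records. It is written in Python for illustration only and is not the toolkit's Perl collector. The name `read_statuses` and the sample feed are made up.

```python
import io
import json


def read_statuses(stream):
    """Yield each line that parses as a JSON object with a 'text' field."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # blank line, nothing to parse
        try:
            record = json.loads(line)
        except ValueError:
            continue  # truncated or malformed line
        if isinstance(record, dict) and "text" in record:
            yield record


# Made-up feed: two good records, a blank line, a non-status record,
# and one line that was cut off mid-record.
SAMPLE_FEED = io.StringIO(
    '{"id": 1, "text": "first status"}\n'
    "\n"
    '{"note": "not a status"}\n'
    '{"id": 2, "text": "second sta\n'
    '{"id": 3, "text": "third status"}\n'
)

statuses = list(read_statuses(SAMPLE_FEED))
print([status["id"] for status in statuses])
```

Skipping bad lines rather than stopping keeps a long-running collector alive when a connection hiccups; a real collector would also need to log what it dropped and reconnect.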

References