Join Me At Social Fresh Portland!
About SMART@znmeb
What Is SMART@znmeb?
- SMART@znmeb is a Social Media Analytics Research Toolkit
- An appliance – a Virtual Machine Image
- The platform is an open source project on GitHub: http://github.com/znmeb/twitter-appliance
- Application software is not open source!
What’s In The Machine?
- A complete Linux® desktop
- openSUSE® 11.1 operating system
- Gnome® 2.24 desktop
- Mozilla Firefox® 3.0 browser
- Evolution® 2.24.1.1 email / contact / calendar management package
- OpenOffice® 3.0 office suite
- Pidgin® 2.5.1 instant messaging / IRC client
- Games, multimedia, imaging
And?
- PostgreSQL® 8.3.7 relational database management system, including PgAdmin3
- R 2.9.2 language and environment for statistical and graphical computing
- GGobi 2.1.8 data visualization system, and
- Perl Net::Twitter::Lite Twitter API module / WWW::Mechanize “web scraper”
And?
- Ruby 1.8.7
- Python 2.6
- Tcl/Tk 8.5
- But the primary implementation language will be Perl (5.10)
What’s Not In The Machine?
- Gumballs, soda, peanuts, meat by-products,
- Lua, Rails, Django,
- KDE or XFCE desktops, or
- Software with non-free-as-in-freedom licenses
Licensing
What Can You Do With It?
- Anything you can do with an openSUSE 11.1 Gnome desktop, and …
- Interact automagically with the social web
- Manage data
- Analyze data
Interact Automagically With The Social Web
- Collect data, social media or otherwise
- Create Twitter bots
- Build monitors, alerts and dashboards
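As an illustration of what a simple monitor or alert could look like, here is a minimal Python sketch (Python 3 syntax, not the appliance's own Perl code). It counts how often a keyword appears within a sliding time window and reports when a threshold is reached. The class name `KeywordAlert`, its parameters, and the assumption that timestamps arrive in non-decreasing order are all hypothetical choices made for this example.

```python
from collections import deque


class KeywordAlert:
    """Hypothetical monitor: fires when a keyword is seen often enough
    within a sliding time window. Not part of the appliance itself."""

    def __init__(self, keyword, threshold, window_seconds):
        self.keyword = keyword.lower()
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.hits = deque()  # timestamps of matching messages, oldest first

    def observe(self, timestamp, text):
        """Record one message; return True if the alert condition holds.

        Assumes timestamps (in seconds) arrive in non-decreasing order.
        """
        if self.keyword in text.lower():
            self.hits.append(timestamp)
        # Drop matches that have fallen out of the window.
        while self.hits and self.hits[0] <= timestamp - self.window_seconds:
            self.hits.popleft()
        return len(self.hits) >= self.threshold
```

A dashboard could poll several such objects, one per keyword, and display which ones are currently firing.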
Manage Data
- Evolution Data Server,
- OpenOffice Base,
- PostgreSQL,
- SQLite,
- YAML, JSON, and CSV
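To make the data management formats concrete, here is a minimal Python sketch (Python 3 syntax) that moves records from JSON into SQLite and back out as CSV, using only the standard library. The table layout and the field names `id`, `screen_name` and `text` are assumptions made for illustration; they are not the appliance's actual schema.

```python
import csv
import io
import json
import sqlite3


def load_tweets(conn, json_lines):
    """Insert one JSON record per line into a hypothetical 'tweets' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id INTEGER PRIMARY KEY, screen_name TEXT, text TEXT)"
    )
    rows = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        rows.append((rec["id"], rec["screen_name"], rec["text"]))
    # INSERT OR REPLACE keeps reloads of the same record from failing.
    conn.executemany("INSERT OR REPLACE INTO tweets VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def export_csv(conn, out):
    """Write the table to a file-like object as CSV, ordered by id."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "screen_name", "text"])
    writer.writerows(
        conn.execute("SELECT id, screen_name, text FROM tweets ORDER BY id")
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_tweets(conn, ['{"id": 1, "screen_name": "alice", "text": "hello"}'])
    buf = io.StringIO()
    export_csv(conn, buf)
    print(buf.getvalue())
```

The same pattern would apply to PostgreSQL with a different database driver; SQLite is used here only because it needs no server.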
Analyze Data
- Perl analysis libraries are available, but not currently installed
- R & GGobi
- CRAN library packages and task views are available
- The Natural Language Processing task view and some others are currently installed
- Create static & animated visualizations and presentation-quality graphics from your data
Intended Analysis Domains
- Natural language processing
- Machine learning
- Exploratory data analysis
- Geospatial analysis
- Data visualization
Why An Appliance?
- Can be run as a guest inside a Windows, Mac OS X or Linux desktop / laptop virtualizer
- Can be backed up, duplicated, distributed and interchanged as a whole, complete with data & documents
- Can be deployed as a server in the cloud
Why Are Twitter Text Analytics Different?
- Multiple human languages
- Links
- Hashtags
- @replies
- “RT”
- Tweets are an emerging / evolving language!
- Example: people changed their profile location and time zone to Tehran in response to #iranelection
- That’s why my focus is on sampling the public timeline and exploratory data analysis
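The Twitter-specific markers listed above (links, hashtags, @replies and “RT”) can be separated from ordinary words before any text analysis. The following is a minimal Python sketch (Python 3 syntax) of that idea. The function name `tweet_features` and the regular expressions are assumptions made for illustration, and they will miss edge cases that real tweets contain.

```python
import re

# Illustrative patterns only; real tweets need more careful handling.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
RT_RE = re.compile(r"\bRT\b")


def tweet_features(text):
    """Split a tweet into Twitter-specific markers and leftover words."""
    urls = URL_RE.findall(text)
    # Strip links first so "#" fragments inside URLs are not read as hashtags.
    stripped = URL_RE.sub(" ", text)
    hashtags = HASHTAG_RE.findall(stripped)
    mentions = MENTION_RE.findall(stripped)
    is_retweet = RT_RE.search(stripped) is not None
    # Remove the markers, then keep the remaining word tokens in lower case.
    rest = HASHTAG_RE.sub(" ", stripped)
    rest = MENTION_RE.sub(" ", rest)
    rest = RT_RE.sub(" ", rest)
    words = re.findall(r"\w+", rest.lower())
    return {
        "urls": urls,
        "hashtags": hashtags,
        "mentions": mentions,
        "is_retweet": is_retweet,
        "words": words,
    }
```

Because `\w` matches Unicode letters in Python 3, the word tokens are not limited to English, although languages written without spaces would need a proper tokenizer.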
Status
- VMware image available now
- Runs on a Linux or Windows host; not yet tested on Mac OS X
- Also runs with VirtualBox®
- Data collection focused on Twitter
- “spritzer”, “track” and “follow” streaming API feeds work now
- Collectors for friends and followers work now
- Twitter search collector works now
- Analysis focused on text analytics
References
Open Source Twitter Research Tools