May 04 2012
 

If you follow me on Twitter, you’ve probably seen my Scoop.it topic posts. I’ve been on Scoop.it since mid-January, and I recently signed up for the free trial of the “Pro” version. I’m planning to continue with the Pro version, which features analytics and up to ten topics; the free version only allows five topics and has no analytics.

What do I like about Scoop.it? First of all, it’s easy to set up a topic and start collecting articles. You simply enter search keywords and Scoop.it searches Twitter, YouTube, Digg, Google Blogs and Google News feeds for matches. You can also add any RSS feed, a Twitter user, list or search stream, or a Facebook page, and you can import an OPML file.

There’s a bit of art to selecting the keywords. Keywords that are too specific get you few articles, and keywords that are too general get you too many. For example, for my topic “Social Media Analytics and US Politics”, I started with a specific story about Facebook sharing data with Politico. This was too specific; I later added keywords for “social media analytics” and political terms like “President” and “election.” This yielded far too many hits, so I dropped the political terms and filtered out the non-political stories by hand. Someone with more experience in search engine keyword analysis could probably do this much faster than I did.
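To make the trade-off concrete, here is a minimal Python sketch. It is not how Scoop.it works internally; it assumes a simple "match any keyword" rule, and the headlines and the `hits` helper are invented purely for illustration.

```python
def hits(headlines, keywords):
    """Return the headlines that contain at least one keyword (case-insensitive).

    Assumption: keywords are OR-ed together. Scoop.it's real matching
    rules are not documented here.
    """
    lowered = [k.lower() for k in keywords]
    return [h for h in headlines if any(k in h.lower() for k in lowered)]


# Hypothetical headlines, made up for this example.
headlines = [
    "Social media analytics firm tracks election chatter",
    "President speaks at town hall",
    "Social media analytics for retail brands",
    "Facebook shares data with Politico",
]

# Adding general political terms pulls in stories unrelated to analytics.
too_general = hits(headlines, ["social media analytics", "President", "election"])

# Dropping the political terms leaves a shorter list to review by hand.
focused = hits(headlines, ["social media analytics"])

print(len(too_general), "hits with political terms;", len(focused), "without")
```

In this toy data the focused list still contains a retail story, which is the kind of entry that has to be discarded by hand.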

Once the collection is in progress, as a curator, you receive a stream of potential “scoops”, presented with the newest entry first. You can either discard or accept each suggestion. If you accept it, the “scoop” is pushed onto the top of your topic, displacing older entries. There are numerous controls for resizing and repositioning your scoops on the page. There’s a comment feature, although I have yet to receive any comments on my scoops.

There’s a “Star” option that will push a scoop to the top left position. There’s a “re-scoop” button that allows you to take any scoop from one of your topics or anyone else’s topics and add it to your own. You can share individual scoops on Twitter, LinkedIn, Pinterest and StumbleUpon, and whole topics on Twitter and Facebook. You can also share to Tumblr and WordPress blogs. And there’s a bookmarklet you can use to capture scoops while browsing.

I’m not going to say much about the analytics, because they aren’t part of the free service and I’ve only been using them for a week. I will say, though, that if you’re serious about the platform you’ll at least want to do the free trial and check them out for yourself. There’s also an API, and Google+ integration is coming.

Finally, I want to say a bit about the Scoop.it network. So far, it seems to be joyfully spam-free. I follow about 100 topics and have discovered some interesting people on Scoop.it that I probably would not have discovered by chance on Twitter. My most popular topic, “Computational and Data Journalism”, has about 35 followers. I don’t know how rapidly the network is growing, though. I’m planning to stay with it at least a few more months, and I invite everyone to come join me, even if it’s only at the free level.

Apr 05 2012
 

Computational Journalism Server: SUSE Gallery Download Page

Computational Journalism Server: Github Project

Data Journalism Developer Studio Users Google Group


I’ve just published release 0.2.1 of the Computational Journalism Server to the SUSE Gallery. If you’re interested in beta testing it, please join the Data Journalism Developer Studio Users Google Group.

The Computational Journalism Server is a spinoff / refactoring of the Data Journalism Developer Studio. As I noted last month, it makes no sense for me to maintain and re-distribute a Linux desktop and desktop tools when 80 percent of my users already have a perfectly good non-Linux desktop where they can run those tools! So the plan is to migrate the server-based software from the original appliance into the new server appliance and remove it from the desktop appliance.

In addition, the server appliance is going to evolve to function as a node in a grid / cluster / cloud infrastructure. I’m hoping to eventually package it as an OpenStack compute node. The server appliance will be focused on the R language, CRAN library packages and task views, and whatever Linux packages are required to support the R environment. There are plenty of other platforms out there for Rails, Spring, Node.js, Django, and so forth, but I haven’t seen anything specifically for people who want to develop in R.

The core appliance at the moment consists of the following components:

  • openSUSE 12.1 64-bit server base,
  • The ATLAS high-performance linear algebra library,
  • The R-patched distribution of R. This updates frequently and consists of patches on top of the most recent stable release,
  • The PostgreSQL and SQLite3 databases,
  • The Redis data structure store,
  • R web servers: rApache, Rook, websockets, and R Server Pages,
  • The RStudio Server IDE,
  • The Natural Language Processing, Reproducible Research and High Performance Computing task views.

I’ll be posting more documentation on getting started with the Computational Journalism Server in the next few days. I plan to add the Spatial task view in the next week but have no plans for any more task views in the near future. The enhancements and bug fixes I am working on include:

  • Packaging as an OpenStack compute node,
  • Rebuilding ATLAS and R-patched from source tuned to the server hardware,
  • Fixing some underlying dependency issues in the High Performance Computing task view,
  • OpenCL integration on NVIDIA hardware, and
  • Demos of the web server capabilities.
 Posted by at 14:17