May 10 2012
 

Version 1.2.5 Released

I’ve just pushed all the buttons to release Computational Journalism Server 1.2.5. It’s mostly bug fixes and miscellaneous cleanup changes, but there is one major new option: Apache™ Hadoop™. Right now, all that’s there are scripts to download and install the latest stable Hadoop from Apache and run the single-node test. But it should be enough for developers to start testing the R Hadoop interface routines ‘HadoopStreaming’ and ‘hive’. See Parallel R for some sample code using ‘HadoopStreaming’ and ‘hive’.

To install Hadoop, do the following:

  1. Log into the server as “root”.
  2. Type “cd ~/Computational-Journalism-Server/Hadoop”.
  3. Type “./install-hadoop.bash”.
  4. Type “./test-hadoop.bash”.

This should install Hadoop and run the single-node test. For more information on configuring larger-scale Hadoop clusters, see the main documentation page at http://hadoop.apache.org/common/docs/r1.0.2/. The scripts in this release came from the Single Node Setup page.
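If you’d rather not type the steps one at a time, the four steps above can be collected into a small shell function. This is only a sketch: the function name run_hadoop_setup and the HADOOP_DIR override are my own inventions for illustration, not part of the release. The directory and the two script names are the ones from the list.

```shell
#!/bin/bash
# Sketch only: wraps the install steps from the post in one function.
# HADOOP_DIR and run_hadoop_setup are hypothetical names, not part of the release.
HADOOP_DIR="${HADOOP_DIR:-$HOME/Computational-Journalism-Server/Hadoop}"

run_hadoop_setup() {
  # Step 1: the post says to log in as root; warn if we are not.
  if [ "$(id -u)" -ne 0 ]; then
    echo "warning: the instructions say to run these steps as root" >&2
  fi

  # Step 2: change to the directory that holds the Hadoop scripts.
  cd "$HADOOP_DIR" || return 1

  # Step 3: download and install Hadoop; stop here if the install fails.
  ./install-hadoop.bash || return 1

  # Step 4: run the single-node test; its exit status is the function's result.
  ./test-hadoop.bash
}
```

After sourcing the file in a root shell, calling run_hadoop_setup runs the install and then the test, and stops early if the directory is missing or the install script fails.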

Road Map

As a few recent posts on this blog have noted, I’m planning to migrate the platform components of Computational Journalism Server to either CloudFreeStyle or OpenShift or both, to take advantage of their existing platform-level components and community support structures. I don’t have a good estimate of dates yet, but there will be at least one more release as an openSUSE appliance before there’s any OpenShift or CloudFreeStyle release. There will also be one more release of Data Journalism Developer Studio 2012LX to catch up to the openSUSE Build Service packages.

May 05 2012
 

First of all, let me put this in perspective. I’ve been using Linux on workstations and laptops since Red Hat Linux 6.2. I stayed with Red Hat all the way through Red Hat Linux 9. When Red Hat split the distribution into Red Hat Enterprise Linux and Fedora Core in 2003, I switched to Debian. I ran Debian for about six months, then switched to Gentoo Linux. In the summer of 2008, I switched to openSUSE Linux and I’ve been on openSUSE since then.

Every time one of the major community Linux distributions ships a new stable release, I try it out. So far, none of the Debian, Fedora, Ubuntu or Mint releases has come out significantly better than openSUSE, so I’ve stuck with it. And that remains true for Ubuntu 12.04 LTS “Precise Pangolin”. If that were the end of the story, I could close this blog post now. But it’s not.

If you’ve been following this blog and my Twitter stream and Github account, you’ll know that I’ve been collecting tools for computational journalism and packaging them as appliances. And I’m moving on towards a Platform as a Service. One of the requirements I’ve put on that is that the tools should be distribution-agnostic as much as possible. Up to now, everything has been on openSUSE because of the SUSE Studio appliance construction tools and to a lesser extent the openSUSE Build Service package repositories. But I’ve come to the point where I need to make things work on Fedora and Ubuntu.

So I’ve quad-booted my laptop (Windows, openSUSE, Fedora 16 and Ubuntu 12.04). And I’m trying to triple-boot my workstation with openSUSE, Fedora and Ubuntu. Which brings us to the first problem – openSUSE and Fedora installed cleanly on the workstation, but Ubuntu 12.04 didn’t. In particular, the Ubuntu desktop doesn’t even come up on a 1024×768 monitor!

I can understand Linux not coming up on a wireless card that’s relatively new. I can understand Linux having trouble with a touchpad or with audio. After all, the hardware makers design for Windows and Apple, not Linux desktops / laptops. But a 1024×768 monitor that’s run everything from Gentoo / WindowMaker to KDE 3.5 to KDE 4 to GNOME 2 and GNOME 3 and LXDE and Cinnamon on openSUSE? A 1024×768 monitor that runs Fedora 16 without any problems? That’s just plain wrong!

I did get the Ubuntu desktop working on the laptop, which is a much newer configuration. I’m not going to spend a great deal of time on how ugly the desktop actually is when it works. That’s been covered in numerous places and desktops are

  1. A matter of personal taste, and
  2. Customized to the user’s workflow.

But for someone who, like me, is used to the GNOME 2 desktop as delivered in previous versions of Ubuntu and Fedora, the openSUSE customization of GNOME 2 and the current clean implementations of GNOME 3 on openSUSE and Fedora, Ubuntu’s Unity desktop is jarring. And it’s really hard to figure out how to do things, where stuff is, and so on.

Moreover, the whole distribution is “pushy” – it’s hawking subscriptions to Ubuntu One cloud music, for example. The software installer has favorite apps, and so on. It’s like having a Kindle Fire or an iPad or visiting the Chrome Web Store or Google Play – the Ubuntu desktop is trying to sell you something every time you move your mouse. Ubuntu has turned the Linux desktop into just another media consumption device!

That’s two strikes – annoyances but not deal-breakers. But what I want to do with Fedora and Ubuntu is use them as hosts for virtual appliances, just like I use openSUSE and Windows / VirtualBox now. In openSUSE and Fedora, I can go into the software installer and select a “pattern” and get everything I need to do that. If Ubuntu has that, it’s well hidden under the games and the productivity suites and the media apps. Sure, I can go find how to do that on Ubuntu on the web, but it seems to be going against the grain of the distribution. It only took me two minutes to find it on Fedora after almost four years of working daily on openSUSE!

I’m sure “Precise Pangolin” is a fine distribution “under the hood.” The previous long-term support version, 10.04, is an acknowledged workhorse in servers along with Debian, RHEL/CentOS/Scientific Linux and SLES. I have to test on it, and I’ll figure out how to be productive on it. But if Canonical can’t come up with a desktop built for Linux professionals like me, they’re going to lose us.

Apr 27 2012
 

Computational Journalism Server: SUSE Gallery Download Page

Computational Journalism Server: Github Project

Data Journalism Developer Studio Users Google Group


Computational Journalism Server 1.0.0 is now available in the SUSE Studio Gallery at http://j.mp.compjournoserver. It’s not exactly a full Platform-as-a-Service yet, which is the eventual goal, but just about all the components I want in the appliance are there. I’ve put fairly detailed instructions on getting started on the SUSE Gallery download page and they’re duplicated on the Github project page. If you run into difficulties, please feel free to comment here, send me a tweet, post an issue on Github, make a comment on the appliance download page, or rant on the Google Group. No carrier pigeons, please – I’ll just cook them for dinner.

I’ve got fairly big plans for the evolution of this tool set. In rough calendar order but with no firm dates yet, they are:

  • Get the major use cases described in Parallel R working,
  • Get the R integration with the ATLAS libraries working by default,
  • Get the appliance working as an lxc Linux Containers guest,
  • Package the appliance so it can function as an OpenStack Essex compute node,
  • Port the workstation development / testing environment to Ubuntu 12.04 and Fedora 17, and
  • Port the server to CloudFreeStyle and OpenShift Platform-as-a-Service environments.