Data Journalism Developer Studio 1.2.0 – Open Source Bridge / Indie Web Camp Edition
Release 1.2.0 of the Data Journalism Developer Studio is now available in the SUSE Gallery. Changes:
- I fixed the Rubygems and RPy2 install scripts. They didn’t work in 1.0.0.
- I added the “calibre” eBook publishing / library maintenance tool to the Multimedia install scripts. Thanks to Michelle Anderson at I Heart Media for suggesting it.
- I added the rrdf Resource Description Framework library package and the SPARQL client package to the R Natural Language Processing install scripts. I’m hoping to interface these to the Tracker RDF data store.
- One of the major themes at Open Source Bridge, reflecting a major theme of today’s technology, is so-called “NoSQL” databases. So I’ve added the NoSQL databases that are available from the openSUSE Build Service as RPMs:
Yes, There’s A Bubble In LinkedIn Stock!
About Data Journalism Developer Studio
In all the technology news last week, you might have missed this story. I only saw it mentioned on Reuters, not on any of the major technology blogs that I read. As is my usual practice when I see a technology story that matches my interests, I try to locate the original sources and post links on Twitter. So in case you missed those, here they are:
There’s a fair amount of technical detail about the model in the paper cited in my second tweet. If you want even more, the model itself is documented here:
So what’s the story here? From “Is There a Bubble in LinkedIn’s Stock Price?”:
It has been well documented in the financial press that a methodology is needed that can identify an asset price bubble in real time. William Dudley, the President of the New York Federal Reserve, in an interview with Planet Money [3] stated “…what I am proposing is that we try to identify bubbles in real time, try to develop tools to address those bubbles, try to use those tools when appropriate to limit the size of those bubbles and, therefore, try to limit the damage when those bubbles burst.”
It is also widely recognized that this is not an easy task. Indeed, in 2009 the Federal Reserve Chairman Ben Bernanke said in Congressional Testimony [1] “It is extraordinarily difficult in real time to know if an asset price is appropriate or not”.
Here’s a link to the William Dudley interview, and one to Bernanke’s testimony.
Professor Jarrow and his colleagues took up the challenge laid down by the Federal Reserve Board. The model they have devised is quite complex, involving stochastic differential equations and reproducing kernel Hilbert spaces. They tested this model on stock price data from “the alleged internet dotcom bubble (and beyond), from 1999 to 2005.” While there will no doubt be much more peer review of the data, model and conclusions, the test shows promise. Moreover, it can be applied to the price of any publicly-traded stock. The test has three possible results:
- There’s definitely a bubble.
- There’s definitely not a bubble.
- No conclusion about a bubble can be drawn from the data.
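For readers who want a feel for the mathematics behind those three outcomes, here is a rough sketch of the kind of criterion involved. It is not quoted from the paper. It assumes the price follows a driftless diffusion under the risk-neutral measure, and the symbols S_t, W_t, sigma and alpha are notation introduced here for illustration.

```latex
% Assumed price dynamics (illustrative notation, not taken from the paper):
\[
  dS_t = \sigma(S_t)\, dW_t
\]
% Under this assumption, S is a strict local martingale (a bubble) exactly when
% the volatility function grows fast enough for this tail integral to converge:
\[
  \int_{\alpha}^{\infty} \frac{x}{\sigma^{2}(x)}\, dx < \infty
  \qquad \text{for some } \alpha > 0 .
\]
% Worked example: if \sigma(x) = x^{a}, the integrand is x^{1-2a}, so the
% integral converges if and only if a > 1.
\[
  \sigma(x) = x^{a}
  \;\Longrightarrow\;
  \int_{\alpha}^{\infty} x^{\,1-2a}\, dx < \infty
  \iff a > 1 .
\]
```

Read this way, the three results correspond to an estimated volatility whose tail clearly makes the integral converge, one whose tail clearly makes it diverge, and one where the observed price range is too narrow to extrapolate the tail either way.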
So now we come to LinkedIn. LinkedIn was publicly traded for the first time on May 19, 2011, using the symbol LNKD. Professor Jarrow and his colleagues obtained real-time price data from Bloomberg for the first four days of trading and applied their model. And their claim is quite definitive:
We have found, definitively, that there is a price bubble!
While the technology is certainly interesting in its own right, at least to data journalists like myself, what are the wider implications of this? First of all, the context of the Dudley interview was the Finance / Insurance / Real Estate (FIRE) sector and the holdings of the Federal Reserve Board in that industry. As we all know, the Great Recession we discuss on a daily basis originated in the FIRE sector.
The context of the model Jarrow et al. have created, on the other hand, is publicly-traded stocks. In particular, the model was initially tested on Internet stocks during a well-documented bubble, and applied to a social media stock within days of its initial public offering. Moreover, the model should work in real time. Given a live data feed and enough computing capacity, it should be possible to monitor data and make investment decisions in real time.
Even though the model is designed for real-time publicly-traded stocks, it should be applicable to any financial time series that satisfies the underlying mathematical assumptions. This includes, for example, prices of shares in the “secondary markets” for companies like Facebook and Twitter. I haven’t attempted to implement the model yet – I’ve been away from computational finance for several years and I’m in the process of coming back up to speed on the methodologies. The core technologies are available in the Data Journalism Developer Studio, however, and if anyone is interested in working on this, send me a tweet @znmeb.
Data Journalism Developer Studio 1.0.0 Released!
I’ve just pushed release 1.0.0 of the Data Journalism Developer Studio into the SUSE Gallery. Changes:
- The base appliance ships with Mozilla Firefox as the browser rather than Chromium. Chromium is available as an add-on installation script set. This was a difficult decision for me to make, but the version of Chromium in the Open Build Service is 13.0.xxx, which is updated frequently and can be unstable. This is roughly equivalent to Google’s “Canary” build on Windows and Macintosh. Chromium was proving too unstable for regular use, so I replaced it with Firefox.
- I added CoffeeScript to the install scripts for node.js and NowJS. If you’re a JavaScript developer, I welcome more suggestions for node.js packages.
I’m planning to open the project up to other developers in the near future. Now that the Fundry feature request mechanism is in place, the road map is public. My own plan is to start building user-level documentation. Most of the software in the appliance is well-documented on its own, but there aren’t too many examples of application-level usage that I’ve been able to find.
Last year, I discovered a hoax on Wikipedia - The ‘глупо муравей’ story: Shostakovich, musique concrète, Wikipedia, bullshit and curation. With the help of my friends on a Shostakovich mailing list, I was able to get it corrected. Now I’d like to ask for your help getting another little piece of history documented in Wikipedia.
Years ago, I read a comment in a book on board games that Confucius had advised “the idle rich” to play weiqi (the game most of the world knows by its Japanese name, Go) rather than “let their minds stagnate.” I haven’t been able to track down that exact quotation. Wikipedia only says this:
Go originated in ancient China sometime before the 3rd century BC (exactly when is unknown), by which time it was already a popular pastime, as indicated by a reference to the game in the Analects of Confucius.
The closest I’ve been able to find in English translations is this:
Analects of Confucius – Ch.17 – 22/ Confucius said, “He who always has a full stomach but does nothing meaningful is simply a good-for-nothing. Is there not a game of chess? Even playing chess is better than idling the time away.”
Now there is a Chinese variant of chess, rarely seen outside of China. But I wonder – was Confucius talking about the Chinese version of chess, or was he talking about weiqi? The British have been rabid chess players for a long time, and perhaps the earliest translators substituted a game they knew well for a game they did not.
So the questions are:
1. Is the game referred to in Analects Chapter 17 really chess? Is it weiqi? Some other game?
2. Are there other references to either chess or weiqi in the writings of Confucius?
#ibmwatson? Are you up for this? Google Translate? LinkedIn Answers? Quora?
Update 2011-02-08: Twitter blogs about the Al Jazeera campaign.
Robin Sloan (@robinsloan) of Twitter has written a blog post detailing the Al Jazeera campaign. He confirmed that Al Jazeera is in fact watching the keywords and promoting tweets if the keywords become trending topics. Robin has some very nice graphics on the blog post showing the spikes in tweets per hour around the promotions, although they only show the tweet rate spikes, not the Promoted Tweet insertion points, the keywords, or any of the other detailed tracking that their analytics platform is capable of providing.
Speaking of the analytics platform, a little more detail about the underlying mechanisms surfaced last week on SlideShare. Kevin Weil, Twitter’s head of analytics, posted this presentation on Rainbird. It’s an interesting approach – a patch to the open-source Cassandra database to allow hierarchical counting.
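To make “hierarchical counting” concrete, here is a minimal in-memory sketch in Python. It is not Rainbird’s API and it does not touch Cassandra. The class name `HierarchicalCounter`, the `prefixes` helper and the dot-separated key format are all assumptions made up for illustration. The idea is simply that one event increments a counter at every level of its key, so totals for any prefix can be read directly.

```python
from collections import Counter


def prefixes(key, sep="."):
    """Return every ancestor prefix of a hierarchical key, shortest first.

    Hypothetical helper: "com.twitter.www" -> ["com", "com.twitter", "com.twitter.www"].
    """
    parts = key.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


class HierarchicalCounter:
    """Illustrative in-memory stand-in for a hierarchical counting store."""

    def __init__(self):
        self.counts = Counter()

    def increment(self, key, amount=1):
        # One event bumps the counter at every level of the hierarchy.
        for prefix in prefixes(key):
            self.counts[prefix] += amount

    def get(self, key):
        # Counter returns 0 for keys that were never incremented.
        return self.counts[key]


hc = HierarchicalCounter()
hc.increment("com.twitter.www")
hc.increment("com.twitter.dev", 2)
hc.increment("com.example")

print(hc.get("com"))          # total across everything under "com"
print(hc.get("com.twitter"))  # total across everything under "com.twitter"
```

The trade-off in this sketch is more writes per event (one per level of the key) in exchange for cheap reads of any rollup, which is why counting support in the underlying database matters.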
Update 2011-02-07: Yet another Promoted Tweet using “Egypt” to get attention on Twitter. This time it’s “Trade King”, an online brokerage.
When does it end? After last night’s disgraceful Groupon commercials on the Super Bowl and last week’s Kenneth Cole tweet, I’m beginning to think there’s not much human misery left that someone won’t try to use to hawk their wares. Twitter, I think it’s time you planted a stake in the ground and said, “There are some search keywords we will not allow in Promoted Tweets. Egypt is the first, and there will be others.”
Update 2011-02-05: The Committee to Protect Journalists (CPJ, tweeting as @pressfreedom) has purchased a Promoted Tweet.
Update 2011-02-04: Al Jazeera has now purchased a Promoted Trend hashtag “#demandaljazeera”!
I see the story is being picked up now from various blogs. It’s going to be an interesting weekend, with the events in Egypt competing for our attention with Super Bowl XLV.
Twitter’s Promoted Trends clearly are a winner. Just in case nobody has reminded you of this lately, Al Jazeera and Twitter are both businesses. Twitter, the business, sold Al Jazeera, the business, advertising, just as they have sold advertising to Audi, Google and others recently. And the cable companies that Al Jazeera wants to start distributing Al Jazeera content are businesses, too. This is about money, pure and simple. This is about closing sales.
If you’re following the events unfolding in Egypt, I’m sure you’ve heard the major news stories, including the attempts by the Egyptian government to shut down cell phone and Internet communications. And I’m also sure by now you’ve heard of Al Jazeera English, which is, as the name suggests, the English-language service of Al Jazeera. For some background on Al Jazeera English, you can read these stories in the New York Times:
As the Times notes, Al Jazeera English’s images and stories are getting through, even though governments may be attempting to block them. But in addition, Al Jazeera English is actively using Twitter’s advertising mechanisms – Promoted Accounts and Promoted Tweets – to build a following on Twitter and market itself! Al Jazeera English has purchased Promoted Tweets on the major hashtags – #Egypt, #jan25, #Mubarak, #egipto – and other searches such as “Egypt” and “Mubarak” and even the names of some of the new cabinet ministers.
The following screen shots are typical.
What’s even more interesting is that Al Jazeera English has purchased a Promoted Account, which means it sometimes shows up at the top of “Who To Follow”:
I haven’t seen a Promoted Trend yet – perhaps Al Jazeera English marketing thought that would be tacky. And I’m not sure what to make of all of this, given Twitter’s blog post, “The Tweets Must Flow”. Perhaps better journalists than I will step forward and provide me with some clues in the comments. Meanwhile, follow the money.
Oh, yeah – while we’re on the subject - Yellow journalism: From Wikipedia, the free encyclopedia
For the benefit of those of you who aren’t on Twitter, Mercedes-Benz just started a Promoted Trends campaign today, using the hashtag “#MBtweetrace”. The Promoted Tweet links back to a Facebook page where the contest is being managed. Since I don’t watch television or follow the automotive industry, this was the first I heard about the campaign.
I was curious what agency was behind this campaign, so this morning, I asked the question on Quora: ‘What agency is running the Mercedes-Benz “Tweet Race” promotional campaign?’ I didn’t receive an answer, so I went to LinkedIn Answers this evening and asked the same question. Within 12 minutes I had an answer, from Michael Cirillo:
“RazorFish is taking the lead on the social media side with help from the normal MB agency Merkley+Partners.”
I did finally get an answer on Quora – in a private message, though. The question is still unanswered on the public site. And the Quora answer showed up about eight hours after I asked the question.
Sometimes, the old ways are best.



















