New Fedora Core RPMs for CRAN packages arm, Matrix, lme4, car, coda, leaps, and mlmRev

Posted by Tom Moertel Wed, 25 Apr 2007 18:07:00 GMT

Just a quick note for folks using the R statistics system on Fedora Linux. I have packaged for Fedora a bunch of R packages from CRAN. (R packages have to be packaged again, as RPM packages, to integrate with Fedora Linux.)

My initial goal was to package arm, which contains tools for working with various regression models. (This package accompanies Andrew Gelman and Jennifer Hill’s wonderful book Data Analysis Using Regression and Multilevel/Hierarchical Models.) Packaging “arm,” however, quickly snowballed into packaging a bunch of prerequisites. Thankfully, I have now completed that task and can share the fruits of my labor with you.

All in all, to install “arm,” you will need the following RPMs:

  • R-arm-1.0-2
  • R-car-1.2-1
  • R-lme4-0.9975-1
  • R-Matrix-0.9975-1
  • R-R2WinBUGS-2.0-1

The following RPMs are optional (but you will need them if you want to rebuild the RPMs):

  • R-coda-0.10-1
  • R-leaps-2.7-1
  • R-mlmRev-0.995-1

You can download the packages from the RPMs section of the Community Projects site. Better yet, you can use Yum to download them for you. Just add the moertel-community Yum repository to your /etc/yum.repos.d directory (see RPMs for the recipe) and then use the following command:

$ sudo yum install R-arm

Yum will automatically resolve dependencies and install the required packages. If you want any of the optional packages, add them after “R-arm” on the command line.

I have built the packages for Fedora Core 6 on the x86_64 architecture, but the RPM specs are available if you want to rebuild the packages for other architectures. (See the instructions for rebuilding RPMs for help.)

Caveat: I’m not sure that the R-R2WinBUGS package is fully functional. It depends on BRugs, which doesn’t yet build on the Linux platform. To get around this problem, I made R-R2WinBUGS’s dependency on BRugs weak; the first package no longer requires the second to install.


Engauge Digitizer: a handy tool for extracting data from charts

Posted by Tom Moertel Tue, 17 Apr 2007 07:45:00 GMT

Today I wanted to extract the data that were visualized in a chart I saw on Seth Roberts’s blog. That is, I had a picture of a data set, and I wanted the numbers behind the picture.

This task turned out to be surprisingly easy – once I found Engauge Digitizer, an open-source (GPL) tool made for this very task. After I launched Engauge, the digitization process was straightforward:

  1. I established the chart’s coordinate system by clicking in the corners and entering the associated coordinates.
  2. Then I had Engauge identify data points. With the mouse, I selected a data point by hand, teaching Engauge what a point looks like. Then Engauge identified spots on the chart that looked like data points and locked on to them. I was able to step through the points to tell Engauge to skip the few it misidentified.
  3. I manually selected a few more data points that were scrunched into blobs and had eluded Engauge’s point-detection heuristics.
  4. Finally, I exported the data set in CSV format.
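
Once exported, the CSV can be loaded into R for further work. The following is only a minimal sketch: it assumes the export has a header row and two numeric columns, and the column names (x, Curve1), the variable names, and the values are all made up for illustration. The data are inlined so the example runs without a file.

```r
# Stand-in for an exported CSV file. The column names and values are
# assumptions for illustration, not data from the chart discussed above.
csv_text <- "x,Curve1
3.0,10.8
1.0,12.5
2.0,11.9"

# With a real export, pass the file's path instead of text = csv_text.
digitized <- read.csv(text = csv_text)

# Hand-selected points can arrive in any order, so sort by x first.
digitized <- digitized[order(digitized$x), ]

print(digitized)
```

From there, plot(digitized) gives a quick visual check against the original chart.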

If you ever need to extract the data behind a chart, do check out Engauge Digitizer. (If you use Fedora Linux, you’ll be happy to know that I have packaged Engauge for you. Get it at the RPMs section of the community site.)


Netflix vs. Amazon Unbox: Netflix still wins

Posted by Tom Moertel Sat, 07 Apr 2007 16:20:00 GMT

When Amazon.com announced its Unbox video-download service, I was skeptical. Compared to the reigning champion – the DVD – Unbox looked like a loser:

  • Unbox burdened its customers with DRM and the annoyances that come with DRM
  • Unbox required the use of a Windows-only player application
  • Unbox movies lacked “standard” DVD features such as surround sound, alternative audio tracks, commentaries, and bloopers

The first two points were deal-breakers, so I wrote off Unbox and did my best to ignore it.

And then Amazon hooked up with TiVo. Beaming movies directly into my TiVo box eliminates the need to deal with DRM and Windows annoyances. My two big concerns sidestepped, I decided to give Unbox another look. I still wouldn’t want to buy Unbox-to-TiVo movies because they lack the typical DVD extras and would tie up storage space on my TiVo, but Unbox might be a decent way to rent the occasional movie – if the price were right.

Is the price right?

That depends on how the price of Unbox compares with the price of my current rental option of choice, Netflix. Both services offer immediate access to good movies: Unbox by on-demand downloads, Netflix by ensuring that I almost always have a DVD or two in the house.

To compare Unbox with Netflix, I had to figure out how much a rental costs me with each service. With Unbox the figuring was easy because each rental has its own price tag, typically $3.99.

With Netflix, it’s a bit trickier because the rental price depends upon how many DVDs I rent in a month. I pay a monthly fee of $17.99 and can rent as many DVDs as I want, at least until the infamous Netflix rate throttle kicks in. To determine how many DVDs I rent during the typical month, I had to download my rental history. (If you’re a Netflix subscriber, you can get your history from the Returned Rentals page.) After downloading my history, massaging it into the desired form, and loading it into R, I generated a stem-and-leaf plot to visualize the number of DVDs I have rented during each of the 76 months I have been a Netflix subscriber:

> stem(monthly.rental.counts, scale=2)

  The decimal point is at the |

   1 | 0
   2 | 000
   3 | 0000000
   4 | 00000000000
   5 | 000000000000
   6 | 000000000000000
   7 | 0000
   8 | 000000
   9 | 00000
  10 | 0000
  11 | 0
  12 | 00
  13 | 00
  14 | 00
  15 | 0

It looks like I have rented as few as one and as many as fifteen DVDs in a month. Most months, however, I rent between three and ten DVDs. On average, I rent about 6.4 DVDs per month:

> summary(monthly.rental.counts)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   4.000   6.000   6.408   8.000  15.000

Thus my average rental price is about $2.80 per DVD:

> 17.99 / 6.4
[1] 2.810937

Now I can make my Unbox-vs-Netflix price comparison. For me, it looks like Unbox is about 40 percent more expensive than Netflix:

> 3.99 / 2.81
[1] 1.419929

So the price of Unbox is not right, at least for me.
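
For anyone who wants to rerun the comparison with their own numbers, the arithmetic above can be gathered into one short script. This is a sketch rather than the original analysis: the monthly counts are rebuilt from the stem-and-leaf plot (one month per leaf), and every variable name except monthly.rental.counts is an assumption introduced here. It also computes a break-even rate, the number of DVDs per month at which the two services cost the same per rental.

```r
# Monthly rental counts rebuilt from the stem-and-leaf plot: each stem is
# a number of DVDs, and each leaf is one month with that many rentals.
months_per_count <- c(1, 3, 7, 11, 12, 15, 4, 6, 5, 4, 1, 2, 2, 2, 1)
monthly.rental.counts <- rep(1:15, times = months_per_count)

netflix_monthly_fee <- 17.99  # flat monthly fee, from the text
unbox_rental_price  <- 3.99   # typical Unbox rental price, from the text

# Average Netflix cost per DVD, and the Unbox price relative to it.
netflix_cost_per_dvd <- netflix_monthly_fee / mean(monthly.rental.counts)
price_ratio <- unbox_rental_price / netflix_cost_per_dvd

# Break-even: DVDs per month at which Netflix costs the same per rental
# as Unbox. In months below this rate, Unbox would have been cheaper.
break_even_dvds <- netflix_monthly_fee / unbox_rental_price
months_below <- sum(monthly.rental.counts < break_even_dvds)

cat(sprintf("Netflix cost per DVD: $%.2f\n", netflix_cost_per_dvd))
cat(sprintf("Unbox / Netflix price ratio: %.2f\n", price_ratio))
cat(sprintf("Break-even rentals per month: %.2f\n", break_even_dvds))
cat(sprintf("Months below break-even: %d of %d\n",
            months_below, length(monthly.rental.counts)))
```

Swapping in a different fee, rental price, or set of monthly counts is enough to redo the comparison for another subscriber.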

Testing Unbox-to-TiVo rentals

Because Amazon is offering free $15 credits to TiVo owners, I decided to give Unbox a test drive. My test rental was The Illusionist. Renting the movie was easy (just one click), and shortly thereafter Unbox automatically downloaded the movie to my TiVo box. When I played the movie, however, I was disappointed with the video quality. I easily noticed banding artifacts, which were distracting at times. On the whole, the viewing experience was inferior to watching a DVD.

Netflix still beats Unbox

For me, then, Unbox is still a loser. It costs more and delivers less than DVD rentals via Netflix.

A note to my friends at Amazon.com

I would be happy to give you my business, but right now you’re not earning it. If you want me as an Unbox customer, here is the recipe for winning me over:

  • Let me easily download movie rentals to my TiVo. (Check.)
  • Offer true DVD quality or better. (You’re not there yet.)
  • Sell the rentals for less than $2.80. (You’re not there yet.)

Until then, I’ll have to give my money to Netflix.

Cheers,
Tom

Update: edits for clarity; added tags.
