Posted by Tom Moertel
Wed, 25 Apr 2007 18:07:00 GMT
Just a quick note for folks using the R statistics
system on Fedora
Linux. I have packaged for Fedora a
bunch of R packages from CRAN. (R
packages have to be packaged again, as RPM packages, to integrate with
Fedora Linux.)
My initial goal was to package
arm,
which contains tools for working with various regression models.
(This package accompanies Andrew Gelman and Jennifer Hill’s wonderful
book Data Analysis Using Regression and Multilevel/Hierarchical Models.)
Packaging “arm,” however, quickly snowballed into packaging a bunch of
prerequisites. Thankfully, I have now completed that task and can
share the fruits of my labor with you.
All in all, to install “arm,” you will need the following RPMs:
- R-arm-1.0-2
- R-car-1.2-1
- R-lme4-0.9975-1
- R-Matrix-0.9975-1
- R-R2WinBUGS-2.0-1
The following RPMs are optional (but you will need them if you
want to rebuild the RPMs):
- R-coda-0.10-1
- R-leaps-2.7-1
- R-mlmRev-0.995-1
You can download the packages from the RPMs
section of the Community
Projects site. Better yet, you can use Yum to
download them for you. Just add the moertel-community
Yum repository to your /etc/yum.repos.d directory (see RPMs for the recipe) and then use the
following command:
$ sudo yum install R-arm
Yum will automatically resolve dependencies and install the required
packages. If you want any of the optional packages, add them after
“R-arm” on the command line.
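Once Yum finishes, a quick check from inside R can confirm that the packages landed where R can see them. The following is a minimal sketch, assuming the RPMs install into R's default library; `missing_packages` is a hypothetical helper written for this example, not something the RPMs provide.

```r
# Packages the post lists as required for "arm"
# (R package names, without the "R-" prefix used by the RPMs).
required <- c("arm", "car", "lme4", "Matrix", "R2WinBUGS")

# Return the required packages that are absent from `installed`.
# By default, `installed` is whatever R finds in its library paths.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

missing <- missing_packages(required)
if (length(missing) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing: ", paste(missing, collapse = ", "))
}
```

If anything shows up as missing, rerunning the Yum command with that package's RPM name (for example, R-car) is the natural next step.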
I have built the packages for Fedora Core 6 on the x86_64 architecture, but the
RPM specs are available
if you want to rebuild the packages for other architectures. (See
the instructions for rebuilding RPMs for help.)
Caveat:
I’m not sure that the R-R2WinBUGS package is fully functional. It
depends on BRugs, which doesn’t yet build on the Linux platform. To
get around this problem, I made R-R2WinBUGS’s dependency on BRugs
weak; the first package no longer requires the second to install.
Posted in statistics, linux
Tags fedora, R, rpms, statistics

Posted by Tom Moertel
Tue, 17 Apr 2007 07:45:00 GMT
Today I wanted to extract the data that were visualized in a
chart I saw on Seth Roberts’s blog. That is, I had a picture of a data set, and I wanted the numbers behind the picture.
This task turned out to be surprisingly easy – once I found Engauge Digitizer, an open-source (GPL) tool made for this very task. After I launched Engauge, the digitization process was straightforward:
- I established the chart’s coordinate system by clicking in the corners and entering the associated coordinates.
- Then I had Engauge identify data points. With the mouse, I selected a data point by hand, teaching Engauge what a point looks like. Then Engauge identified spots on the chart that looked like data points and locked on to them. I was able to step through the points to tell Engauge to skip the few it misidentified.
- I manually selected a few more data points that were scrunched into blobs and had eluded Engauge’s point-detection heuristics.
- Finally, I exported the data set in CSV format.
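The first step, tying clicked positions to known coordinates, is what makes the rest possible. The sketch below shows the general idea for a chart with linear, axis-aligned axes. It is not Engauge's implementation. `px_to_value` is a hypothetical helper, and every pixel position and axis value here is made up for illustration.

```r
# Map pixel positions to chart values along one axis, assuming the axis
# is linear. `ref_px` holds two reference pixel positions and `ref_val`
# holds the chart values entered for those two positions.
px_to_value <- function(px, ref_px, ref_val) {
  scale <- (ref_val[2] - ref_val[1]) / (ref_px[2] - ref_px[1])
  ref_val[1] + (px - ref_px[1]) * scale
}

# Made-up calibration: pixel column 50 is x = 0 and column 450 is x = 100;
# pixel row 400 is y = 0 and row 40 is y = 30 (image rows grow downward).
x <- px_to_value(c(50, 250, 450), ref_px = c(50, 450), ref_val = c(0, 100))
y <- px_to_value(c(400, 220, 40), ref_px = c(400, 40), ref_val = c(0, 30))

print(data.frame(x = x, y = y))
```

Because image rows count downward while chart values count upward, the y scale comes out negative. The same formula handles both axes without special cases.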
If you ever need to extract the data behind a chart, do check out
Engauge Digitizer. (If you use Fedora Linux,
you’ll be happy to know that I have packaged Engauge for you.
Get it at the RPMs section
of the community site.)
Posted in statistics
Tags charts, data, fedora, plots, rpms, statistics, tools

Posted by Tom Moertel
Sat, 07 Apr 2007 16:20:00 GMT
When Amazon.com announced its Unbox video-download service, I was skeptical. Compared to the reigning champion – the DVD – Unbox looked like a loser:
- Unbox burdened its customers with DRM and the annoyances that come with DRM
- Unbox required the use of a Windows-only player application
- Unbox movies lacked “standard” DVD features such as surround sound, alternative audio tracks, commentaries, and bloopers
The first two points were deal-breakers, so I wrote off Unbox and did my
best to ignore it.
And then Amazon hooked up with TiVo. Beaming movies directly into my
TiVo box eliminates the need to deal with DRM and Windows annoyances.
My two big concerns sidestepped, I decided to give Unbox another
look. I still wouldn’t want to buy Unbox-to-TiVo movies because
they lack the typical DVD extras and would tie up storage
space on my TiVo, but Unbox might be a decent way to rent the
occasional movie – if the price were right.
Is the price right?
That depends on how the price of Unbox compares with the price
of my current rental option of choice, Netflix. Both services offer immediate
access to good movies: Unbox by on-demand downloads, Netflix by
ensuring that I almost always have a DVD or two in the house.
To compare Unbox with Netflix, I had to figure out how much a
rental costs me with each service. With Unbox the figuring was easy
because each rental has its own price tag, typically $3.99.
With Netflix, it’s a bit trickier because the rental price depends
upon how many DVDs I rent in a month. I pay a monthly fee of $17.99
and can rent as many DVDs as I want, at least until the infamous
Netflix rate throttle kicks in.
To determine how
many DVDs I rent during the typical month, I had to download my
rental history. (If you’re a Netflix subscriber, you can get your
history from the Returned
Rentals page.)
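Turning a rental history into monthly counts takes only a few lines of R. This is a minimal sketch, assuming the history has already been reduced to one date per rental. The dates below are made up, and the post does not say which date (shipped or returned) was used for the tally.

```r
# Made-up dates standing in for a real rental history, one per rental.
returned <- as.Date(c("2007-01-05", "2007-01-19", "2007-01-30",
                      "2007-02-11", "2007-03-02", "2007-03-28"))

# Label each rental with its year and month, then count rentals per label.
counts_by_month <- table(format(returned, "%Y-%m"))

print(counts_by_month)
```

One thing to watch: a month with no rentals at all would be absent from the table rather than counted as zero, so it would need to be added back before computing an average.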
After downloading my history, massaging it into the desired form, and
loading it into R, I generated a
stem-and-leaf plot to visualize the number of DVDs I have rented
during each of the 76 months I have been a Netflix subscriber:
> stem(monthly.rental.counts, scale=2)
  The decimal point is at the |

   1 | 0
   2 | 000
   3 | 0000000
   4 | 00000000000
   5 | 000000000000
   6 | 000000000000000
   7 | 0000
   8 | 000000
   9 | 00000
  10 | 0000
  11 | 0
  12 | 00
  13 | 00
  14 | 00
  15 | 0
It looks like I have rented as few as one and as many as fifteen DVDs in a
month. Most months, however, I rent between three and ten DVDs. On
average, I rent about 6.4 DVDs per month:
> summary(monthly.rental.counts)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   4.000   6.000   6.408   8.000  15.000
Thus my average rental price is about $2.80 per DVD:
> 17.99 / 6.4
[1] 2.810937
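The raw data are not published, but the stem-and-leaf plot carries enough information to rebuild them: each leaf is one month. The sketch below reconstructs the counts from the plot and repeats the arithmetic. The variable names are made up for this example.

```r
# Each leaf in the stem-and-leaf plot is one month, so the number of leaves
# on each stem says how many months had that many rentals.
rentals_per_month <- 1:15
months_with_count <- c(1, 3, 7, 11, 12, 15, 4, 6, 5, 4, 1, 2, 2, 2, 1)

# Rebuild a vector with one entry per month.
counts <- rep(rentals_per_month, times = months_with_count)

length(counts)         # number of months
mean(counts)           # average rentals per month
17.99 / mean(counts)   # average price per rental, in dollars
```

The rebuilt vector has 76 entries and a mean of about 6.41, matching the summary. Dividing by the unrounded mean instead of 6.4 still gives a price per rental of roughly $2.81.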
Now I can make my Unbox-vs-Netflix price comparison. For me, it
looks like Unbox is about 40 percent more expensive than
Netflix:
> 3.99 / 2.81
[1] 1.419929
So the price of Unbox is not right, at least for me.
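The same numbers also give a break-even point: how few rentals in a month it would take for Unbox to come out cheaper. This is a rough sketch using only the prices quoted in the post and the monthly counts read off the stem-and-leaf plot. The variable names are made up for this example.

```r
monthly_fee  <- 17.99   # Netflix monthly fee from the post
unbox_rental <- 3.99    # typical Unbox rental price from the post

# Rentals per month at which both services cost the same per rental.
break_even <- monthly_fee / unbox_rental

# Monthly counts rebuilt from the stem-and-leaf plot (one leaf per month).
counts <- rep(1:15, times = c(1, 3, 7, 11, 12, 15, 4, 6, 5, 4, 1, 2, 2, 2, 1))

# Fraction of months in which Unbox would have been the cheaper option.
share_unbox_cheaper <- mean(counts < break_even)

break_even
share_unbox_cheaper
```

The break-even point is about 4.5 rentals a month. Only 22 of the 76 months fall below it, so Unbox would have been cheaper in fewer than a third of them.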
Testing Unbox-to-TiVo rentals
Because Amazon is offering free $15 credits to TiVo owners, I decided
to give Unbox a test drive. My test rental was The Illusionist. Renting the movie was
easy (just one click), and shortly thereafter Unbox automatically
downloaded the movie to my TiVo box. When I played the movie,
however, I was disappointed with the video quality. I easily
noticed banding artifacts, which were distracting
at times. On the whole, the viewing experience was inferior to watching a
DVD.
Netflix still beats Unbox
For me, then, Unbox is still a loser. It costs more and delivers
less than DVD rentals via Netflix.
A note to my friends at Amazon.com
I would be happy to give you my business, but right now you’re not
earning it. If you
want me as an Unbox customer, here is the recipe for winning me over:
- Let me easily download movie rentals to my TiVo. (Check.)
- Offer true DVD quality or better. (You’re not there yet.)
- Sell the rentals for less than $2.80. (You’re not there yet.)
Until then, I’ll have to give my money to Netflix.
Cheers,
Tom
Update: edits for clarity; added tags.
Posted in reviews
Tags amazon, dvds, movies, netflix, rentals, reviews, tivo, unbox
