Perl helps prove universality of 2, 3 Turing machine

Posted by Tom Moertel Fri, 26 Oct 2007 17:23:00 GMT

Alex Smith, a 20-year-old EE student in the UK, proved that the 2, 3 Turing machine is universal. In doing so, he was able to claim the $25,000 prize that Stephen Wolfram offered for the first proof (or disproof) of the 2, 3 machine’s universality.

This story has been getting a lot of attention lately, but one part of the story has not: that the Perl programming language is featured in the proof. In his documentation of the proof, Universality of Wolfram’s 2, 3 Turing Machine, Smith wrote, “I have written several Perl programs, to demonstrate the constructions given in the proof and to interpret the systems given in various conjectures.” Smith’s proof includes no fewer than 7 Perl programs.

Go Perl!


Baker's percentages and how not to explain them

Posted by Tom Moertel Sat, 16 Sep 2006 06:11:00 GMT

I like to bake, and I work in a professional kitchen from time to time, so I picked up The Baker’s Manual, 5th ed., hoping to carry it in my kitchen bag as a quick reference for large-scale recipes.

Before going further, you need to know two things about professional bakers. First, they measure dry ingredients not by volume, the way home bakers do, but by weight, which is both faster and more precise for the large quantities frequently used in professional kitchens. Second, when the pros write bread recipes, they express quantities in relative terms called “baker’s percentages.” Each ingredient’s quantity is given as a percentage of the recipe’s total flour weight. For example, the book provides the following recipe, referred to as a “formula,” in the section on baker’s percentages:

80.0% bread flour
20.0% whole wheat flour
66.0% water
2.0% salt
1.2% yeast

As you would expect, the percentages for bread flour and whole wheat flour add to 100 percent.

Now, here’s where the book goes down in flames. It attempts to explain how baker’s percentages let you easily scale recipes to any desired batch size, but it fails. Utterly. Here’s the book’s explanation for how to scale the above recipe to 300 pounds:

[T]o calculate the weight of each ingredient in the [300-pound] recipe, you add up all of the percentages in the above formula. This total percentage value is 169.2. Divide this number by the desired dough weight, 300 pounds, to get .564. Round this number up to get .6. Then multiply the percentage amount for each ingredient in the above recipe by .6 to obtain the larger weight required by the larger recipe. (Emphasis mine.)

When I read that explanation, I thought, Multiply? That’s the exact opposite of what you ought to do. And, sure enough, the book went on to prove its own explanation completely wrong:

80% bread flour * .6 = 48 pounds
20% whole wheat flour * .6 = 12 pounds
66% water * .6 = 39.6 pounds
2% salt * .6 = 1.2 pounds
1.2% yeast * .6 = .7 pound

Note: the above is quoted verbatim from the book.

Does the “scaled-up” recipe yield 300 pounds? Nope. Add up the resulting weights and you get 101.5 pounds. Oops.
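The check is easy to reproduce. Here is a minimal Python sketch that applies the book's method exactly as described and totals the result. The names `formula`, `book_factor`, and `book_weights` are mine, chosen for illustration; they do not come from the book.

```python
# Baker's percentages from the book's formula (percent of total flour weight).
formula = {
    "bread flour": 80.0,
    "whole wheat flour": 20.0,
    "water": 66.0,
    "salt": 2.0,
    "yeast": 1.2,
}

# The book's method: divide the percentage total by the desired dough weight,
# round the quotient up to one decimal place, then multiply each percentage by it.
total_pct = sum(formula.values())   # about 169.2
quotient = total_pct / 300          # about 0.564
book_factor = 0.6                   # the book's rounded-up value

book_weights = {name: pct * book_factor for name, pct in formula.items()}
book_total = sum(book_weights.values())

print(f"book's total: {book_total:.1f} pounds")
```

The total comes out at roughly a third of the 300 pounds the book was aiming for.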

Is it really that hard to see that the correct method is simply to multiply each percentage by the desired batch size and then divide by the sum of the percentages? In the case of the book’s 300-pound example, we would multiply each percentage in the recipe by the following factor:

300 pounds / 169.2 percent = 177.3 pounds

(That factor is the total flour weight, the amount that corresponds to 100 percent.) Let’s try it out:

80% bread flour * 177.3 pounds = 141.8 pounds
20% whole wheat flour * 177.3 pounds = 35.5 pounds
66% water * 177.3 pounds = 117.0 pounds
2% salt * 177.3 pounds = 3.55 pounds
1.2% yeast * 177.3 pounds = 2.13 pounds

Now if you add up the resulting weights, you get the desired total of 300 pounds.
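For anyone who would rather let a computer do the arithmetic, the correct method fits in a few lines of Python. This is only a sketch: the function name `scale_formula` and its arguments are my own invention. It uses the unrounded sum of the percentages, so its figures can differ slightly from ones rounded by hand.

```python
def scale_formula(formula, batch_weight):
    """Scale a baker's-percentage formula to a desired total batch weight.

    formula maps each ingredient name to its baker's percentage
    (percent of total flour weight). The returned weights are in the
    same unit as batch_weight and sum to batch_weight.
    """
    total_pct = sum(formula.values())
    # Weight that corresponds to one percentage point of the formula.
    weight_per_point = batch_weight / total_pct
    return {name: pct * weight_per_point for name, pct in formula.items()}


formula = {
    "bread flour": 80.0,
    "whole wheat flour": 20.0,
    "water": 66.0,
    "salt": 2.0,
    "yeast": 1.2,
}

weights = scale_formula(formula, 300)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f} pounds")
print(f"total: {sum(weights.values()):.1f} pounds")
```

Because every weight is the same multiple of its percentage, the ratios between ingredients stay exactly as the formula specifies, whatever the batch size.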

That the book not only gets the scaling method completely backward but then goes on to prove itself wrong is amazing. Didn’t anybody at John Wiley & Sons proofread the math?

Not exactly a confidence-builder for the rest of the book.


Open-source statistics: R and ESS

Posted by Tom Moertel Fri, 27 Aug 2004 16:00:00 GMT

Recently, I needed to perform some statistical work. But I didn’t want to use my previous tool of choice, Mathematica, because I decided after my switch to Linux not to rely on proprietary software when viable open-source alternatives existed. And thus I embarked on a short search for open-source statistics software.

R

My search was fruitful, leading me immediately to the delightfully GPL-licensed R Project for Statistical Computing: “R is a language and environment for statistical computing and graphics.” (The R system and language are similar to S, developed at Bell Labs.) The R language has functional-programming semantics (which I love) and supports (among others) the object-oriented style of programming, which is used extensively for R’s statistical interface. Most results in R are delivered in terms of objects, such as tables and vectors and linear models, whose properties you can inspect and manipulate as you would expect. The underlying classes provide specialized methods for common operations so that the objects do the right things in response to generic commands.

Immediately, I was hooked on R. Despite having a steep initial learning curve, R is straightforward to use. Once you get the lay of the land, you can reliably guess what functions and their arguments mean. The help facility is good, too, and can integrate with your web browser if you desire.

And the graphics! Graphs and charts are often the first, best way to size up data sets. R makes it easy to create publication-quality graphs and charts, drawing on any number of supported “graphical devices.” Among the stock devices are postscript, pdf, LaTeX, png, xfig, postscript-rendered bitmaps, and X11 (windows). For a tiny example of R’s graphics, see my posts on Mining gold from the Internet Movie Database.

To make the already-attractive R downright irresistible, the R community offers the Comprehensive R Archive Network (CRAN), the R equivalent of Perl’s CPAN. (One of the CRAN mirrors is hosted by Pittsburgh’s own pair networks.) CRAN provides packages for esoteric methods of analysis, database integration, genetics, time series analysis, HTTP (!), map projections, vegetation science, and myriad others. Additionally, CRAN provides numerous sample data sets, many corresponding to examples and problem sets from popular statistics textbooks. (I should note that R, out of the box, comes loaded with tools and sample data. CRAN isn’t in any way remedial but rather expands R’s initial richness to mind-blowing proportions.)

ESS

Once I started to use R frequently, I grew tired of the command-line interface. That’s where Emacs Speaks Statistics (ESS) comes in. It’s an add-on to Emacs that provides a seamless, rich interface to R (and other statistics packages). Since I live in Emacs, ESS was a natural fit for my working style. Highly recommended. (If you’re interested, I have made a Fedora/RedHat RPM package for ESS. Get it in the RPMs section of the site.)

Summary

If you’re looking for a good statistics system, get R. Now. And if you use Emacs, too, by all means get ESS. (If you just need a few bare-bones tools, however, you might want to check out my tiny statistics tools in Tom’s Perl code on the Community Projects site.)
