Posted by Tom Moertel
Tue, 31 Jan 2006 02:57:00 GMT
I love espresso. It’s my favorite way to enjoy coffee. Even so, I
almost never order espresso in coffee shops because, here in the
United States, very few coffee shops have mastered the exacting
process by which espresso is made. Dr. Joseph John of the Josuma
Coffee Company writes that
“more than 95 percent of North American espresso is poorly made, and,
in fact, undrinkable.” My experience with Pittsburgh-area coffee
shops in the last decade provides no evidence to refute Dr. John’s
claim.
If espresso in the United States is so bad, why do Americans drink
enough of it to support a Starbucks on every street corner? The reason
is that Americans drink espresso almost exclusively in the form of
milk-based beverages: cappuccinos, lattes, and mochas. Milk and
flavored syrups are the main attractions. Espresso serves only as a
coffee-flavored backdrop in which bitterness, a characteristic
of poorly made espresso, complements the abundant sweetness of milk
laced with sugar syrups. American coffee-shop owners thus have little
incentive to offer better espresso to their customers – bad espresso
is good enough.
Because of this sad reality, I have developed through hard experience
the following reliable guideline for ordering espresso at American
coffee shops: Don’t. The one exception I make is for new coffee
shops, at which I will try a double espresso, just to see what I get.
Almost always, I get a bad espresso, bitter and watery.
And that is what I had expected back in April 2005, when I spotted the
brand-new sign for Aldo Coffee Co. in my
home town of Mt. Lebanon, Pennsylvania, located in Pittsburgh’s South
Hills. I went in, dragging my wife along, and placed my order.
Then something unusual happened. The barista asked me, somewhat
hopefully it seemed, if I drank espresso regularly. When I said yes, she seemed pleased. When she followed up by
asking me if I read
alt.coffee, I was stunned.
When I observed that she was timing my shot, my brain actually shut
down for a few seconds while it forcibly recalibrated itself to
accommodate the seemingly impossible: that I was standing in a
coffee shop in my home town, conversing with a barista about
alt.coffee, and mere seconds away from receiving what was very likely
to be good espresso.
Read more...
Posted in good stuff, espresso, pittsburgh
Tags aldo, coffee, espresso, mtlebo
9 comments
1 trackback

Posted by Tom Moertel
Tue, 24 Jan 2006 03:39:00 GMT
When I let the dog out this evening, it didn’t take long for her to start barking. Figuring she had cornered the neighbor’s cat, I went outside and called her. Naturally, she ignored my order to come back into the house.
Angrily, I marched up to her, underneath the crabapple tree, and took her by the collar. I made sure to bend low and look her in the eyes, just to let her know that I was not happy about having to walk in the wet grass to fetch her. When I stood up to lead her back to the house, my head reached into the lower branches of the crabapple tree.
And then I saw it, inches from my face, looking right back at me.
Read more...
Posted in photography, interesting stuff
Tags animals, opossums
5 comments
no trackbacks

Posted by Tom Moertel
Sat, 21 Jan 2006 21:04:00 GMT
As you may know, a few months ago I moved my blog from its old system, powered by SnipSnap, to a new system, powered by the delightful and easier-to-hack-on Typo. Now that everything has been running comfortably for a few months, I am going to move some of my old blog’s content over to Typo.
At first I planned on writing a program to handle the move for me. It would pull from SnipSnap’s database, convert the markup of the articles, and drop the results into Typo’s database. After reviewing my old content, however, I have changed my plans.
My new plan is to cherry-pick the most interesting stuff and move it over. Some of the old stuff is too out of date or too tied to SnipSnap’s integrated wiki to be sensibly extracted and integrated into the new blog.
I’m starting the move today. If you see a “new” article that has an old date, you’ll know why.
Cheers,
Tom
Posted in site news
no comments
no trackbacks

Posted by Tom Moertel
Fri, 20 Jan 2006 23:02:00 GMT
Every so often, I am going to write about
wondrous oddities – obscure programming-language features
that are so cool they deserve wider notice.
Today, in the first installment, I want to show you the function-call
semantics of R, a great system
for statistical computing.
You might not expect a statistics system to have a first-class
programming language at its heart, but if you think about it, it does
make sense. The R language, actually a dialect of the S language, is
described as “a well-developed, simple and effective programming
language which includes conditionals, loops, user-defined recursive
functions and input and output facilities.” All true. It gives me
the feeling of an infix Lisp or Scheme whose syntax is slanted toward
mathematics and vector operations. The language has an object layer,
too, but that’s not why we are here.
No, we are here to look at R’s uncommonly interesting function-call semantics, in particular
argument binding and evaluation. Let’s dig in.
Read more...
Posted in programming languages, statistics, wondrous oddities
Tags languages, R, semantics
5 comments
no trackbacks

Posted by Tom Moertel
Wed, 18 Jan 2006 01:59:00 GMT
The Internet Movie Database (IMDb) is a rich source
of online movie information. The problem is, the true gold is buried
deep beneath the site’s user-friendly exterior and hidden within the
database itself. With a little digging, however, we can extract the
gold, nugget by nugget, and learn about fun statistical tools for data
analysis.
Today, in the first part of our analysis, we will put our intuition
about rating systems to the test. We will decode IMDb “user ratings,”
those numbers such as 6.1 and 7.8 that summarize how the registered
users of the IMDb rated movies on a scale from 1 to 10, typically
depicted as a series of stars on the screen.
We will extract the collective wisdom of registered IMDb users in
order to convert a movie’s user rating into the movie’s standing
within the database. This gives us a good indicator of how the movie
stacks up against other movies in general, and that’s good information
to have when deciding which movies to see in the theater or add to
your Netflix list.
Ready to start digging? Let’s go!
Read more...
Posted in movies, statistics
Tags imdb, movies, R, statistics
9 comments
no trackbacks

Posted by Tom Moertel
Mon, 16 Jan 2006 06:34:00 GMT
I noticed that my site has been picking up more comment spam recently.
Typo has built-in spam protection, but for
some reason a few spam comments that ought to have been caught slipped
through its filters. Curious, I investigated.
Most spam comments contain links to sites favored by the spammers.
The sites are almost always of the form x.domain.com,
where domain is one of a few higher-level domains and x is drawn
from a large set of values from the realms of gambling, pornography,
and male enhancement. It seems that the spammers pay for a few real
domains and then create a ton of subdomains under them.
One of the ways to detect comment spam is to find URIs in comments and
look up the sites they point to in DNS-based
SURBLs,
such as multi.surbl.org and
bsb.empty.us. The thing is, when SURBLs list a
spammy site x.domain.com, sometimes they list it under the full
hostname x.domain.com and sometimes they list it
under the higher-level domain
domain.com. To be safe, Typo looks up both forms when it checks
for spam.
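To make the two forms concrete, here is a minimal sketch of how the query names could be built. The helper name surbl_query_names is hypothetical, and taking the last two labels as the "higher-level domain" is a simplifying assumption that would be wrong for names such as x.domain.co.uk.

```ruby
# Hypothetical helper, not part of Typo: returns the DNS names to look up
# for one host against one list, covering both the full hostname and the
# higher-level domain.
def surbl_query_names(host, rbl, domain_labels = 2)
  labels = host.split('.')
  # Assumption: the registered domain is the last `domain_labels` labels.
  domain = labels.last(domain_labels).join('.')
  # uniq avoids querying twice when the host already is the bare domain.
  [host, domain].uniq.map { |name| "#{name}.#{rbl}" }
end

names = surbl_query_names('x.domain.com', 'multi.surbl.org')
puts names
```

This only illustrates the names being queried; Typo builds them inline.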
Here’s the code it uses:
HOST_RBLS.each do |rbl|
  begin
    if [
      IPSocket.getaddress([host, rbl].join('.')),
      IPSocket.getaddress((domain + [rbl]).join('.'))
    ].include?("127.0.0.2")
      throw :hit, "#{rbl} positively resolved #{domain.join('.')}"
    end
  rescue SocketError
  end
end
The code iterates over the list of SURBLs it has and queries each
twice – once for the host and once for the domain in question – saving
the results of the queries in an array. Then if the array includes a
positive response (127.0.0.2), it throws a “hit” notice to the
calling code, which will block the associated comment.
Unfortunately, the code doesn’t quite work as intended. Although a
positive response for either the host or the domain should register
as a hit, the code in effect requires both queries to resolve
successfully. As a result, the code yields a lot of false negatives
because most lists don’t include both host and domain forms of spammy
sites; the required double positive is thus hard to obtain.
The cause of the problem is the attempt to query for both forms of the
site before checking either response. The queries are performed by
calling IPSocket.getaddress, which performs a DNS query
for the “A” record associated with its argument. If the record
exists, the call returns it; otherwise, the call raises a
SocketError exception.
The exception is what causes the logic to break down. When either the
host or domain is not in the queried SURBL, which will almost always
be the case for reasons I explained earlier, one of the queries will
result in a SocketError exception. The exception will be
caught by the rescue clause later in the code, but not
before the opportunity to test the other query’s response and throw a
“hit” has been lost.
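The failure can be reproduced without touching the network. In the sketch below, fake_getaddress and the LISTED table are made-up stand-ins for IPSocket.getaddress and a real SURBL; only the shape of the check mirrors the original.

```ruby
require 'socket' # defines SocketError

# Made-up list data: one.example lists only the domain form,
# two.example lists both forms.
LISTED = {
  'domain.com.one.example'   => '127.0.0.2',
  'x.domain.com.two.example' => '127.0.0.2',
  'domain.com.two.example'   => '127.0.0.2'
}

# Stand-in for IPSocket.getaddress: returns the "A" record or raises.
def fake_getaddress(name)
  LISTED.fetch(name) { raise SocketError, "no A record for #{name}" }
end

# Same structure as the original check: both lookups sit inside one
# array literal, so an exception from either one abandons the whole test.
def buggy_check(host, domain, rbl)
  [
    fake_getaddress("#{host}.#{rbl}"),
    fake_getaddress("#{domain}.#{rbl}")
  ].include?('127.0.0.2')
rescue SocketError
  false
end

missed = buggy_check('x.domain.com', 'domain.com', 'one.example')
caught = buggy_check('x.domain.com', 'domain.com', 'two.example')
puts "domain-only listing detected? #{missed}"
puts "double listing detected?      #{caught}"
```

With the domain-only listing, the host lookup raises before the domain's positive answer is ever examined, so the spam goes undetected. The stub only mimics the failure; the Typo code shown earlier is what needs fixing.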
My fix was to replace the above code with a call to a new helper
method:
query_rbls(HOST_RBLS, host, domain.join('.'))
The helper, defined later, makes the actual queries:
def query_rbls(rbls, *subdomains)
  rbls.each do |rbl|
    subdomains.uniq.each do |d|
      begin
        response = IPSocket.getaddress([d, rbl].join('.'))
        throw :hit, "#{rbl} positively resolved #{d} => #{response}"
      rescue SocketError
      end
    end
  end
  return false
end
Because some SURBLs don’t use 127.0.0.2 but some other “A” record to
indicate a positive response, my helper removes the hard-coded address
test.
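For a sense of how a caller might consume the result, here is a rough sketch. It assumes the calling code wraps the check in catch(:hit), and it uses a variant of the helper, query_rbls_with, that takes the lookup as a parameter so the example runs offline. The variant, the RESOLVER lambda, and the FAKE_ZONE data are all hypothetical; the real helper calls IPSocket.getaddress directly.

```ruby
require 'socket' # defines SocketError

# Made-up list contents; note the non-127.0.0.2 positive response.
FAKE_ZONE = { 'domain.com.rbl.example' => '127.0.0.4' }

# Stand-in resolver: returns the "A" record or raises SocketError.
RESOLVER = lambda do |name|
  FAKE_ZONE.fetch(name) { raise SocketError, "no A record for #{name}" }
end

# Same logic as query_rbls, with the lookup injected for testing.
def query_rbls_with(resolver, rbls, *subdomains)
  rbls.each do |rbl|
    subdomains.uniq.each do |d|
      begin
        response = resolver.call([d, rbl].join('.'))
        # Any successful resolution counts as a hit.
        throw :hit, "#{rbl} positively resolved #{d} => #{response}"
      rescue SocketError
        # Not listed under this name; try the next one.
      end
    end
  end
  false
end

# catch returns the thrown message on a hit, or the block's value (false).
result = catch(:hit) do
  query_rbls_with(RESOLVER, ['rbl.example'], 'x.domain.com', 'domain.com')
end
puts result
```

Because each lookup has its own rescue, a miss on the host form no longer prevents the domain form from being checked.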
I also made a few more improvements to the spam-protection
code. The full set of changes is available as Patch
657 on the Typo Trac site.
Posted in typo
Tags ruby, spam, typo
no comments
no trackbacks
