Quick tip: converting lines of text to paragraphs (and the reverse)

Posted by Tom Moertel Fri, 16 Jul 2010 14:56:00 GMT

Text documents come in two basic flavors: text editor and word processor. In the text-editor flavor, a document is represented by lines of text, each ending with a line break. Paragraphs are separated by two breaks. In the word-processor flavor, there are no line breaks, only paragraphs (because word processors will “wrap” lines of text as you edit them).

If you need to convert from text-editor flavor (lines) to word-processor flavor (paragraphs), here’s a handy Perl one-liner that will do it:

perl -lp00e's/\n/ /g' input.txt > output.txt

If you’re editing in Emacs, you can convert a selected region of text using the same one-liner via shell-command-on-region:

C-u M-| perl -lp00e's/\n/ /g' RET

To go the other way, from word-processor flavor to text-editor flavor, the Unix command-line tools fold(1) and fmt(1) do the job.
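As a rough sketch of that reverse direction, the commands below wrap one long line at 40 columns. The sample text and the width are arbitrary choices for illustration, and the `long_line` variable name is made up for this example.

```shell
# Hypothetical sample in "word-processor flavor": one paragraph on one long line.
long_line='Text documents come in two basic flavors, and sometimes you need to move a document from one flavor to the other.'

# fold -s breaks each input line at a space so that no output line exceeds
# the width given by -w; it never joins lines together.
printf '%s\n' "$long_line" | fold -s -w 40

# fmt refills the text, also aiming for lines no wider than -w.
printf '%s\n' "$long_line" | fmt -w 40
```

The practical difference is that fold only splits lines that are too long, while fmt refills text and so can also re-wrap lines that are already short. Which one fits depends on whether the input has existing line breaks you want to keep.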


The best-kept secret in programming conferences, especially in a down economy

Posted by Tom Moertel Fri, 22 May 2009 05:59:00 GMT

I know, the economy sucks, and everything is expensive these days. It’s even worse for you, a polyglot programmer with a serious programming-language obsession. You prowl Proggit, lounge at LtU, and occasionally step on over to Stack Overflow. But it’s just not enough. You need more. You need to hang out in meatspace with other fascinating programmers, diving into modern object systems, getting mechanical with crazy VMs, hacking on code like the wild code-hacking beast that you are.

Sure, it’s a nice dream and all, but how are you going to make it happen? And even if you could in theory make it happen, how could you afford to do it now, in this down economy?

Well, my friend, let me share a secret: You can make it happen. And you can afford it. Here’s how: Just be at the 10th Anniversary Yet Another Perl Conference. It’s day upon day upon day of jam-packed programming-language goodness of all sorts, not “just” Perl – and this year it’s the one conference you can afford.

Seriously, I did a little price-checking, and YAPC is about the most underpriced programming-fest on the planet:

Conference    Price
JavaOne       $1,995
RailsConf       $895
PyCon           $450
RubyConf        $200
YAPC            $125


Wait, you’re not into Perl? No problem. The Perl community has always embraced diversity, and there’s a lot more than just Perl at YAPC. Check out the tag cloud for talks and you’ll see what I’m saying. At YAPC, the good stuff comes in enormous buckets, plenty for programming aficionados of all stripes. Here’s a taste:

See, YAPC is for you.

Am I trying to persuade you to join us at YAPC? Yes. But I’m only doing it because I care about you. YAPC is a fascinating conference, packed with hackers from around the world, all eager to share interesting things, things many of you would find delightful, if only you knew about them. So I’m letting you know about them, right now, so you don’t miss out.

Do yourself a favor. If you can figure out how to get your brain to Pittsburgh in the 4th week of June 2009 – yes, only 4 weeks away – then by all means register now for YAPC|10. It’s a great conference at a great price, and it’s something no discriminating hacker ought to be denied.

I hope to see you at YAPC|10.

Update: If any Haskellers are reading this and want to meet up at YAPC, let me know. I’m trying to put together a BOF session.


Perl saved my vacation!

Posted by Tom Moertel Sat, 25 Apr 2009 14:28:00 GMT

My wife and I are on vacation. We spent yesterday at Longwood Gardens in eastern Pennsylvania. It’s beautiful: words don’t do it justice – you really do need to see the photos. That’s why I took about 500 photos yesterday. At least, that’s what I thought until I got back to our bed & breakfast and tried to download the photos from my camera to my laptop.

Crap! About a quarter of the photos were missing! I wasn’t sure what had happened, but I suspected that my budget 16-GB SD card had started throwing bad blocks. There go my priceless vacation memories, right down the technological toilet.

But maybe those photos weren’t irretrievably flushed. Maybe the data behind most of them was still on the SD card, if only I could get at it. Hey, I’ve got Perl on my laptop: I can get at that data.

Here’s how it went down.

First, I scanned the raw blocks of the suspected-faulty SD card, looking for the markers that indicate the start of a JPEG file. My laptop runs Linux, so it was easy to access the blocks. I just read the device file corresponding to the SD card’s filesystem. I used a Perl script to walk through the file in block-sized steps, hunting JPEG headers. Here’s the meat of the script:

while (my $bytes_read = sysread($fh, $buffer, $READ_BLOCKS*$BLOCK_SIZE)) {
    for (my $offset = 0; $offset < $bytes_read; $offset += $BLOCK_SIZE) {
        my $tag = substr($buffer, $offset + 6, 4);
        if (grep $tag eq $_, qw(JFIF Exif)) {
            print $pos + $offset/$BLOCK_SIZE, "\n";  # emit "interesting" block
        }
    }
    $pos += $bytes_read/$BLOCK_SIZE;
}
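The offset of 6 in that substr call follows from how these files begin: a 2-byte start-of-image marker (FF D8), a 2-byte application-segment marker (FF E0 for JFIF, FF E1 for Exif), and a 2-byte segment length, which puts the 4-byte identifier at bytes 6 through 9. Here is a minimal sketch of that check on hand-built header bytes. The `header_tag` name is made up for this illustration, and the length bytes in the samples are placeholders rather than values taken from a real photo.

```shell
# Print the 4-byte identifier found 6 bytes into the data on stdin,
# mirroring substr($buffer, $offset + 6, 4) from the script above.
# (header_tag is a name made up for this sketch.)
header_tag() {
    perl -e 'read(STDIN, my $buf, 10); print substr($buf, 6, 4), "\n"'
}

# FF D8 = start of image, FF E0 = APP0 marker, 00 10 = segment length
printf '\377\330\377\340\000\020JFIF\000' | header_tag   # prints JFIF

# FF E1 = APP1 marker; the two length bytes here are placeholders
printf '\377\330\377\341\000\020Exif\000\000' | header_tag   # prints Exif
```

Because the scan only looks at the start of each block, it relies on files beginning on block boundaries, which is how the script above steps through the device.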

I handled the second half of the rescue mission with another Perl script. This script grabbed the data starting at each interesting block, as determined by the first script, and tried to decode the data losslessly as a JPEG file, writing the result to a new file on my laptop’s hard drive. Here are the tasty bits of that script:

$SIG{PIPE} = 'IGNORE';  # pipe will break on damaged images

while (my $target = <>) {

    # seek forward until we hit the desired block
    while ($pos != $target) {
        my $diff = $target - $pos;
        $diff = $MAX_SEEK_BLOCKS if $diff > $MAX_SEEK_BLOCKS;
        sysseek($fh, $diff * $BLOCK_SIZE, 1);
        $pos += $diff;
    }

    # read the data starting at that block, attempting to decode as JPEG
    if (my $bytes_read = sysread($fh, $buffer, $BLOCK_SIZE*$DATA_READ_SIZE)) {
        my $outfile = sprintf("%s/%010d.jpg", $outdir, $target);
        # list-form pipe open: runs jpegtran directly, without a shell
        open(my $pipe, '|-', 'jpegtran', '-copy', 'all', '-outfile', $outfile)
            or die "can't open pipe: $!";
        print $pipe $buffer;
        close($pipe);
        $pos += $DATA_READ_SIZE;
    }
}

With these two scripts, I was able to retrieve almost all of the lost photos. That’s 120 more priceless memories that I can post to Flickr to annoy my friends. Yay, Perl!

(BTW, if you want to see some of the rescued photos, see my Longwood Gardens, Spring 2009 photo set on Flickr.)


Perlfolk! Get your YAPC|10 talk proposals in now!

Posted by Tom Moertel Tue, 21 Apr 2009 03:03:00 GMT

Hey! All you Perl hackers out there, don’t forget to submit your talk proposals for the 10th Anniversary Yet Another Perl Conference.

Wait, you don’t know about this great opportunity to share cool Perl stuff with your peers? Then, by all means, read all about it. That’s right, you don’t want to miss the chance to give a talk at the big 10th Anniversary YAPC.

So submit a talk or two. But do it now. The deadline is approaching fast.

Seriously, why not submit a talk right now? Don’t put it off: seize the day.

Your pal,
Tom


10th-Anniversary YAPC coming right up!

Posted by Tom Moertel Wed, 25 Feb 2009 02:03:00 GMT

Just a quick note to all the wonderful Perlfolk who are eagerly awaiting news of YAPC|10. (That’s the 10th-anniversary Yet Another Perl Conference, to be held June 22-24, 2009, in Pittsburgh, Pennsylvania, where it all started back in 1999.) Ahem:

Your beloved organizers have been hard at work and will be making some announcements shortly. Stay tuned to yapc10 on twitter for the latest and greatest.

Until then, don’t worry: be cool. When it comes to conference planning, we’re a little less conversation, a little more action. That’s just how we roll.

Hugs and kisses,
Tom

Tag: YAPC::NA::2009


See you at the Pittsburgh Perl Workshop 2008!

Posted by Tom Moertel Thu, 09 Oct 2008 01:04:00 GMT

The 2008 Pittsburgh Perl Workshop is this weekend! I can’t wait. (BTW, there are still seats available. If you can somehow get yourself to Pittsburgh this weekend, by all means, grab a PPW ticket now.)

I’m on the organizing committee, so I get an advance look at the talks, and I’m continually impressed by the quantity and sheer interestingness of the things that the Perl community has to say. When leading members of a community volunteer their time to talk to you about something they’re passionate about, that something is usually fascinating.

This year is no exception. There are tons of talks I want to see. Check out the schedule for Saturday and Sunday, and you’ll see what I mean. (You’ll note that there are even talks on programming GPUs and adorable BUG embedded hardware.)

In addition to technical talks, there are three courses being offered this year. Daniel Klein is once again leading his From Zero To Perl introductory course, which was widely praised at last year’s PPW. Author and Perl trainer Peter Scott is offering Maintaining Code While Staying Sane, which is all about maintaining legacy code, something most programmers must do for a (surprisingly) large chunk of their careers. Finally, the ever-knowledgeable brian d foy is offering his Mastering Perl course for coders interested in learning how to reliably write professional, enterprise-quality Perl programs. (I think there are openings for some of the classes, too. If you’re interested, click one of the links above and try to grab a spot.)

This year we’re expanding on the Hackathons, too. We actually have allocated a dedicated “Hackathon Room” – and we’ve arranged for freshly ground, freshly brewed coffee all day long to fuel the hacking. :-)

All in all, it’s shaping up to be another fun-filled, festive PPW. I hope to see you there!


Perl helps prove universality of 2, 3 Turing machine

Posted by Tom Moertel Fri, 26 Oct 2007 17:23:00 GMT

Alex Smith, a 20-year-old EE student in the UK, proved that the 2, 3 Turing machine is universal. In doing so, he was able to claim the $25,000 prize that Stephen Wolfram offered for the first proof (or disproof) of the 2, 3 machine’s universality.

This story has been getting a lot of attention lately, but one part of the story has not: that the Perl programming language is featured in the proof. In his documentation of the proof, Universality of Wolfram’s 2, 3 Turing Machine, Smith wrote, “I have written several Perl programs, to demonstrate the constructions given in the proof and to interpret the systems given in various conjectures.” Smith’s proof includes no fewer than 7 Perl programs.

Go Perl!


PPW 2007: a twenty-ton can of programming whoop-ass

Posted by Tom Moertel Tue, 25 Sep 2007 23:04:00 GMT

I am on the planning committee for the Pittsburgh Perl Workshop. So far, it’s been an interesting ride. Last year was the first PPW, and it went surprisingly well. In the post-conference surveys, 94 percent of respondents said they wanted to come back for another PPW, so we committed ourselves to repeating the grueling conference-planning process for 2007. (Fact: making big commitments like this is much more likely to happen if you’re drinking beer at the time.)

Now a year has gone by, and PPW 2007 is only three weeks away. This year’s conference is 100% larger – two full days – and offers a new, much-asked-for option: a one-day introductory course to give programmers new to Perl a quick dose of the language so they can dive into the rest of the conference. This year’s conference also offers a full-length Hackathon for those who feel the urge to code at the conference.

The main attraction, however, is the conference’s wide array of technical talks. We have retained the same mix of industry and academic speakers that attendees said they liked so much last year. Indeed, our speaker list includes some of last year’s most fascinating speakers, as well as many new speakers drawn from the world of Perl. No matter what your interests are, you’ll find talks for you at PPW 2007. (I’m particularly interested in the talks on continuation-based web applications, the cool new stuff in Perl 5.10, and the Moose object system.)

All of this is to say: Do not miss PPW 2007! Where else are you going to find so many interesting people, so many fascinating talks, and so many opportunities to have fun and make friends while learning useful stuff, all for so little expense? (Regular admission is only $70, and students get a big discount.) Get your ticket now because over half of the seats are already gone.

I hope to see you there.


Pittsburgh Perl Workshop 2007: Don't miss your chance to speak!

Posted by Tom Moertel Mon, 20 Aug 2007 23:18:00 GMT

This year’s Pittsburgh Perl Workshop is shaping up to be uber-techno-awesome. This year, it’s two big days of lively technical talks and full-force Perl festiveness. Yes, come October, programmers of all stripes will gather in Pittsburgh over the weekend of the 13th to grab a slice of the fun. A big slice. And you – yes you, my friend – should be there.

Lots of interesting talks are flowing in, but it’s not too late to grab a speaking slot. If you have anything interesting to say about Perl, now is your time. 20- and 50-minute slots are available. To claim one, just go to pghpw.org and submit a talk proposal. It’s easy. But act now, before it’s too late!

If you have any interest in Perl, you’ll want to be at PPW 2007, and if you have anything to say about Perl, you’ll definitely want to speak at PPW 2007.

Don’t miss your opportunity. Seize the day!


Talk: Fun with Numbers: R and Perl (and IMDB data)

Posted by Tom Moertel Thu, 21 Jun 2007 18:38:00 GMT

Last week I gave a talk on the R statistics system and Perl for the Pittsburgh Perl Mongers. The example that threaded through the talk was something I have written about here before, extracting useful information from the Internet Movie Database. If you’ve read my earlier blog post or have used the Grand Unified IMDB Movie Rating Decoder Ring, you might find the slides from the talk interesting. They provide some more details about the R and Perl code used to analyze the IMDB data and create the decoder ring.

You can get the slides here:

Title slide from my talk on R and Perl

