Quick tip: converting lines of text to paragraphs (and the reverse)

Posted by Tom Moertel Fri, 16 Jul 2010 14:56:00 GMT

Text documents come in two basic flavors: text editor and word processor. In the text-editor flavor, a document is represented by lines of text, each ending with a line break, and paragraphs are separated by two consecutive breaks (that is, a blank line). In the word-processor flavor, there are no line breaks within a paragraph; each paragraph is one long line, because word processors “wrap” lines of text as you edit them.

If you need to convert from text-editor flavor (lines) to word-processor flavor (paragraphs), here’s a handy Perl one-liner that will do it:

perl -lp00e's/\n/ /g' input.txt > output.txt

If you’re editing in Emacs, you can convert a selected region of text using the same one-liner via shell-command-on-region:

C-u M-| perl -lp00e's/\n/ /g' RET

To go the other way, from word-processor flavor to text-editor flavor, the Unix command-line tools fold(1) and fmt(1) do the job.
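As a rough sketch of the reverse direction, the following wraps one long paragraph with each tool. The sample sentence, the width of 30 columns, and the variable names are arbitrary choices for illustration.

```shell
# One paragraph on a single long line, as a word processor would store it.
# (Hypothetical sample text.)
long='The quick brown fox jumps over the lazy dog and then keeps running through the field.'

# fold -s breaks each line at a space so that no line exceeds the -w width.
folded=$(printf '%s\n' "$long" | fold -s -w 30)
printf '%s\n' "$folded"

# fmt refills the paragraph; -w sets the maximum line width.
filled=$(printf '%s\n' "$long" | fmt -w 30)
printf '%s\n' "$filled"
```

The two tools differ in approach: fold only inserts breaks into the lines it is given, while fmt treats the text as paragraphs and refills them, so fmt is usually the better fit when the input already contains some line breaks.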

Comments

  1. praki said about 8 hours later:

    With Emacs, there is no need to rely on Perl. Just do Ctrl-Shift-Meta-% C-q C-j RET SPACE RET !, which will do what you want.

  2. Tom Moertel said about 9 hours later:

    praki, the Perl incantation preserves paragraph-separating breaks (the -00 switch puts Perl into “paragraph slurping” mode); the Emacs version does not. Interestingly, if you use the Perl one-liner with shell-command-on-region by itself (M-|), it works as expected, but when invoked via C-u M-|, Emacs seems to strip the paragraph breaks before replacing the selection. I wonder what explains the difference in behavior.

  3. Kevin M said 3 days later:

    I just use the *nix tools, ‘fmt’ or ‘fold’, for casual formatting.

    For more precise handling, see ‘par’:

    http://www.nicemice.net/par/
