<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/stylesheets/rss.css" type="text/css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Tom Moertel's Weblog: Tag data</title>
    <link>http://blog.moertel.com/articles/tag/data?tag=data</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Quality rants on programming theory and stuff geeks like</description>
    <item>
      <title>Engauge Digitizer: a handy tool for extracting data from charts</title>
      <description>&lt;p&gt;Today I wanted to extract the data that were visualized in a
chart I saw on &lt;a href="http://www.blog.sethroberts.net/2007/04/14/omega-3-and-arithmetic-continued/"&gt;Seth Roberts&amp;#8217;s blog&lt;/a&gt;.  That is, I had a &lt;em&gt;picture&lt;/em&gt; of a data set, and I wanted the numbers behind the picture.&lt;/p&gt;


	&lt;p&gt;This task turned out to be surprisingly easy &amp;#8211; once I found &lt;a href="http://digitizer.sourceforge.net/"&gt;Engauge Digitizer&lt;/a&gt;, an open-source (GPL) tool made for this very task.  After I launched Engauge, the digitization process was straightforward:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;I established the chart&amp;#8217;s coordinate system by clicking in the corners and entering the associated coordinates.&lt;/li&gt;
		&lt;li&gt;Then I had Engauge identify data points.  With the mouse, I selected a data point by hand, teaching Engauge what a point looks like. Engauge then identified spots on the chart that looked like data points and locked on to them.  I stepped through the points and told Engauge to skip the few it had misidentified.&lt;/li&gt;
		&lt;li&gt;I manually selected a few more data points that were scrunched into blobs and had eluded Engauge&amp;#8217;s point-detection heuristics.&lt;/li&gt;
		&lt;li&gt;Finally, I exported the data set in &lt;span class="caps"&gt;CSV&lt;/span&gt; format.&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;If you ever need to extract the data behind a chart, do check out
Engauge Digitizer.  (If you use &lt;a href="http://fedoraproject.org/"&gt;Fedora Linux&lt;/a&gt;,
you&amp;#8217;ll be happy to know that I have packaged Engauge for you.
Get it at the &lt;a href="http://community.moertel.com/ss/space/RPMs"&gt;RPMs section&lt;/a&gt; 
of the community site.)&lt;/p&gt;</description>
      <pubDate>Tue, 17 Apr 2007 03:45:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:f8b0d9b7-7322-4d32-bed8-6f5ded82940f</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2007/04/17/engauge-digitizer-a-handy-tool-for-extracting-data-from-charts</link>
      <category>statistics</category>
      <category>fedora</category>
      <category>data</category>
      <category>charts</category>
      <category>plots</category>
      <category>rpms</category>
      <category>tools</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/441</trackback:ping>
    </item>
    <item>
      <title>The IMDB Movie Rating Decoder Ring: updated w/ 2 March 2007 data</title>
      <description>&lt;p&gt;If you want to get more out of &lt;a href="http://imdb.com/"&gt;&lt;span class="caps"&gt;IMDB&lt;/span&gt;&lt;/a&gt; movie ratings, check out my
&lt;a href="http://community.moertel.com/ss/space/IMDB+Movie-Rating+Decoder+Ring"&gt;&lt;span class="caps"&gt;IMDB&lt;/span&gt; Movie Rating Decoder Ring&lt;/a&gt;, now updated with fresher data (as of 2 March 2007).&lt;/p&gt;</description>
      <pubDate>Fri, 09 Mar 2007 17:40:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:f75cbc12-2c78-4a30-9863-968dc535d1a3</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2007/03/09/the-imdb-movie-rating-decoder-ring-updated-w-2-march-2007-data</link>
      <category>statistics</category>
      <category>imdb</category>
      <category>movies</category>
      <category>decoder_ring</category>
      <category>ratings</category>
      <category>stars</category>
      <category>data</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/409</trackback:ping>
    </item>
  </channel>
</rss>
