<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/stylesheets/rss.css" type="text/css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Tom Moertel's Weblog: Category wondrous oddities</title>
    <link>http://blog.moertel.com/articles/category/wondrous-oddities</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Quality rants on programming theory and stuff geeks like</description>
    <item>
      <title>Wondrous oddities: R's function-call semantics</title>
      <description>&lt;p&gt;Every so often, I am going to write about
&lt;em&gt;wondrous oddities&lt;/em&gt; &amp;#8211; obscure programming-language features
that are so cool they deserve wider notice.
Today, in the first installment, I want to show you the function-call
semantics of &lt;a href="http://www.r-project.org/about.html"&gt;R&lt;/a&gt;, a great system
for statistical computing.&lt;/p&gt;


	&lt;p&gt;You might not expect a statistics system to have a first-class
programming language at its heart, but if you think about it, it does
make sense.  The R language, actually a dialect of the S language, is
described as &amp;#8220;a well-developed, simple and effective programming
language which includes conditionals, loops, user-defined recursive
functions and input and output facilities.&amp;#8221;  All true.  It gives me
the feeling of an infix Lisp or Scheme whose syntax is slanted toward
mathematics and vector operations.  The language has an object layer,
too, but that&amp;#8217;s not why we are here.&lt;/p&gt;


	&lt;p&gt;No, we are here to look at R&amp;#8217;s uncommonly interesting function-call semantics, in particular
argument binding and evaluation.  Let&amp;#8217;s dig in.&lt;/p&gt;&lt;h3&gt;Flexible argument binding&lt;/h3&gt;


	&lt;p&gt;Here is a simple function of two arguments:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;f &amp;lt;- function(tens, ones = tens)
    ones + 10 * tens
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The function &lt;em&gt;f&lt;/em&gt; has two formal arguments, &lt;em&gt;tens&lt;/em&gt; and &lt;em&gt;ones&lt;/em&gt;, the
second of which has a default value, defined to be &lt;em&gt;tens&lt;/em&gt;, referring
back to the first argument.  R lets you call the function like so,
passing in arguments by position:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;f(3, 4)  # 34
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;But you can also specify arguments by name, in any order:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;f(tens=3, ones=4)  # 34
f(ones=4, tens=5)  # 54
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;And, if you leave off the &lt;em&gt;ones&lt;/em&gt; argument, it will get its
value from &lt;em&gt;tens&lt;/em&gt; because of its default definition:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;f(3)       # 33
f(tens=2)  # 22
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Up to this point, you&amp;#8217;re probably thinking that this is nice and all,
but not &amp;#8220;wondrous oddity&amp;#8221; material.  Hold that thought for a moment.&lt;/p&gt;


	&lt;p&gt;Moving on, you can mix positional and named arguments and even shuffle the
argument ordering:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;f(tens=2, 6)       # 26
f(6, tens=2)       # 26
f(ones=9, tens=8)  # 89
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;You can even abbreviate argument names:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;f(tens=2, o=6)  # 26
f(t=3, ones=9)  # 39
f(o=9, t=4)     # 49
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;To explore the full abbreviation semantics, we need a more complex
function:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;g &amp;lt;- function(ones=1, tens=2, hundreds=3, thousands=4)
    ones + 10 * tens + 100 * hundreds + 1000 * thousands
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;You can call the function with no arguments, as expected:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;g()  # 4321
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;But you can&amp;#8217;t get away with an ambiguous argument abbreviation:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;g(t=0) # Error in g(t = 0) :
       # argument 1 matches multiple formal arguments
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;So you must disambiguate:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;g(te=0) # 4301
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;But R is smart enough not to consider an abbreviation ambiguous if
the ambiguity goes away when other arguments are matched exactly:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;g(t=0, thousands=9) # 9301
&lt;/code&gt;&lt;/pre&gt;
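
	&lt;p&gt;One way to watch the binding happen is to have a function return
its own matched call.  The sketch below relies only on base R&amp;#8217;s
&lt;em&gt;match.call&lt;/em&gt;; the function name &lt;em&gt;bound&lt;/em&gt; is made up for
illustration.  R binds exact names first, then unambiguous abbreviations,
and finally fills the remaining formals by position:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;# hypothetical helper: same formals as g, but returns the matched call
bound &amp;lt;- function(ones=1, tens=2, hundreds=3, thousands=4)
    match.call()

bound(t=0, thousands=9)  # bound(tens = 0, thousands = 9)
bound(5, hu=7)           # bound(ones = 5, hundreds = 7)
&lt;/code&gt;&lt;/pre&gt;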

	&lt;p&gt;Before we move on, let&amp;#8217;s review R&amp;#8217;s argument-binding features:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;you can pass arguments by position or by name&lt;/li&gt;
		&lt;li&gt;you can omit arguments that have defaults&lt;/li&gt;
		&lt;li&gt;you can abbreviate argument names&lt;/li&gt;
		&lt;li&gt;you can use any combination of the above features, provided
  the combination results in no ambiguity&lt;/li&gt;
	&lt;/ul&gt;


	&lt;h3&gt;Lazy argument evaluation&lt;/h3&gt;


	&lt;p&gt;Unlike most programming languages, R evaluates bound arguments lazily,
meaning that the expressions you pass as arguments are not converted
into values until they are needed.  This lets you create functions
that act like &lt;a href="http://foldoc.org/foldoc.cgi?query=control+structure&amp;#38;action=Search"&gt;control
structures&lt;/a&gt;.
For example, the following function acts like an if-then-else control
structure:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;myif &amp;lt;- function(test, valT, valF)
    if (test) valT else valF

myif(TRUE, print("true"), print("false"))   # prints "true"
myif(FALSE, print("true"), print("false"))  # prints "false"
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Even though the &lt;em&gt;valT&lt;/em&gt; and &lt;em&gt;valF&lt;/em&gt; arguments are print statements,
they are not evaluated until they are chosen by the test argument.
The unchosen argument is not evaluated at all.&lt;/p&gt;
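

	&lt;p&gt;A quick way to convince yourself, sketched here with the same
&lt;em&gt;myif&lt;/em&gt; definition, is to pass an expression that would raise an
error if it were ever evaluated.  Because the unchosen branch is never
forced, the call succeeds:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;myif &amp;lt;- function(test, valT, valF)
    if (test) valT else valF

# stop() would abort the call if R evaluated it, but it is never forced
myif(TRUE, "fine", stop("never evaluated"))  # "fine"
&lt;/code&gt;&lt;/pre&gt;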


	&lt;p&gt;In contrast, most common languages evaluate arguments before passing
them into functions.  For example, Ruby:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;# Ruby code

def myif(test, valT, valF)
  if (test) then valT else valF; end
end

myif(true, puts("true"), puts("false"))
# prints true *and* false
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Another benefit of R&amp;#8217;s lazy argument evaluation is that you can
provide mutually recursive defaults, which is a great way to implement
adaptive interfaces.  For example, here is a function that computes a
coordinate&amp;#8217;s representation in both &lt;a href="http://mathworld.wolfram.com/CartesianCoordinates.html"&gt;Cartesian&lt;/a&gt; and &lt;a href="http://mathworld.wolfram.com/PolarCoordinates.html"&gt;polar coordinate&lt;/a&gt;
systems.  You can specify the input coordinate in either system, and
the function adapts automatically:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;# R code

polar &amp;lt;- function(x = r * cos(theta), y = r * sin(theta),
                  r = sqrt(x*x + y*y), theta = atan2(y, x))
    c(x, y, r, theta)

polar(1,1)                    # provide (x,y) pair
# 1.0000000 1.0000000 1.4142136 0.7853982

polar(r=sqrt(2), theta=pi/4)  # provide (r, theta) pair
# 1.0000000 1.0000000 1.4142136 0.7853982
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Notice how there was no need for me to test the arguments to see how
the function was called.  All I did was define each set of argument
defaults in terms of the other set of arguments.  R can figure out the
rest based on how the function is called.  That&amp;#8217;s programmer friendly.&lt;/p&gt;
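

	&lt;p&gt;The trick does have a limit: the caller must supply one complete
pair.  If the supplied arguments leave every default waiting on another
default, R detects the cycle and signals an error instead of looping.
Here is a rough sketch of that failure mode; the replacement message in
the handler is mine, not R&amp;#8217;s:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;polar &amp;lt;- function(x = r * cos(theta), y = r * sin(theta),
                  r = sqrt(x*x + y*y), theta = atan2(y, x))
    c(x, y, r, theta)

# x and theta come from different pairs, so y needs r and r needs y
result &amp;lt;- tryCatch(polar(x = 1, theta = pi/4),
                   error = function(e) "error: circular defaults")
result  # "error: circular defaults"
&lt;/code&gt;&lt;/pre&gt;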


	&lt;p&gt;Let&amp;#8217;s review.  R&amp;#8217;s lazy argument evaluation provides cool benefits:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;you can define your own control structures&lt;/li&gt;
		&lt;li&gt;you can provide mutually recursive defaults for arguments, which makes
  smart, flexible interfaces easy&lt;/li&gt;
		&lt;li&gt;if you don&amp;#8217;t use an argument, you don&amp;#8217;t have to pay for R to evaluate it&lt;/li&gt;
	&lt;/ul&gt;


	&lt;h3&gt;Split-horizon scoping&lt;/h3&gt;


	&lt;p&gt;R&amp;#8217;s scoping rules give passed arguments and
default values different perspectives &amp;#8211; split horizons, if you
will.  Passed arguments see what was visible at the time of the call.
No biggie here; every language works this way.  Default values, on the
other hand, see what is inside of the function as it evaluates.  That
means defaults have access to bound arguments &lt;em&gt;and local variables&lt;/em&gt;,
which means you can write functions whose defaults rely upon values
computed &lt;em&gt;in the function body&lt;/em&gt;.&lt;/p&gt;
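

	&lt;p&gt;The two horizons are easy to see side by side.  In this small
sketch (the name &lt;em&gt;horizon&lt;/em&gt; is made up for illustration), the same
expression, &lt;em&gt;x&lt;/em&gt;, means different things depending on whether it
arrives as a default or as a passed argument:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;# hypothetical demo function; where is x looked up?
x &amp;lt;- "caller"

horizon &amp;lt;- function(arg = x) {
    x &amp;lt;- "function body"   # local x, assigned before arg is first used
    arg
}

horizon()   # "function body": the default sees the local x
horizon(x)  # "caller": the passed argument sees the caller's x
&lt;/code&gt;&lt;/pre&gt;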


	&lt;p&gt;This is a great feature that combines with R&amp;#8217;s lazy argument evaluation
to eliminate argument-handling logic.  For example, a lot of R&amp;#8217;s
library code takes advantage of the following idiom:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;myplot &amp;lt;- function(vals, ymin=bnds$ymin, ymax=bnds$ymax) {
    bnds &amp;lt;- compute.bounds(vals)
    # plot the values, constrained by ymin and ymax ...
}
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The &lt;em&gt;myplot&lt;/em&gt; function plots the values you pass it in &lt;em&gt;vals&lt;/em&gt;.  By
default the function scales the plot to show all of the values.  If
you want, however, you can constrain the vertical extent of the plot
by passing in &lt;em&gt;ymin&lt;/em&gt; and/or &lt;em&gt;ymax&lt;/em&gt; arguments.  Note the refreshing
lack of logic to handle the arguments.  The code just gets down to business.&lt;/p&gt;
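

	&lt;p&gt;To try the idiom without a plotting device, here is a runnable
sketch under a couple of assumptions: &lt;em&gt;compute.bounds&lt;/em&gt; is a
hypothetical helper that I have filled in with base R&amp;#8217;s &lt;em&gt;range&lt;/em&gt;,
and &lt;em&gt;limits&lt;/em&gt; is a stand-in for &lt;em&gt;myplot&lt;/em&gt; that returns the
limits rather than drawing anything.  The one rule to respect is that
&lt;em&gt;bnds&lt;/em&gt; must be assigned before &lt;em&gt;ymin&lt;/em&gt; or &lt;em&gt;ymax&lt;/em&gt; is
first used:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;# hypothetical helper: wraps range() in a list with named bounds
compute.bounds &amp;lt;- function(vals) {
    r &amp;lt;- range(vals)
    list(ymin = r[1], ymax = r[2])
}

# stand-in for myplot: returns the limits it would plot with
limits &amp;lt;- function(vals, ymin=bnds$ymin, ymax=bnds$ymax) {
    bnds &amp;lt;- compute.bounds(vals)  # must run before the defaults are forced
    c(ymin, ymax)
}

limits(c(3, 1, 4, 1, 5))             # 1 5
limits(c(3, 1, 4, 1, 5), ymax = 10)  # 1 10
&lt;/code&gt;&lt;/pre&gt;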


	&lt;p&gt;For comparison, here is a Ruby version of the function.  When it comes
to this kind of thing, Ruby is better than most mainstream languages,
but it still makes us do about twice the work that R does:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;def myplot(vals, ymin = nil, ymax = nil)
  bnds = compute_bounds(vals)
  ymin ||= bnds.ymin
  ymax ||= bnds.ymax
  # plot the values, constrained by ymin and ymax ...
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;To recap, R&amp;#8217;s scoping rules, when combined with lazy argument
evaluation, let you shave away tedious argument tests and placeholder
defaults such as &lt;em&gt;nil&lt;/em&gt;.  Instead, you can focus on the core logic,
letting R take care of the argument-handling burdens.  The win might seem
small, but when you write a lot of code, the clarity and code
reduction add up.&lt;/p&gt;


	&lt;h3&gt;That&amp;#8217;s it&lt;/h3&gt;


	&lt;p&gt;So there you have it: a surprisingly sophisticated function-call
semantics that does away with argument-handling tedium.  That
you&amp;#8217;ll find it in a statistics system and not in a mainstream
programming language makes it a &lt;em&gt;wondrous oddity&lt;/em&gt;.&lt;/p&gt;</description>
      <pubDate>Fri, 20 Jan 2006 18:02:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:7b85950a83a444d1317fae802523c404</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2006/01/20/wondrous-oddities-rs-function-call-semantics</link>
      <category>programming languages</category>
      <category>statistics</category>
      <category>wondrous oddities</category>
      <category>R</category>
      <category>languages</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/24</trackback:ping>
    </item>
  </channel>
</rss>
