Shortcuts in Science: Reading Notes on Metaprogramming in R

Quoted from the book "Metaprogramming in R"

Metaprogramming is when you write programs that manipulate other programs; in other words, you treat code as data that you can generate, analyze, or modify.

R is a very high-level language where all operations are functions, and all functions are data that you can manipulate.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Chapter 1 Anatomy of a function

There are three parts to a function: its formal parameters, its body, and the environment it is defined in. The functions formals(), body(), and environment() give you these.

formals() returns a pair-list, which behaves like a list, where the element names are the parameter names and the values are the default values.

You cannot use the missing function to check for a missing value in the result of formals (that function is useful only inside a function call, and in any case there is a difference between a missing parameter and one that doesn’t have a default value), but you can always check whether the value is the empty symbol.
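As a small sketch of the empty-symbol check, the snippet below inspects the formals of a hypothetical function f (the name and its parameters are made up for illustration). It compares each entry against quote(expr = ), which is one common way to write the empty symbol.

```r
# f is a hypothetical example: x has no default, y has one.
f <- function(x, y = 2 * x) x + y

params <- formals(f)
param_names <- names(params)   # the parameter names, in order

# A parameter without a default holds the empty symbol.
no_default_x <- identical(params$x, quote(expr = ))
no_default_y <- identical(params$y, quote(expr = ))

# The default for y is stored as an unevaluated call, not as a number.
default_y <- params$y
```

Here no_default_x is TRUE and no_default_y is FALSE, and default_y is the call 2 * x.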

The function body is an expression.

When a function is called, R sets up an environment for it to evaluate this expression in; this environment is called the evaluation environment for the function call.
The evaluation environment is first populated with values for the function’s formal parameters, either provided in the function call or given as default parameters, and then the body executes inside this environment.
Assignments will modify this local environment unless you use the <<- operator, and the result of the function is the last expression evaluated in the body. This is either the last expression in a sequence or an expression explicitly given to the return function.

eval(body(f), list(x = 2))

The eval function evaluates an expression and uses the second argument to look up the variables in it. You can give it an environment, and the expression will then be evaluated in it, or you can use a list.
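A minimal sketch of both forms, using a hypothetical function f whose body refers to x and to a free variable y:

```r
# f is a hypothetical example; y is not a parameter of f.
f <- function(x) x^2 + y

# Look up x and y in a list.
from_list <- eval(body(f), list(x = 2, y = 1))   # 2^2 + 1

# Look up x and y in an environment instead.
env <- new.env()
assign("x", 3, envir = env)
assign("y", 1, envir = env)
from_env <- eval(body(f), env)                   # 3^2 + 1
```

Note that f itself is never called here; only its body is evaluated, with the variables supplied from outside.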

When a function, f, is created, it gets associated with environment(f). This environment is the environment where f is defined. When f is invoked, R creates an evaluation environment for f; let’s call it evalenv. The parent of evalenv is set to environment(f). Since environment(f) is the environment where f is defined, having it as the parent of the evaluation environment means that the body of f can see its enclosing scope if f is a closure.

After the evaluation environment is created, the formals of f are added to it as promises. Default parameters will be promises that should be evaluated in the evaluation scope, evalenv. This means they can refer to other local variables or formal parameters. Since these will be put in evalenv and since evalenv’s parent is environment(f), these promises can also refer to variables in the scope where f was defined. Expressions given to f where it is called, however, will be stored as promises that should be evaluated in the calling environment. Let’s call that callenv.
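The difference between the two kinds of promises can be seen in a short sketch. The function f and the variable z are hypothetical names chosen so that the same expression, 2 * z, means different things depending on where it is evaluated.

```r
f <- function(x, y = 2 * z) {
  z <- 10        # local variable, visible to the default for y
  x + y
}

z <- 1           # a different z, in the calling scope

# x is the caller's z (1); y falls back to its default, evaluated
# inside the function where z is 10, so y is 20.
with_default <- f(x = z)

# Here 2 * z is written by the caller, so it uses the caller's z: y is 2.
with_actual <- f(x = z, y = 2 * z)
```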

To set up the promises, the delayedAssign function takes two environments as arguments. The first is the environment where the promise should be evaluated, and the second is where it should be stored.
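A small sketch of delayedAssign with separate environments; eval_env and store_env are illustrative names, and the named arguments eval.env and assign.env are the ones from the base R signature.

```r
eval_env  <- new.env()   # where the promise's expression will be evaluated
store_env <- new.env()   # where the promise itself is stored

assign("a", 5, envir = eval_env)
delayedAssign("p", a * 2, eval.env = eval_env, assign.env = store_env)

# The promise has not been forced yet, so this change is still picked up.
assign("a", 7, envir = eval_env)

p_value <- get("p", envir = store_env)   # forces the promise: 7 * 2
```

The promise lives only in store_env; eval_env merely supplies the value of a when the promise is forced.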

list() doesn’t like empty values: a call such as list(x = ) is an error, so it cannot describe a parameter without a default.

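The alist() function creates a list whose arguments are not evaluated and may be left empty, which is exactly the shape formal arguments need. Formal arguments themselves are stored internally as a pair-list; alist() returns an ordinary list, which is converted when you assign it as the formals of a function. If you start hacking around with modifying parameters of a function, it is the one to use.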

Using alist, expressions are also automatically quoted.
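To make the contrast concrete, here is a sketch that replaces the formals of a hypothetical function g using alist, and shows that the equivalent list call fails.

```r
# alist keeps 2 * x unevaluated and allows x to have no default.
args <- alist(x = , y = 2 * x)

# g is a hypothetical function that starts with no parameters.
g <- function() x + y
formals(g) <- args
result <- g(3)            # x is 3, y defaults to 2 * 3

# list() cannot express an empty value; the call signals an error.
list_attempt <- tryCatch(list(x = ), error = function(e) e)
```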

Constructing Functions

as.function() takes an alist as input and interprets the last element in it as the new function’s body and the rest as the formal arguments.

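If you give as.function a plain list instead, the elements are evaluated when the list is built, so the last element arrives as a value rather than as an unevaluated expression, and that value becomes the body of the new function.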

The environment of the new function is by default the environment in which you call as.function. So to make a closure, you can just call as.function inside another function.
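A sketch of the closure idea, with a hypothetical factory function make_power. Because as.function is called inside make_power, the new function can see n.

```r
make_power <- function(n) {
  # The last element, x^n, becomes the body; x becomes the only parameter.
  # The new function's environment is this call's evaluation environment.
  as.function(alist(x = , x^n))
}

square <- make_power(2)
cube   <- make_power(3)

sq <- square(4)   # 4^2
cb <- cube(2)     # 2^3
```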

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Chapter 2 Inside a Function Call

Getting the Components of the Current Function
Functions do not have names as such; you give functions names when you assign them to variables, but that is a property of the environment where you have the name, not of the function itself.

To get hold of the current function, you can use the function sys.function. This function gives you the definition of the current function, which is what you need, not its name.

When you call any of formals, body, or environment, you don’t use a function name as the first parameter; you give each of them a reference to a function, and they get the function definition from that.
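As an illustration, the hypothetical function f below inspects its own definition through sys.function, without ever referring to the name f. Calling it under a second name gives the same answer, since the name is not part of the function.

```r
f <- function(x, y = 2) {
  me <- sys.function()     # the definition of the function being run
  list(params = names(formals(me)), body = body(me))
}

info <- f(1)

# The same function under another name behaves identically.
h <- f
info_h <- h(1)
```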

Accessing Actual Function Parameters
There is a difference between a function definition, a description of what a function should do when it is called, and a function instantiation (the actual running code). One such difference is the evaluation environment. Another is that a function instantiation has actual parameters, while a function definition has only formal parameters. The latter are part of the function definition; the former are provided by the caller of the function.

If you actually want the arguments passed to the current function in the form of the promises they are really represented as, you need to get hold of them without evaluating them. substitute() substitutes into an expression the values that variables refer to. This means that variables are replaced by the verbatim expressions; the expressions are not evaluated before they are substituted into an expression.

The substituted expression is not evaluated.

If you set up default parameters that depend on others, you just get them substituted with variable names; you do not get the value assigned to other variables.

You can call substitute with an expression instead of a single variable.
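A short sketch of these points, using a hypothetical function f. The variables a and b do not need to exist, because nothing is evaluated.

```r
# y has a default that depends on x.
f <- function(x, y = 2 * x) substitute(x + y)

# x is replaced by the caller's expression, y by its default expression.
e <- f(a + b)
e_class <- class(e)
```

The result e is the unevaluated call a + b + 2 * x: the default for y is inserted verbatim, with x left as a name rather than replaced by the argument.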

A common use for substitute is to get the expression provided to a function as a string.

deparse() takes an expression and translates it into its text representation.
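Combining the two gives the usual idiom for turning an argument into a label. The function name show_expr is made up for this sketch.

```r
show_expr <- function(x) {
  expr <- substitute(x)   # the unevaluated expression the caller wrote
  deparse(expr)           # its text representation
}

label <- show_expr(a + 2 * b)   # a and b need not exist
```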

The actual type of object returned by substitute depends on the expression you give the function and what the expression’s variables refer to. If the expression, after variables have been substituted, is a simple type, that is what substitute returns.

Inside a function, if you give substitute a local variable you have assigned to, you get its value back. This is not because substitute does anything special here. Local variables like these are not promises; you evaluated an expression when you assigned to one.

If we call substitute in the global environment, it considers variables as names and does not substitute them for their values.
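The three cases can be compared side by side in a small sketch; f, some_name, and the variable names inside are hypothetical.

```r
f <- function(x) {
  y <- 2 * 3     # evaluated at assignment, so y holds a plain value
  list(arg = substitute(x), local = substitute(y))
}

res <- f(a + b)
# res$arg is the call a + b; res$local is the number 6.

# Outside a function, the variable is left as a name.
top <- substitute(some_name)
```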

A call object refers to an unevaluated function call.

You can translate a call into a list to get its components, and you can evaluate it to invoke the actual function call.

You can also create call objects manually using the call function. The first argument to call is the name of the function to call, given as a string, and any additional arguments are passed on to this function when the call object is evaluated.

Unlike substitute inside a function, however, the arguments to call are evaluated when the call object is constructed. These are not lazy-evaluated.
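A sketch of building, inspecting, and evaluating a call object. It also contrasts call with quote, which leaves variables unevaluated.

```r
x <- 4
cl <- call("sum", 1, 2, x)   # x is evaluated now, so the call stores 4
x <- 100                     # changing x later does not affect cl

parts  <- as.list(cl)        # the function name followed by the arguments
cl_val <- eval(cl)           # 1 + 2 + 4

q     <- quote(sum(1, 2, x)) # quote keeps x as a name
q_val <- eval(q)             # 1 + 2 + 100
```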

From inside a function, you can get the call used to invoke it using the match.call function.
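For example, with a hypothetical function f that simply returns its own call:

```r
f <- function(x, y = 2) match.call()

# z does not need to exist; the arguments are never evaluated.
cl <- f(1 + 1, y = z)
arg_names <- names(cl)
```

The result is the call f(x = 1 + 1, y = z): match.call fills in the full argument names, so positional arguments become named ones.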

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Accessing the Calling Scope

If you want direct access to the calling environment, inside a function, you can get hold of it using the function parent.frame().
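A minimal sketch of reaching into the caller's scope; set_in_caller and caller are hypothetical names.

```r
set_in_caller <- function(name, value) {
  caller_env <- parent.frame()     # the environment we were called from
  assign(name, value, envir = caller_env)
  invisible(NULL)
}

caller <- function() {
  set_in_caller("counter", 1)      # creates counter in this function's scope
  counter
}

out <- caller()
```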

To keep an expression unevaluated, use expression().
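A short sketch: the expression is stored as is, and you can evaluate it later against whatever values you choose.

```r
ex <- expression(a + b)     # nothing is evaluated; a and b need not exist

first <- ex[[1]]            # the stored call a + b
value <- eval(first, list(a = 1, b = 2))
```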

Thursday 20 July 2017