Keyword Parameters
Abstract
In order to better support embedded DSLs (for prettyprinting, visualization, type-checking, …) we have two options:
- Support full-fledged syntactic extensibility.
- Add some more flexibility to our data notation so that embedded DSLs that follow the interpreter pattern (i.e., the DSL is actually a data value that is evaluated by a user-defined interpreter) can be reasonably supported.
Here we explore the second alternative.
Current situation
For constructors and function declarations, the following parameter mechanisms are supported:
- Positional parameters: “standard” formal parameters like int x, but also patterns.
- Variable list of parameters (at the last parameter position):
int x …
We have experience with two libraries that use extensive lists of options and settings:
- Box has many options (indentation, token color, font, etc.), all with a default value.
- This is also the case for the Figure library.
One also sees this in, for instance, chart libraries where many options (title, subtitle, x-label, y-label, etc.) have a default value that can be overridden by the caller. It is mandatory that we provide a usable solution for this.
Another data point is that we often use data constructors to model abstract data in which the order of the model elements is irrelevant:
- UML2- or FAMIX-like meta-models, such as
data Class = class(str name, set[Field] fields, set[Method] methods)
Proposal
Notably missing are
- Keyword parameters: these parameters have a name, may be used in arbitrary order and (may) have a default value.
- Optional arguments.
Let’s study how Python solves this. In Python formal parameters can be
- Named, positional parameters optionally followed by a default value. A parameter without a default value is a _required_ parameter; the others are _optional_.
- Formal parameter *x denotes a variable-length parameter (in a function call, *x transforms a list into positional arguments).
- Formal parameter **x denotes a variable map of name/value pairs (in a function call, **x transforms a map into keyword arguments).
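As a brief illustration of the two starred forms, here is a minimal sketch for Python 3. The function report and its sep option are made up for this example and do not come from any library:

```python
def report(title, *values, **options):
    # values collects the extra positional arguments into a tuple,
    # options collects the extra keyword arguments into a dict.
    sep = options.get("sep", ", ")
    return title + ": " + sep.join(str(v) for v in values)


nums = [1, 2, 3]
opts = {"sep": " | "}

# In a call, *nums is unpacked into positional arguments and
# **opts is unpacked into keyword arguments.
plain = report("n", *nums)
piped = report("n", *nums, **opts)
```

Here plain is "n: 1, 2, 3" and piped is "n: 1 | 2 | 3".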
In Python, function calls can have positional arguments that match the order of the formal parameters, keyword arguments of the form kwarg=val, or a mixture of both. Here is an example (from the Python tutorial):
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print("-- This parrot wouldn't", action, end=' ')
    print("if you put", voltage, "volts through it.")
    print("-- Lovely plumage, the", type)
    print("-- It's", state, "!")
This function accepts one required argument (voltage) and three optional arguments (state, action, and type). It can be called in any of the following ways:
parrot(1000) # 1 positional argument
parrot(voltage=1000) # 1 keyword argument
parrot(voltage=1000000, action='VOOOOOM') # 2 keyword arguments
parrot(action='VOOOOOM', voltage=1000000) # 2 keyword arguments
parrot('a million', 'bereft of life', 'jump') # 3 positional arguments
parrot('a thousand', state='pushing up the daisies') # 1 positional, 1 keyword
but all the following calls would be invalid:
parrot() # required argument missing
parrot(voltage=5.0, 'dead') # non-keyword argument after a keyword argument
parrot(110, voltage=220) # duplicate value for the same argument
parrot(actor='John Cleese') # unknown keyword argument
The following rules apply:
- Required arguments need to be present.
- Positional arguments come first and may be followed by keyword arguments.
- Required arguments may also be given as keyword arguments.
- An argument may only be bound once.
- A keyword argument should have a known keyword.
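These rules can be checked mechanically. The sketch below assumes Python 3 and uses a simplified parrot that returns its arguments instead of printing them. The helper rejected is hypothetical: it takes a call as source text, because an argument-order violation is already reported when the call is compiled, before anything runs:

```python
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    # Simplified stand-in: return the bound arguments instead of printing.
    return (voltage, state, action, type)


def rejected(call):
    """Return the name of the exception a call raises, or None if it is valid."""
    try:
        # compile() reports ordering violations (SyntaxError);
        # eval() reports binding violations (TypeError).
        eval(compile(call, "<call>", "eval"), {"parrot": parrot})
    except (TypeError, SyntaxError) as exc:
        return type(exc).__name__
    return None


results = {
    "missing required argument": rejected("parrot()"),
    "positional after keyword": rejected("parrot(voltage=5.0, 'dead')"),
    "argument bound twice": rejected("parrot(110, voltage=220)"),
    "unknown keyword": rejected("parrot(actor='John Cleese')"),
    "valid call": rejected("parrot(action='VOOOOOM', voltage=1000000)"),
}
```

The ordering rule is enforced syntactically (SyntaxError), while the other three violations are detected when the arguments are bound (TypeError). A typed setting could report more of these statically.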
Python has more mechanisms such as splicing in a list as positional parameters (see * above) and passing a dictionary of key/value pairs (see ** above), but this is less likely to work in our typed setting.
Keyword arguments for Rascal
The proposal is to implement the main parts of Python’s keyword arguments. The reasons are as follows:
- It solves the problem of functions with extensive options and settings.
- The order of the formal parameters is completely determined. This is convenient for ADTs like parse trees.
- At the call site arguments (even positional ones) may occur in any order if keywords are used.
- This approach eliminates the need to introduce a separate record type; records are subsumed by the proposed approach.
Example: Figure library
In the Figure library, ellipse is a figure defined by two constructors:
data Figure = ellipse(list[FProperty] props)
| ellipse(Figure inner, list[FProperty] props) | ... ;
In other words, an ellipse takes either only a list of properties, or an inner figure and a list of properties. FProperty is also a data type with many constructors, including:
data FProperty = … | shrink(real f) | fillColor(str c) | …;
These are examples:
b1 = ellipse(shrink(0.8), fillColor("green"));
b0 = ellipse(b1, size(150,50), fillColor("lightGray"));
Reformulation 1: repeat keyword parameters
Given this data declaration:
data Figure = ellipse(Figure inner=emptyFigure(), real shrink = 1.0, str fillColor = "white",
str lineColor = "black", ...)
The examples can be formulated as:
b1 = ellipse(shrink=0.8, fillColor="green");
b0 = ellipse(inner=b1, size= <150,50>, fillColor="lightGray");
At the call site this solution is perfect, but at the data declaration site it is not: the keyword parameters have to be repeated for each constructor, which is bad from a maintenance perspective.
Reformulation 2
Given these data declarations:
data FProperty = props(..., real shrink = 1.0, str fillColor = "white",
str lineColor = "black", …);
data Figure = ellipse(Figure inner=emptyFigure(), FProperty props = props())
The examples can be formulated as:
b1 = ellipse(props=props(shrink=0.8, fillColor="green"));
b0 = ellipse(inner=b1, props=props(size=<150,50>, fillColor="lightGray"));
At the call site this solution is bearable but not very nice, since every call has to wrap its properties in props=props(...). At the declaration site, the props argument has to be repeated for every constructor.
Reformulation 3 (with proposal)
What we need is a mechanism to either add keyword parameters to all constructors of a datatype or to splice in the keywords of another constructor. The first approach looks like this:
data Figure(..., real shrink = 1.0, str fillColor = "white", str lineColor = "black", ...)
= ellipse(Figure inner=emptyFigure()) | ...;
Here the keyword parameters following the datatype name (Figure) are added to each constructor of Figure.
The second alternative looks like this:
data FProperty = props(..., real shrink = 1.0, str fillColor = "white",
str lineColor = "black", ...);
data Figure = ellipse(Figure inner=emptyFigure(), **props)
| ... ;
Here the keyword parameters of another constructor (props) are spliced into the constructor declaration.
With both solutions the examples can be formulated as desired:
b1 = ellipse(shrink=0.8, fillColor="green");
b0 = ellipse(inner=b1, size= <150,50>, fillColor="lightGray");
The proposal is to use the first approach:
- In data declarations, the name of the datatype may be followed by a list of keyword parameters.
- This list can be extended in subsequent declarations, i.e., data Figure(str lineStyle = "dotted"); adds a keyword parameter to an existing ADT.
- The keywords in this list are added automatically to each constructor of the ADT.
- Per constructor, specific keywords can be defined.
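To make the intended behaviour concrete, here is a rough Python model of these four rules. It is only a sketch under stated assumptions: the class ADT and its methods extend, constructor and make are hypothetical names, types are not modelled, and None stands in for emptyFigure():

```python
class ADT:
    """Hypothetical model of a data type with shared keyword parameters."""

    def __init__(self, name, **common):
        self.name = name
        self.common = dict(common)   # keyword defaults shared by all constructors
        self.constructors = {}       # constructor name -> its own keyword defaults

    def extend(self, **more):
        # Models a later declaration such as: data Figure(str lineStyle = "dotted");
        self.common.update(more)

    def constructor(self, cname, **own):
        # Constructor-specific keyword parameters.
        self.constructors[cname] = dict(own)

    def make(self, cname, **given):
        # Shared keywords are looked up at construction time, so keywords
        # added by a later extend() reach every constructor automatically.
        defaults = {**self.common, **self.constructors[cname]}
        unknown = set(given) - set(defaults)
        if unknown:
            raise TypeError("unknown keyword(s): " + ", ".join(sorted(unknown)))
        return (cname, {**defaults, **given})


figure = ADT("Figure", shrink=1.0, fillColor="white", lineColor="black")
figure.constructor("ellipse", inner=None)   # None stands in for emptyFigure()
figure.extend(lineStyle="dotted")           # declared after ellipse

b1 = figure.make("ellipse", shrink=0.8, fillColor="green")
```

In this model b1 carries lineStyle="dotted" even though that keyword was added after ellipse was declared, which is the behaviour the third rule asks for.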
Implications for the Rascal Implementation
Syntax
The syntax needs to be adapted in five places:
- Function declaration.
- Data declaration.
- Constructor declaration.
- Function call.
- Pattern matching.
For function and constructor declarations, each formal parameter may now optionally be followed by “= Expr”. The data declaration is extended to allow declaring keyword parameters shared by all constructors.
Program Data base (PDB)
Default values have to be added to ADTs.
- When printing ADT values, arguments are printed in order and keyword parameters with a value equal to their default value are suppressed.
- Keyword parameters have to be added to abstract INodes as well.
- Keyword parameters have to be added to IConstructors.
- The ConstructorType has to be extended with types for the keyword parameters.
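The printing rule in the first item can be sketched as follows. This is an illustration in Python, not the PDB API: the function show and its parameters are hypothetical, and a value is represented simply as a name, a list of positional arguments and a map of keyword arguments:

```python
def show(name, positional, keywords, defaults):
    """Render a constructor value, omitting keyword parameters at their default."""
    parts = [repr(p) for p in positional]       # positional arguments, in order
    for key in defaults:                        # keywords in declaration order
        value = keywords.get(key, defaults[key])
        if value != defaults[key]:              # suppress default-valued keywords
            parts.append(key + "=" + repr(value))
    return name + "(" + ", ".join(parts) + ")"


defaults = {"shrink": 1.0, "fillColor": "white"}
text = show("ellipse", [], {"shrink": 0.8, "fillColor": "white"}, defaults)
```

Here text is "ellipse(shrink=0.8)": fillColor is omitted because it equals its default.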
Interpretation of call expression
The interpretation of calls has to be extended as follows:
- The function name determines the set of actual functions to be called.
- The values of all (positional and keyword) parameters are computed.
- The function to be called from the set of functions is determined as follows:
** Reorder the parameters so that positional parameters that are referred to by name are placed in their proper position.
** Match the actual values of positional parameters with the corresponding formal positional parameters.
** Determine if the function supports all the used keyword names.
- If this yields a unique choice, then call the selected function and
** Bind positional formal parameters to the corresponding actual parameter value.
** Bind keyword formal parameters to the actual values that are given as actual keyword arguments.
** Bind the remaining formal keyword parameters to their default value.
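The selection and binding steps above can be sketched in Python. This is a simplified model, not the interpreter's implementation: the function resolve and the candidate representation (name, positional parameters as (name, type) pairs, keyword defaults) are assumptions made for this example, and matching is reduced to an isinstance check:

```python
def resolve(candidates, positional, keywords):
    """Pick the unique candidate accepting the call; return (name, bindings)."""
    matches = []
    for name, params, defaults in candidates:
        if len(positional) > len(params):
            continue                                  # too many positional arguments
        kw = dict(keywords)
        args = list(positional)
        # An argument may only be bound once.
        if any(pname in kw for pname, _ in params[:len(args)]):
            continue
        # Reorder: positional parameters referred to by name go to their position.
        rest = params[len(args):]
        if any(pname not in kw for pname, _ in rest):
            continue                                  # a positional parameter is unbound
        args += [kw.pop(pname) for pname, _ in rest]
        # Match actual values against the formal positional parameters.
        if not all(isinstance(a, t) for a, (_, t) in zip(args, params)):
            continue
        # The function must support all keyword names that are used.
        if not set(kw) <= set(defaults):
            continue
        bindings = {pname: a for (pname, _), a in zip(params, args)}
        bindings.update(defaults)                     # defaults for unused keywords
        bindings.update(kw)                           # explicit keyword arguments win
        matches.append((name, bindings))
    if len(matches) != 1:
        raise TypeError("no unique function for this call")
    return matches[0]


# Two hypothetical overloads of a function called size.
candidates = [
    ("size(int)", [("n", int)], {"unit": "px"}),
    ("size(str)", [("s", str)], {"unit": "px", "font": "serif"}),
]
by_position = resolve(candidates, [3], {})
by_name = resolve(candidates, [], {"s": "abc", "font": "mono"})
```

In this model by_position selects size(int) and binds unit to its default, while by_name selects size(str) because only that overload has a positional parameter named s and a keyword named font. A call such as size(3, font="mono") matches neither overload and is rejected.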
Pattern Matching
The matching of abstract patterns has to be adapted to perform a similar search as sketched for calls.
Also, pattern matching must be extended to be able to ignore keyword parameters. This is slightly different from the call semantics, where the default values are used: in a pattern we want to ignore the values of keyword parameters that are not named in the pattern. This later allows keyword parameters to be used instead of annotations.
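The difference with calls can be illustrated with a small Python sketch. It assumes flat patterns without variables or nesting, and represents both patterns and values as (name, positional, keywords) triples; the function matches is hypothetical:

```python
def matches(pattern, value):
    """Match a flat pattern against a value, both (name, positional, keywords).

    Keyword parameters that the pattern does not name are ignored;
    no default values are filled in on the pattern side.
    """
    pname, ppos, pkw = pattern
    vname, vpos, vkw = value
    if pname != vname or list(ppos) != list(vpos):
        return False
    # Only the keywords named in the pattern are compared.
    return all(key in vkw and vkw[key] == wanted for key, wanted in pkw.items())


value = ("ellipse", [], {"shrink": 0.8, "fillColor": "green"})

any_ellipse = matches(("ellipse", [], {}), value)
green_ellipse = matches(("ellipse", [], {"fillColor": "green"}), value)
white_ellipse = matches(("ellipse", [], {"fillColor": "white"}), value)
```

Here any_ellipse and green_ellipse are True and white_ellipse is False. Under call semantics the pattern ellipse() would instead stand for an ellipse with all keywords at their defaults and would not match this value.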
Potential Benefits
In addition to a much more versatile parameter mechanism, keyword parameters can also play the role of annotations and even make annotations obsolete. This can have _large benefits for achieving maximal sharing_. However, this depends on the interpretation of pattern matching. The keyword parameters must be ignorable.
Potential Pitfalls
- The proposed solution adds overhead to both function calls and pattern matching.
- The proposed “common” keyword parameters and the way to add keyword parameters introduces a form of inheritance. Is it sufficient for our needs?
- Ignoring keyword parameters by default while pattern matching is unusual and may be surprising.
Alternative solutions
- Use annotations to attach optional values to ADT values.
- Introduce a record type (which has named, unordered, fields).