Procedures

Procedure definitions are program-level constructs for defining functions and other global constants.

In common usage, the term “procedure” is synonymous with “function definition”, because most procedures define functions. Mars also lets you use procedures to specify other global constants.

All procedures create a global constant, which is just a non-modifiable variable with a type and a value. The nature of the global constant depends on the form of the procedure.

There are two forms of procedure:

  • A procedure with parameters is a function definition. These create global constants which are functions. The global constant’s value may be passed around without executing the procedure. When the value is applied, the procedure is executed.
  • A procedure without parameters is a computable global constant. These create global constants with arbitrary types. At runtime, the value of the global constant is computed by executing the procedure (it is the value returned by the procedure).

Note

A computable global constant is not necessarily not a function. It may have a function type, if the procedure itself returns a function value.

Procedure definitions have the following syntax:

proceduredef ::=  proc_head ":" NEWLINE
                      INDENT var_decl* statement+ DEDENT
                  | proc_head "=" expression

Procedure header

The procedure header provides the public interface to the procedure:

proc_head   ::=  "def" proc_name [proc_params] "::" type
proc_params ::=  "(" [proc_param ("," proc_param)*] ")"
proc_param  ::=  param_name "::" type
proc_name   ::=  lower_identifier
param_name  ::=  lower_identifier

The procedure header specifies the following information:

  • The name of the procedure.
  • Whether the procedure has parameters (ie. whether it is a function definition or a computable global constant definition).
  • The name and type of each formal parameter.
  • The return type of the procedure.

Functions may optionally have zero or more parameters.

Computable global constants

A procedure without parameters (a computable global constant) has a head of the following form:

def f :: t

This defines a global constant of type t. The constant’s value is the result of evaluating the procedure. The time at which the procedure is evaluated is unspecified. It may be evaluated once at load time, once when it is first used, or upon each usage (depending on the implementation).

Note

Because of the fact that the evaluation time is undefined, computable global constant definitions should not perform side-effects or produce errors. If a zero-parameter procedure is to perform side-effects, it should be declared as a function with zero arguments (a nullary function), thereby making its evaluation time explicit.

Function definitions

A procedure with parameters (a function definition) has a head of the following form:

def f(a1 :: t1, a2 :: t2, ..., an :: tn) :: t

where n >= 0.

This defines a global constant of type (t1, t2, ..., tn) -> t. The constant’s value is a function object which executes the body of the procedure when all arguments are applied.

Unlike computable global constants, the body of a function definition is not executed until it is explicitly applied.

A function with zero arguments is a nullary function.

Note

It is legal, and useful, to define a nullary function (with a head of the form def f() :: t). This is distinct from a computable global constant, because it will have type () -> t, not just t, and will require explicit application, of the form f().

The names a1, a2, ..., an are the names of the function’s formal parameters. These names are not part of the function’s interface (they make no difference to the caller). They are used in the function body, as the names of local-scoped variables of the specified type, and take the value of the corresponding actual parameter at runtime.

It is a compiler error for two parameters to have the same name.

Procedure body

There are two ways to define a procedure – as a single expression, or with a full body.

The “full body” syntax is the most general, and allows for procedures with local variables, and an arbitrary number of statements. It is written in the form:

def f(args) :: rettype:
    declaration 1
    ...
    declaration n

    statement 1
    ...
    statement n

The “single expression” syntax is a short-hand notation, only allowing procedures with no local variables, and a single expression to be evaluated. It is written in the form:

def f(args) :: rettype = expression

This is syntactic sugar for:

def f(args) :: rettype:
    return expression

Local variables

Mars distinguishes between local and global variables. All local variables are scoped to the entire procedure body (there is no block scope). A name refers to a local variable if and only if it:

  • Is a formal parameter, or
  • Is explicitly declared with the var keyword, or
  • Appears on the left-hand side of an assignment statement, or
  • Appears in a pattern.

Local variables may be given the same name as a global constant or function. In this case, the local variable shadows the global constant of the same name.

Local variables may optionally be explicitly declared. For simplicity, all declarations must appear before any statements. Declarations have the following syntax:

var_decl ::=  "var" var_name "::" type NEWLINE
var_name ::=  lower_identifier

Note that formal parameters are considered local variables, and must not be re-declared. It is a compiler error if a variable is declared with the same name as a formal parameter, or if two local variables are declared with the same name.

It is currently an error to declare a local variable with a type which includes a type variable not found in the procedure header. This is generally not a useful thing (since type variables are monomorphic; see type unification), but if it is desired, the variable cannot be explicitly declared.

Local variables (other than parameters) are not initialised with any particular value. The compiler is required to guarantee that variables are not used uninitialised (see semantic errors).

Procedure evaluation

The evaluation of procedures is the heart of the Mars execution model. Mars procedures are evaluated eagerly, and arguments are passed by-value.

Global constant references

If a global constant is named in a variable reference expression, the expression must evaluate to a first-class value.

If the constant is a computable global constant, that value must be the result of evaluating the procedure (which should be a constant [1]). The time at which the procedure is evaluated, and the number of times it is evaluated, is unspecified, and left up to the implementation. Implementations may (but are not limited to) choose one or more of the following evaluation strategies:

  • Evaluate some computable global constants at compile time [2].
  • Evaluate all computable global constants at program load time.
  • Evaluate each computable global constant when it is first referenced, and memoize the result.
  • Evaluate each computable global constant every time it is referenced.

If the constant was defined with a function definition, the expression’s value is a function object which will call the specified function when applied with all parameters.

Function objects and currying

A function object is an opaque first-class value of type (t1, t2, ..., tm) -> t, where m >= 0. Internally, function objects have n formal parameters, the first n - m of which are bound to actual values, and the remaining m of which are unbound (where 0 <= m <= n).

Since Mars features partial function application, a mechanism known as currying takes place. This mechanism is as follows:

When a function is partially applied (using the ... notation), it is applied to i parameters, where i <= m. No evaluation takes place. Instead, the result is of the application is the given function object, but with its first i unbound formal parameters bound to the given actual parameters. The result has type (ti+1, ti+2, ..., tm) -> t.

When a function is fully applied, it must be applied to exactly m parameters. The result of the application is the evaluation of the function with the final m formal parameter bound to the given actual parameters. The result has type t.

Parameter binding

Functions have n formal parameters. A formal parameter is a special local variable, which is bound to a value before the execution of the function’s body. The value which a formal parameter is bound to is called the actual parameter.

Formal parameters may be bound well in advance of the actual evaluation of the function, due to the currying mechanism outlined above. The bound values must be stored in an implementation-defined manner.

As Mars is a call-by-value language, actual parameters are evaluated before being bound to formal parameters. The binding process is to first evaluate the actual parameter expression, and then bind its value to the formal parameter name. The binding is the same as for an assignment statement – it is shallow, and the formal parameter has the same identity as the actual parameter.

The call-by-value mechanism guarantees that direct modification of the value will not affect the caller. Specifically, if the callee re-assigns one of its formal parameters with a new value, that change shall not affect the original value from which the parameter was derived. Nor shall it affect the binding in the callee’s function object (so subsequent uses of the same function object will keep the original binding).

Note

This does not hold up for indirect modification of a value. Mars allows impure constructs such as array element and field update. Since the formal parameter has the same identity as the original actual parameter, making destructive updates to a record or array in the formal parameter will also update the actual parameter, along with any other aliases.

Body execution

The evaluation of a procedure is a matter of executing each statement in the procedure’s body sequentially, according to the execution rules for that kind of statement, until a return statement is encountered.

The return statement’s expression is evaluated, and its value is used as the result of the expression which caused the procedure to be evaluated.

Semantic errors

There are several cases which cause a procedure to be rejected at compilation time:

  • If there is any control-flow path which results in a local variable being referenced before being assigned, the compiler MUST reject the procedure and the program. This doesn’t apply to formal parameters, which are always bound.
  • If there is any control-flow path which does not result in a return statement being encountered, the compiler MUST reject the procedure and the program.

Warning

This second rule is perhaps over-zealous, as it forbids control-flow paths which raise errors but do not return values. A more detailed analysis should be proposed, including noting which procedures always produce errors.

Footnotes

[1]The result of a computable global constant should be a constant, but might not be, if the procedure performs side-effects such as user input. The behaviour of computable global constants with side-effects is undefined, so for all intents and purposes, “computable global constants” may indeed simply be treated as constants.
[2]Implementors are to keep in mind that compilation must terminate in a finite amount of time, while computable global constants may not terminate. Therefore, this evaluation strategy must only be employed conservatively, if evaluation is guaranteed to terminate.