Version: 4.0.2

 

1.2 Syntax Model

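The syntax of a Scheme program is defined by a read phase, which processes a character stream into a syntax object, and an expand phase, which processes a syntax object to produce one that is fully parsed.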

For details on the read phase, see The Reader. Source code is normally read in read-syntax mode, which produces a syntax object.

The expand phase recursively processes a syntax object to produce a complete parse of the program. Binding information in a syntax object drives the expansion process, and when the expansion process encounters a binding form, it extends syntax objects for sub-expressions with new binding information.

1.2.1 Identifiers and Binding

Identifiers and Binding in Guide: PLT Scheme introduces binding.

An identifier is a source-program entity. Parsing (i.e., expanding) a Scheme program reveals that some identifiers correspond to variables, some refer to syntactic forms, and some are quoted to produce a symbol or a syntax object.

An identifier binds another (i.e., it is a binding) when the former is parsed as a variable and the latter is parsed as a reference to the former; the latter is bound. The scope of a binding is the set of source forms to which it applies. The environment of a form is the set of bindings whose scope includes the form. A binding for a sub-expression shadows any bindings (i.e., it is shadowing) in its environment, so that uses of an identifier refer to the shadowing binding. A top-level binding is a binding from a definition at the top-level; a module binding is a binding from a definition in a module; and a local binding is any other kind of binding.

For example, as a bit of source, the text

  (let ([x 5]) x)

includes two identifiers: let and x (which appears twice). When this source is parsed in a typical environment, x turns out to represent a variable (unlike let). In particular, the first x binds the second x.
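As a small illustration of scope and shadowing, the following sketch nests two local bindings of x inside a top-level one. The name show-shadowing is hypothetical and used only for this example.

```scheme
(define x 1)               ; top-level binding of x

(define (show-shadowing)
  (let ([x 2])             ; local binding; shadows the top-level x
    (let ([x (+ x 10)])    ; the x in (+ x 10) refers to the enclosing x (2)
      x)))                 ; refers to the innermost x

(show-shadowing)           ; => 12
x                          ; => 1, the top-level binding is unaffected
```

Each reference resolves to the innermost binding whose scope includes it, which is why the right-hand side of the inner let still sees the value 2.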

Throughout the documentation, identifiers are typeset to suggest the way that they are parsed. A black, boldface identifier like lambda indicates a reference to a syntactic form. A plain blue identifier like x is a variable or a reference to an unspecified top-level variable. A hyperlinked identifier cons is a reference to a specific top-level variable.

Every binding has a phase level in which it can be referenced, where a phase level normally corresponds to an integer (but the special label phase level does not correspond to an integer). Phase level 0 corresponds to the run time of the enclosing module (or the run time of top-level expressions). Bindings in phase level 0 constitute the base environment. Phase level 1 corresponds to the time during which the enclosing module (or top-level expression) is expanded; bindings in phase level 1 constitute the transformer environment. Phase level -1 corresponds to the run time of a different module for which the enclosing module is imported for use at phase level 1 (relative to the importing module); bindings in phase level -1 constitute the template environment. The label phase level does not correspond to any execution time; it is used to track bindings (e.g., to identifiers within documentation) without implying an execution dependency.

If an identifier has a local binding, then it is the same for all phase levels, though the reference is allowed only at a particular phase level. Attempting to reference a local binding in a different phase level than the binding’s context produces a syntax error. If an identifier has a top-level binding or module binding, then it can have different such bindings in different phase levels.

1.2.2 Syntax Objects

A syntax object combines a simpler Scheme value, such as a symbol or pair, with lexical information about bindings, source-location information, syntax properties, and syntax certificates. In particular, an identifier is represented as a syntax object that combines a symbol with lexical and other information.

For example, a car identifier might have lexical information that designates it as the car from the scheme/base language (i.e., the built-in car). Similarly, a lambda identifier’s lexical information may indicate that it represents a procedure form. Some other identifier’s lexical information may indicate that it references a top-level variable.

When a syntax object represents a more complex expression than an identifier or simple constant, its internal components can be extracted. Even for extracted identifiers, detailed information about binding is available mostly indirectly; two identifiers can be compared to see if they refer to the same binding (i.e., free-identifier=?), or whether each identifier would bind the other if one were in a binding position and the other in an expression position (i.e., bound-identifier=?).

For example, when the program written as

  (let ([x 5]) (+ x 6))

is represented as a syntax object, then two syntax objects can be extracted for the two xs. Both the free-identifier=? and bound-identifier=? predicates will indicate that the xs are the same. In contrast, the let identifier is not free-identifier=? or bound-identifier=? to either x.

The lexical information in a syntax object is independent of the other half, and it can be copied to a new syntax object in combination with an arbitrary other Scheme value. Thus, identifier-binding information in a syntax object is predicated on the symbolic name of the identifier as well as the identifier’s lexical information; the same question with the same lexical information but different base value can produce a different answer.

For example, combining the lexical information from let in the program above with 'x would not produce an identifier that is free-identifier=? to either x, since it does not appear in the scope of the x binding. Combining the lexical context of the 6 with 'x, in contrast, would produce an identifier that is bound-identifier=? to both xs.
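The comparisons described above can be tried on a quoted, not-yet-expanded syntax object. This is only a sketch: the names stx, parts, let-id, binding-x, body-x, six, and x-from-six are hypothetical. Because the syntax object here is unexpanded, every sub-form still carries the same lexical information, so the sketch shows only the comparisons that do not depend on expansion having recorded the let binding.

```scheme
(define stx #'(let ([x 5]) (+ x 6)))

(define parts (syntax->list stx))   ; three syntax objects: let, ([x 5]), (+ x 6)
(define let-id (car parts))
(define binding-x                   ; the x in the binding clause [x 5]
  (car (syntax->list (car (syntax->list (cadr parts))))))
(define body-x (cadr (syntax->list (caddr parts))))   ; the x in (+ x 6)
(define six (caddr (syntax->list (caddr parts))))     ; the 6 in (+ x 6)

(free-identifier=? binding-x body-x)    ; => #t
(bound-identifier=? binding-x body-x)   ; => #t
(free-identifier=? let-id body-x)       ; => #f
(bound-identifier=? let-id body-x)      ; => #f

;; Copy the lexical information of the 6 onto the symbol x.
(define x-from-six (datum->syntax six 'x))
(bound-identifier=? x-from-six binding-x)   ; => #t
```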

The quote-syntax form bridges the evaluation of a program and the representation of a program. Specifically, (quote-syntax datum) produces a syntax object that preserves all of the lexical information that datum had when it was parsed as part of the quote-syntax form.
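A brief sketch of quote-syntax in use follows; the names quoted and head are hypothetical. It shows that the result is a syntax object whose identifiers keep their bindings, while syntax->datum strips the lexical information again.

```scheme
(define quoted (quote-syntax (+ 1 2)))

(syntax? quoted)                  ; => #t
(syntax->datum quoted)            ; => (+ 1 2), a plain list

(define head (car (syntax->list quoted)))
(identifier? head)                ; => #t
;; The quoted + still refers to the same binding as a + written here.
(free-identifier=? head (quote-syntax +))   ; => #t
```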

1.2.3 Expansion (Parsing)

Expansion recursively processes a syntax object in a particular phase level, starting with phase level 0. Bindings from the syntax object’s lexical information drive the expansion process, and cause new bindings to be introduced for the lexical information of sub-expressions. In some cases, a sub-expression is expanded in a deeper phase than the enclosing expression.

1.2.3.1 Fully Expanded Programs

A complete expansion produces a syntax object matching the following grammar:

Beware that the symbolic names of identifiers in a fully expanded program may not match the symbolic names in the grammar. Only the binding (according to free-identifier=?) matters.

  top-level-form         = general-top-level-form
                         | (#%expression expr)
                         | (module id name-id
                             (#%plain-module-begin
                              module-level-form ...))
                         | (begin top-level-form ...)

  module-level-form      = general-top-level-form
                         | (#%provide raw-provide-spec ...)

  general-top-level-form = expr
                         | (define-values (id ...) expr)
                         | (define-syntaxes (id ...) expr)
                         | (define-values-for-syntax (id ...) expr)
                         | (#%require raw-require-spec ...)

  expr                   = id
                         | (#%plain-lambda formals expr ...+)
                         | (case-lambda (formals expr ...+) ...)
                         | (if expr expr expr)
                         | (begin expr ...+)
                         | (begin0 expr expr ...)
                         | (let-values (((id ...) expr) ...)
                             expr ...+)
                         | (letrec-values (((id ...) expr) ...)
                             expr ...+)
                         | (set! id expr)
                         | (quote datum)
                         | (quote-syntax datum)
                         | (with-continuation-mark expr expr expr)
                         | (#%plain-app expr ...+)
                         | (#%top . id)
                         | (#%variable-reference id)
                         | (#%variable-reference (#%top . id))

  formals                = (id ...)
                         | (id ...+ . id)
                         | id

A fully-expanded syntax object corresponds to a parse of a program (i.e., a parsed program), and lexical information on its identifiers indicates the parse.

More specifically, the typesetting of identifiers in the above grammar is significant. For example, the second case for expr is a syntax-object list whose first element is an identifier, where the identifier’s lexical information specifies a binding to the #%plain-lambda of the scheme/base language (i.e., the identifier is free-identifier=? to one whose binding is #%plain-lambda). In all cases, identifiers above typeset as syntactic-form names refer to the bindings defined in Syntactic Forms.

Only phase levels 0 and 1 are relevant for the parse of a program (though the datum in a quote-syntax form preserves its information for all phase levels). In particular, the relevant phase level is 0, except for the exprs in a define-syntax, define-syntaxes, define-for-syntax, or define-values-for-syntax form, in which case the relevant phase level is 1 (for which comparisons are made using free-transformer-identifier=? instead of free-identifier=?).
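To see a fully expanded program, one can call expand and strip the result with syntax->datum. The sketch below assumes a fresh namespace from make-base-namespace; the names ns and expanded are hypothetical.

```scheme
(define ns (make-base-namespace))

(define expanded
  (parameterize ([current-namespace ns])
    (expand '(let ([x 5]) (+ x 6)))))

(syntax->datum expanded)
; => (let-values (((x) (quote 5))) (#%app + x (quote 6)))
```

The let form has become let-values and the constants are wrapped in quote, as in the grammar. The application prints with the symbolic name #%app even though the grammar writes #%plain-app; only the binding of the identifier matters, not its symbolic name.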

1.2.3.2 Expansion Steps

In a recursive expansion, each single step in expanding a syntax object at a particular phase level depends on the immediate shape of the syntax object being expanded:

Thus, the possibilities that do not fail lead to an identifier with a particular binding. This binding refers to one of three things:

1.2.3.3 Expansion Context

Each expansion step occurs in a particular context, and transformers and core syntactic forms may expand differently for different contexts. For example, a module form is allowed only in a top-level context, and it fails in other contexts. The possible contexts are as follows:

Different core syntactic forms parse sub-forms using different contexts. For example, a let form always parses the right-hand expressions of a binding in an expression context, but it starts parsing the body in an internal-definition context.

1.2.3.4 Introducing Bindings

Bindings are introduced during expansion when certain core syntactic forms are encountered:

A new binding in lexical information maps to a new variable. The identifiers mapped to this variable are those that currently have the same binding (i.e., that are currently bound-identifier=?) to the identifier associated with the binding.

For example, in

  (let-values ([(x) 10]) (+ x y))

the binding introduced for x applies to the x in the body, but not the y in the body, because (at the point in expansion where the let-values form is encountered) the binding x and the body y are not bound-identifier=?.

1.2.3.5 Transformer Bindings

In a top-level context or module context, when the expander encounters a define-syntaxes form, the binding that it introduces for the defined identifiers is a transformer binding. The value of the binding exists at expansion time, rather than run time (though the two times can overlap), though the binding itself is introduced with phase level 0 (i.e., in the base environment).

The value for the binding is obtained by evaluating the expression in the define-syntaxes form. This expression must be expanded (i.e., parsed) before it can be evaluated, and it is expanded at phase level 1 (i.e., in the transformer environment) instead of phase level 0.

If the resulting value is a procedure of one argument or the result of make-set!-transformer on a procedure, then it is used as a syntax transformer (a.k.a. macro). The procedure is expected to accept a syntax object and return a syntax object. A use of the binding (at phase level 0) triggers a call of the syntax transformer by the expander; see Expansion Steps.

Before the expander passes a syntax object to a transformer, the syntax object is extended with a syntax mark (that applies to all sub-syntax objects). The result of the transformer is similarly extended with the same syntax mark. When a syntax object’s lexical information includes the same mark twice in a row, the marks effectively cancel. Otherwise, two identifiers are bound-identifier=? (that is, one can bind the other) only if they have the same binding and if they have the same marks, counting only marks that were added after the binding.
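As a minimal sketch of a transformer binding whose value is a one-argument procedure, the following define-syntaxes form binds a hypothetical macro swap-args that reverses the two arguments of a call. The for-syntax require is an assumption that makes lambda, let, and the list operations available at phase level 1.

```scheme
(require (for-syntax scheme/base))

(define-syntaxes (swap-args)
  (lambda (stx)                          ; receives the whole use, e.g. (swap-args - 1 10)
    (let ([parts (syntax->list stx)])
      ;; Rebuild the form as (f b a), reusing the lexical context of the use.
      (datum->syntax stx
                     (list (cadr parts)      ; f
                           (cadddr parts)    ; b
                           (caddr parts)))))) ; a

(swap-args - 1 10)   ; expands to (- 10 1) => 9
```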

This marking process helps keep binding in an expanded program consistent with the lexical structure of the source program. For example, the expanded form of the program

  (define x 12)
  (define-syntax m
    (syntax-rules ()
      [(_ id) (let ([x 10]) id)]))
  (m x)

is

  (define x 12)
  (define-syntax m
    (syntax-rules ()
      [(_ id) (let ([x 10]) id)]))
  (let-values ([(x) 10]) x)

However, the result of the last expression is 12, not 10. The reason is that the transformer bound to m introduces the binding x, but the referencing x is present in the argument to the transformer. The introduced x is the one left with a mark, and the reference x has no mark, so the binding x is not bound-identifier=? to the body x.
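For contrast, the sketch below places the hygienic m next to a hypothetical variant, m/capture, that deliberately builds its x from the lexical information of the argument, so that the introduced binding and the reference are bound-identifier=?. The name m/capture and the for-syntax require are assumptions of this example.

```scheme
(require (for-syntax scheme/base))

(define x 12)

(define-syntax m
  (syntax-rules ()
    [(_ id) (let ([x 10]) id)]))   ; introduced x carries the macro's mark

(define-syntax (m/capture stx)
  (syntax-case stx ()
    [(_ id)
     ;; Create an x with the same lexical information as the argument.
     (with-syntax ([x (datum->syntax #'id 'x)])
       #'(let ([x 10]) id))]))

(m x)           ; => 12, the argument x refers to the top-level binding
(m/capture x)   ; => 10, the introduced x captures the argument
```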

The set! form and the make-set!-transformer procedure work together to support assignment transformers that transform set! expressions. Assignment transformers are applied by set! in the same way as a normal transformer by the expander.
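A small sketch of an assignment transformer follows. The names storage and cell are hypothetical: cell reads like a variable, but assignments to it are rewritten to store twice the assigned value in storage.

```scheme
(require (for-syntax scheme/base))

(define storage 0)

(define-syntax cell
  (make-set!-transformer
   (lambda (stx)
     (syntax-case stx (set!)
       ;; (set! cell v) is handed to the transformer as a whole form.
       [(set! target v) #'(set! storage (* 2 v))]
       ;; A plain reference to cell reads the underlying variable.
       [id (identifier? #'id) #'storage]))))

(set! cell 21)
cell      ; => 42
storage   ; => 42
```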

The make-rename-transformer procedure creates a value that is also handled specially by the expander and by set! as a transformer binding’s value. When id is bound to a rename transformer produced by make-rename-transformer, it is replaced with the identifier passed to make-rename-transformer. Furthermore, the binding is also specially handled by syntax-local-value as used by syntax transformers.
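The following sketch shows a rename transformer acting as an alias for both references and assignments; the names original and alias are hypothetical.

```scheme
(require (for-syntax scheme/base))

(define original 5)

;; alias is replaced by original wherever it is used.
(define-syntax alias (make-rename-transformer #'original))

alias            ; => 5
(set! alias 6)   ; assigns to original
original         ; => 6
```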

In addition to using marks to track introduced identifiers, the expander tracks the expansion history of a form through syntax properties such as 'origin. See Syntax Object Properties for more information.

Finally, the expander uses syntax certificates to control the way that unexported and protected module bindings are used. See Syntax Certificates for more information on syntax certificates.

The expander’s handling of letrec-syntaxes+values is similar to its handling of define-syntaxes. A letrec-syntaxes+values form can be expanded in an arbitrary phase level n (not just 0), in which case the expression for the transformer binding is expanded at phase level n+1.

The expression in a define-for-syntax or define-values-for-syntax form is expanded and evaluated in the same way as for define-syntaxes. However, the introduced binding is a variable binding at phase level 1 (not a transformer binding at phase level 0).
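A minimal sketch of a local transformer binding, using the letrec-syntaxes+values form from scheme/base, is shown here. The names double and n are hypothetical, and the for-syntax require is an assumption that makes syntax-rules available at phase level 1.

```scheme
(require (for-syntax scheme/base))

(define result
  (letrec-syntaxes+values
      ([(double) (syntax-rules () [(_ e) (* 2 e)])])  ; transformer binding
      ([(n) 21])                                      ; variable binding
    (double n)))

result   ; => 42
```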
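The difference between a phase level 1 variable and a phase level 0 transformer binding can be sketched as follows. The names limit and expansion-time-limit are hypothetical; the for-syntax require is an assumption that supplies bindings for the phase level 1 code.

```scheme
(require (for-syntax scheme/base))

(define-for-syntax limit 3)           ; variable binding at phase level 1

(define-syntax (expansion-time-limit stx)
  ;; This body runs during expansion, so it can refer to limit.
  (datum->syntax stx limit))

(expansion-time-limit)                ; expands to the constant 3 => 3
```

A direct reference to limit from a run-time expression would not see this binding, because the variable exists only at phase level 1.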

1.2.3.6 Partial Expansion

In certain contexts, such as an internal-definition context or module context, forms are partially expanded to determine whether they represent definitions, expressions, or other declaration forms. Partial expansion works by cutting off the normal recursive expansion when the relevant binding is for a primitive syntactic form.

As a special case, when expansion would otherwise add an #%app, #%datum, or #%top identifier to an expression, and when the binding turns out to be the primitive #%app, #%datum, or #%top form, then expansion stops without adding the identifier.

1.2.3.7 Internal Definitions

An internal-definition context corresponds to a partial expansion step (see Partial Expansion). A form that supports internal definitions starts by expanding its first form in an internal-definition context, but only partially. That is, it recursively expands only until the form becomes one of the following:

If the last expression form turns out to be a define-values or define-syntaxes form, expansion fails with a syntax error.
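The sketch below shows a body with internal definitions, followed by a check that a body ending in a definition is rejected at expansion time. The names sum-of-squares, sq, total, and try-expand are hypothetical; try-expand expands a quoted form in a fresh base namespace and reports whether a syntax error occurred.

```scheme
(define (sum-of-squares a b)
  (define (sq n) (* n n))            ; internal definition
  (define total (+ (sq a) (sq b)))   ; internal definition
  total)                             ; the body ends with an expression

(sum-of-squares 3 4)   ; => 25

(define (try-expand form)
  (with-handlers ([exn:fail:syntax? (lambda (e) 'syntax-error)])
    (parameterize ([current-namespace (make-base-namespace)])
      (expand form)
      'ok)))

(try-expand '(lambda () (define x 1) x))   ; => ok
(try-expand '(lambda () (define x 1)))     ; => syntax-error
```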

1.2.3.8 Module Phases

A require form not only introduces bindings at expansion time, but also visits the referenced module when it is encountered by the expander. That is, the expander instantiates any define-for-syntaxed variables defined in the module, and also evaluates all expressions for define-syntaxes transformer bindings.

Module visits propagate through requires in the same way as module instantiation. Moreover, when a module is visited, any module that it require-for-syntaxes is instantiated at phase 1, with the adjustment that a require-for-template leading back to phase 0 causes the required module to be merely visited at phase 0, not instantiated.

When the expander encounters require-for-syntax, it immediately instantiates the required module at phase 1, in addition to adding bindings at phase level 1 (i.e., the transformer environment).

When the expander encounters require and require-for-syntax within a module context, the resulting visits and instantiations are specific to the expansion of the enclosing module, and are kept separate from visits and instantiations triggered from a top-level context or from the expansion of a different module.

1.2.4 Compilation

Before expanded code is evaluated, it is first compiled. A compiled form has essentially the same information as the corresponding expanded form, though the internal representation naturally dispenses with identifiers for syntactic forms and local bindings. One significant difference is that a compiled form is almost entirely opaque, so the information that it contains cannot be accessed directly (which is why some identifiers can be dropped). At the same time, a compiled form can be marshaled to and from a byte string, so it is suitable for saving and re-loading code.

Although individual read, expand, compile, and evaluate operations are available, the operations are often combined automatically. For example, the eval procedure takes a syntax object and expands it, compiles it, and evaluates it.
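The separate steps can be sketched as follows, assuming a fresh namespace from make-base-namespace. The names ns, compiled, saved, and reloaded are hypothetical. The compiled form is written to a byte string and read back with read-accept-compiled enabled before being evaluated.

```scheme
(define ns (make-base-namespace))

;; Compile without evaluating.
(define compiled
  (parameterize ([current-namespace ns])
    (compile '(+ 1 2))))

(compiled-expression? compiled)   ; => #t

;; Marshal the compiled form to a byte string.
(define saved
  (let ([out (open-output-bytes)])
    (write compiled out)
    (get-output-bytes out)))

;; Read it back; reading compiled code must be enabled explicitly.
(define reloaded
  (parameterize ([read-accept-compiled #t])
    (read (open-input-bytes saved))))

(parameterize ([current-namespace ns])
  (eval reloaded))                ; => 3
```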

1.2.5 Namespaces

A namespace is a top-level mapping from symbols to binding information. It is the starting point for expanding an expression; a syntax object produced by read-syntax has no initial lexical context; the syntax object can be expanded after initializing it with the mappings of a particular namespace. A namespace is also the starting point for evaluating expanded code, where the first step in evaluation is linking the code to specific module instances and top-level variables.

For expansion purposes, a namespace maps each symbol in each phase level to one of three possible bindings:

An “empty” namespace maps all symbols to top-level variables. Certain evaluations extend a namespace for future expansions; importing a module into the top-level adjusts the namespace bindings for all of the imported names, and evaluating a top-level define form updates the namespace’s mapping to refer to a variable (in addition to installing a value into the variable).

A namespace also has a module registry that maps module names to module declarations (see Modules and Module-Level Variables). This registry is shared by all phase levels.

For evaluation, each namespace encapsulates a distinct set of top-level variables, as well as a potentially distinct set of module instances in each phase. That is, even though module declarations are shared for all phase levels, module instances are distinct for each phase.

After a namespace is created, module instances from existing namespaces can be attached to the new namespace. In terms of the evaluation model, top-level variables from different namespaces essentially correspond to definitions with different prefixes. Furthermore, the first step in evaluating any compiled expression is to link its top-level variable and module-level variable references to specific variables in the namespace.

At all times during evaluation, some namespace is designated as the current namespace. The current namespace has no particular relationship, however, with the namespace that was used to expand the code that is executing, or with the namespace that was used to link the compiled form of the currently evaluating code. In particular, changing the current namespace during evaluation does not change the variables to which executing expressions refer. The current namespace only determines the behavior of (essentially reflective) operations to expand code and to start evaluating expanded/compiled code.

Examples:

  > (define x 'orig) ; define in the original namespace
  ; The following let expression is compiled in the original
  ; namespace, so direct references to x see 'orig.
  > (let ([n (make-base-namespace)]) ; make new namespace
      (parameterize ([current-namespace n])
        (eval '(define x 'new)) ; evals in the new namespace
        (display x) ; displays 'orig
        (display (eval 'x)))) ; displays 'new
  orignew

A namespace is purely a top-level entity, not to be confused with an environment. In particular, a namespace does not encapsulate the full environment of an expression inside local-binding forms.

If an identifier is bound to syntax or to an import, then defining the identifier as a variable shadows the syntax or import in future uses of the environment. Similarly, if an identifier is bound to a top-level variable, then binding the identifier to syntax or an import shadows the variable; the variable’s value remains unchanged, however, and may be accessible through previously evaluated expressions.

Examples:

  > (define x 5)
  > (define (f) x)
  > x
  5
  > (f)
  5
  > (define-syntax x (syntax-id-rules () [_ 10]))
  > x
  10
  > (f)
  5
  > (define x 7)
  > x
  7
  > (f)
  7
  > (module m mzscheme (define x 8) (provide x))
  > (require 'm)
  > x
  8
  > (f)
  7

1.2.6 Inferred Value Names

To improve error reporting, names are inferred at compile-time for certain kinds of values, such as procedures. For example, evaluating the following expression:

  (let ([f (lambda () 0)]) (f 1 2 3))

produces an error message because too many arguments are provided to the procedure. The error message is able to report f as the name of the procedure. In this case, Scheme decides, at compile-time, to name as 'f all procedures created by the let-bound lambda.

Names are inferred whenever possible for procedures. Names closer to an expression take precedence. For example, in

  (define my-f

    (let ([f (lambda () 0)]) f))

the procedure bound to my-f will have the inferred name 'f.
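Inferred names can be inspected at run time with object-name, as in this brief sketch. The name direct is hypothetical and added only for comparison.

```scheme
(define my-f
  (let ([f (lambda () 0)]) f))

(object-name my-f)     ; => f, the closer let binding wins over my-f

(define (direct) 0)

(object-name direct)   ; => direct
```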

When an 'inferred-name property is attached to a syntax object for an expression (see Syntax Object Properties), the property value is used for naming the expression, and it overrides any name that was inferred from the expression’s context.

When an inferred name is not available, but a source location is available, a name is constructed using the source location information. Inferred and property-assigned names are also available to syntax transformers, via syntax-local-name.