16.2.1 Syntax Objects
The input and output of a macro transformer (i.e., source and replacement forms) are represented as syntax objects. A syntax object contains symbols, lists, and constant values (such as numbers) that essentially correspond to the quoted form of the expression. For example, a representation of the expression (+ 1 2) contains the symbol '+ and the numbers 1 and 2, all in a list. In addition to this quoted content, a syntax object associates source-location and lexical-binding information with each part of the form. The source-location information is used when reporting syntax errors (for example), and the lexical-biding information allows the macro system to maintain lexical scope. To accommodate this extra information, the represention of the expression (+ 1 2) is not merely '(+ 1 2), but a packaging of '(+ 1 2) into a syntax object.
To create a literal syntax object, use the syntax form:
#<syntax:1:0> |
In the same way that ' abbreviates quote, #' abbreviates syntax:
> #'(+ 1 2) |
#<syntax:1:0> |
A syntax object that contains just a symbol is an identifier syntax object. Scheme provides some additional operations specific to identifier syntax objects, including the identifier? operation to detect identifiers. Most notably, free-identifier=? determines whether two identifiers refer to the same binding:
> (identifier? #'car) | ||
#t | ||
> (identifier? #'(+ 1 2)) | ||
#f | ||
> (free-identifier=? #'car #'cdr) | ||
#f | ||
> (free-identifier=? #'car #'car) | ||
#t | ||
> (free-identifier=? #'car #'also-car) | ||
#t | ||
| ||
#f |
The last example above, in particular, illustrates how syntax objects preserve lexical-context information.
To see the lists, symbols, numbers, etc. within a syntax object, use syntax->datum:
> (syntax->datum #'(+ 1 2)) |
(+ 1 2) |
The syntax-e function is similar to syntax->datum, but it unwraps a single layer of source-location and lexical-context information, leaving sub-forms that have their own information wrapped as syntax objects:
(#<syntax:1:0> #<syntax:1:0> #<syntax:1:0>) |
The syntax-e function always leaves syntax-object wrappers around sub-forms that are represented via symbols, numbers, and other literal values. The only time it unwraps extra sub-forms is when unwrapping a pair, in which case the cdr of the pair may be recursively unwrapped, depending on how the syntax object was constructed.
The oppose of syntax->datum is, of course, datum->syntax. In addition to a datum like '(+ 1 2), datum->syntax needs an existing syntax object to donate its lexical context, and optionally another syntax object to donate its source location:
| |||
#<syntax:1:0> |
In the above example, the lexical context of #'lex is used for the new syntax object, while the source location of #'srcloc is used.
When the second (i.e., the “datum”) argument to datum->syntax includes syntax objects, those syntax objects are preserved intact in the result. That is, deconstructing the result with syntax-e eventually produces the syntax objects that were given to datum->syntax.