3 @-Reader
The Scribble @-reader is designed to be a convenient facility for using free-form text in Scheme code, where “@” is chosen as one of the least-used characters in Scheme code.
You can use the reader via MzScheme’s #reader form:
#reader(lib "reader.ss" "scribble")@{This is free-form text!}
Note that the reader will only read @-forms as S-expressions. The meaning of these S-expressions depends on the rest of your own code.
A PLT Scheme manual more likely starts with
#lang scribble/doc
which installs a reader, wraps the file content afterward into a MzScheme module, and parses the body into a document using scribble/decode. See Document Reader for more information.
Another way to use the reader is to use the use-at-readtable function to switch the current readtable to a readtable that parses @-forms. You can do this in a single command line:
mzscheme -ile scribble/reader "(use-at-readtable)"
3.1 Concrete Syntax
Informally, the concrete syntax of @-forms is
@ 〈cmd〉 [ 〈datum〉* ] { 〈text-body〉* }
where all three parts after @ are optional, but at least one should be present. (Note that spaces are not allowed between the three parts.) @ is set as a non-terminating reader macro, so it can be used as usual in Scheme identifiers unless you want to use it as a first character of an identifier; in this case you need to quote with a backslash (\@foo) or quote the whole identifier with bars (|@foo|).
(define |@foo| '\@bar@baz)
Of course, @ is not treated specially in Scheme strings, character constants, etc.
Roughly, a form matching the above grammar is read as
(〈cmd〉 〈datum〉* 〈parsed-body〉*)
where 〈parsed-body〉 is the translation of each 〈text-body〉 in the input. Thus, the initial 〈cmd〉 determines the Scheme code that the input is translated into. The common case is when 〈cmd〉 is a Scheme identifier, which generates a plain Scheme form.
A 〈text-body〉 is made of text, newlines, and nested @-forms. Note that the syntax for @-forms is the same in a 〈text-body〉 context as in a Scheme context. A 〈text-body〉 that isn’t an @-form is converted to a string expression for its 〈parsed-body〉, and newlines are converted to "\n" expressions.
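The translation described above can be tried out by applying the reader to strings. The following is a minimal sketch, not part of Scribble itself: `read-at` is a hypothetical helper, and the sketch assumes that the `read` exported by scribble/reader is imported under an `at:` prefix so that it does not clash with the standard `read`.

```scheme
#lang scheme/base
;; Sketch only: `read-at` is a hypothetical helper, not part of Scribble.
(require (prefix-in at: scribble/reader))

;; Reads one @-form from a string and returns the resulting S-expression.
(define (read-at str)
  (let ([in (open-input-string str)])
    (port-count-lines! in) ; column tracking, used for indentation handling
    (at:read in)))

(read-at "@foo{blah blah blah}") ; (foo "blah blah blah")
(read-at "@foo[1 2]{3 4}")       ; (foo 1 2 "3 4")
(read-at "@foo[1 2 3 4]")        ; (foo 1 2 3 4)
;; A body that spans two lines: the newline becomes its own "\n" string.
(read-at "@foo{blah blah\n     yada yada}") ; (foo "blah blah" "\n" "yada yada")
```

Since the reader only produces S-expressions, nothing here requires `foo` to be bound.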
Note that spaces are not allowed before a [ or a {, or they will be part of the following text (or Scheme code). (More on using braces in body texts below.)
| @foo{bar @baz[2 3] {4 5}} | reads as | (foo "bar " (baz 2 3) " {4 5}") |
When the above @-forms appear in a Scheme expression context, the lexical environment must provide bindings for foo (as a procedure or a macro).
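As an illustration of the binding requirement, here is a small sketch in which every command used in the text is bound to an ordinary procedure. The names `formatter`, `bf`, `it`, `ul`, and `text` are made up for this example, and the @-form is read with the #reader form shown earlier.

```scheme
#lang scheme/base
;; Sketch only: all of these names are hypothetical helpers for the example.
(define (formatter fmt)
  (lambda args (format fmt (apply string-append args))))
(define bf (formatter "*~a*"))
(define it (formatter "/~a/"))
(define ul (formatter "_~a_"))
(define text string-append)

;; The @-form below reads as
;;   (text (it "Note") ": " (bf "This is " (ul "not") " a pipe") ".")
(define note
  #reader(lib "reader.ss" "scribble")
  @text{@it{Note}: @bf{This is @ul{not} a pipe}.})

note ; "/Note/: *This is _not_ a pipe*."
```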
If you want to see the expression that is actually being read, you can use Scheme’s quote.
| '@foo{bar} | reads as | '(foo "bar") |
3.1.1 The Command Part
Besides being a Scheme identifier, the 〈cmd〉 part of an @-form can have Scheme punctuation prefixes, which will end up wrapping the whole expression.
| @`',@foo{blah} | reads as | `',@(foo "blah") |
When writing Scheme code, this means that @`',@foo{blah} is exactly the same as `@',@foo{blah} and `',@@foo{blah}, but unlike the latter two, the first construct can appear in body texts with the same meaning, whereas the other two would not work (see below).
After the optional punctuation prefix, the 〈cmd〉 itself is not limited to identifiers; it can be any Scheme expression.
| @(lambda (x) x){blah} | reads as | ((lambda (x) x) "blah") |
| @`(unquote foo){blah} | reads as | `(,foo "blah") |
In addition, the command can be omitted altogether, which will omit it from the translation, resulting in an S-expression that usually contains, say, just strings.
If the command part begins with a ; (with no newline between the @ and the ;), then the construct is a comment. There are two comment forms, one for arbitrary-text and possibly nested comments, and another one for line comments:
@;{ 〈any〉* }
@; 〈anything-else-without-newline〉*
In the first form, the commented body must still parse correctly; see the description of the body syntax below. In the second form, all text from the @; to the end of the line and all following spaces (or tabs) are part of the comment (similar to % comments in TeX).
@foo{bar @; comment
     baz@;
     blah}
reads as
(foo "bar bazblah")
Tip: if you’re editing in a Scheme-aware editor (like DrScheme or Emacs), it is useful to comment out blocks like this:
@;{
...
;}
so the editor does not treat the file as having unbalanced parentheses.
If only the 〈cmd〉 part of an @-form is specified, then the result is the command part only, without an extra set of parentheses. This makes it suitable for Scheme escapes in body texts. (More on this below, in the description of the body part.)
| @foo{x @y z} | reads as | (foo "x " y " z") |
| @foo{x @(* y 2) z} | reads as | (foo "x " (* y 2) " z") |
| @{@foo bar} | reads as | (foo " bar") |
Finally, note that there are currently no special rules for using @ in the command itself, which can lead to things like:
| @@foo{bar}{baz} | reads as | ((foo "bar") "baz") |
3.1.2 The Datum Part
The datum part can contain arbitrary Scheme expressions, which are simply stacked before the body text arguments:
| @foo[1 (* 2 3)]{bar} | reads as | (foo 1 (* 2 3) "bar") |
| @foo[@bar{...}]{blah} | reads as | (foo (bar "...") "blah") |
The body part can still be omitted, which is essentially an alternative syntax for plain (non-textual) S-expressions:
| @foo[bar] | reads as | (foo bar) |
| @foo{bar @f[x] baz} | reads as | (foo "bar " (f x) " baz") |
The datum part can be empty, which makes no difference, except when the body is omitted. It is more common, however, to use an empty body for the same purpose.
| @foo[]{bar} | reads as | (foo "bar") |
| @foo[] | reads as | (foo) |
| @foo | reads as | foo |
| @foo{} | reads as | (foo) |
The most common use of the datum part is for Scheme forms that expect keyword-value arguments that precede the body of text arguments.
| @foo[#:style 'big]{bar} | reads as | (foo #:style 'big "bar") |
3.1.3 The Body Part
The syntax of the body part is intended to be as convenient as possible for free text. It can contain almost any text – the only characters with special meaning are @ for sub-@-forms, and } for the end of the text. In addition, a { is allowed as part of the text, and it makes the matching } be part of the text too – so balanced braces are valid text.
| @foo{f{o}o} | reads as | (foo "f{o}o") |
| @foo{{{}}{}} | reads as | (foo "{{}}{}") |
As described above, the text turns to a sequence of string arguments for the resulting form. Spaces at the beginning and end of lines are discarded, and newlines turn to individual "\n" strings (i.e., they are not merged with other body parts); see also the information about newlines and indentation below. Spaces are not discarded if they appear after the open { (before the closing }) when there is also text that follows (precedes) it; specifically, they are preserved in a single-line body.
| @foo{bar} | reads as | (foo "bar") |
| @foo{ bar } | reads as | (foo " bar ") |
| @foo[1]{ bar } | reads as | (foo 1 " bar ") |
If @ appears in a body, then it is interpreted as Scheme code, which means that the @-reader is applied recursively, and the resulting syntax appears as part of the S-expression, among other string contents.
| @foo{a @bar{b} c} | reads as | (foo "a " (bar "b") " c") |
If the nested @ construct has only a command – no body or datum parts – it will not appear in a subform. Given that the command part can be any Scheme expression, this makes @ a general escape to arbitrary Scheme code.
| @foo{a @bar c} | reads as | (foo "a " bar " c") |
| @foo{a @(bar 2) c} | reads as | (foo "a " (bar 2) " c") |
This is particularly useful with strings, which can be used to include arbitrary text.
| @foo{A @"}" marks the end} | reads as | (foo "A } marks the end") |
Note that the escaped string is (intentionally) merged with the rest of the text. This works for @ too:
| @foo{The prefix: @"@".} | reads as | (foo "The prefix: @.") |
| @foo{@"@x{y}" --> (x "y")} | reads as | (foo "@x{y} --> (x \"y\")") |
Alternative Body Syntax
In addition to the above, there is an alternative syntax for the body, one that specifies a new marker for its end: use |{ for the opening marker to have the text terminated by a }|.
| @foo|{...}| | reads as | (foo "...") |
| @foo|{"}" follows "{"}| | reads as | (foo "\"}\" follows \"{\"") |
| @foo|{Nesting |{is}| ok}| | reads as | (foo "Nesting |{is}| ok") |
This applies to sub-@-forms too – the @ must be prefixed with a |:
| @t|{In |@i|{sub|@"@"s}| too}| | reads as | (t "In " (i "sub@s") " too") |
Note that the subform uses its own delimiters, {...} or |{...}|. This means that you can copy and paste Scribble text with @-forms freely, just prefix the @ if the immediate surrounding text has a prefix.
For even better control, you can add characters in the opening delimiter, between the | and the {. Characters that are put there (non-alphanumeric ASCII characters only, excluding { and @) should also be used for sub-@-forms, and the end-of-body marker should have these characters in reverse order, with paren-like characters ((, [, <) mirrored.
| @foo|<<<{@x{foo} |@{bar}|.}>>>| | reads as | (foo "@x{foo} |@{bar}|.") |
| @foo|!!{X |!!@b{Y}...}!!| | reads as | (foo "X " (b "Y") "...") |
Finally, remember that you can use an expression escape with a Scheme string for confusing situations. This works well when you only need to quote short pieces, and the above works well when you have larger multi-line body texts.
Scheme Expression Escapes
In some cases, you may want to use a Scheme identifier (or a number or a boolean etc.) in a position that touches the following text; in these situations you should surround the escaped Scheme expression by a pair of | characters. The text inside the bars is parsed as a Scheme expression.
| @foo{foo@bar.} | reads as | (foo "foo" bar.) |
| @foo{foo@|bar|.} | reads as | (foo "foo" bar ".") |
| @foo{foo@3.} | reads as | (foo "foo" 3.0) |
| @foo{foo@|3|.} | reads as | (foo "foo" 3 ".") |
This form is a generic Scheme expression escape; there is no body text or datum part when you use this form.
| @foo{foo@|(f 1)|{bar}} | reads as | (foo "foo" (f 1) "{bar}") |
| @foo{foo@|bar|[1]{baz}} | reads as | (foo "foo" bar "[1]{baz}") |
This works for string expressions too, but note that unlike the above, the string is (intentionally) not merged with the rest of the text:
| @foo{x@"y"z} | reads as | (foo "xyz") |
| @foo{x@|"y"|z} | reads as | (foo "x" "y" "z") |
Expression escapes also work with any number of expressions:
| @foo{x@|1 (+ 2 3) 4|y} | reads as | (foo "x" 1 (+ 2 3) 4 "y") |
It seems that @|| has no purpose – but remember that these escapes are never merged with the surrounding text, which can be useful when you want to control the sub expressions in the form.
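A short sketch of this use of an empty escape, contrasting it with plain text. As before, `read-at` is a hypothetical helper built on the `read` function from scribble/reader (imported here under an assumed `at:` prefix).

```scheme
#lang scheme/base
;; Sketch only: `read-at` is a hypothetical helper, not part of Scribble.
(require (prefix-in at: scribble/reader))

(define (read-at str)
  (at:read (open-input-string str)))

;; Without an escape, adjacent text forms a single string.
(read-at "@foo{AliceBob}")    ; (foo "AliceBob")
;; The empty escape contributes no expression, but it splits the text.
(read-at "@foo{Alice@||Bob}") ; (foo "Alice" "Bob")
```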
Note that @|{...}| can be parsed as either an escape expression or as the Scheme command part of a @-form. The latter is used in this case (since there is little point in Scheme code that uses braces).
| @|{blah}| | reads as | ("blah") |
Comments
As noted above, there are two kinds of Scribble comments: @;{...} is a (nestable) comment for a whole body of text (following the same rules for @-forms), and @;... is a line-comment.
One useful property of line-comments is that they continue to the end of the line and all following spaces (or tabs). Using this, you can get further control of the subforms.
@foo{A long @;
     single-@;
     string arg.}
reads as
(foo "A long single-string arg.")
Note how this is different from using @||s in that strings around it are not merged.
Spaces, Newlines, and Indentation
The Scribble syntax treats spaces and newlines in a special way that is meant to be sensible for dealing with text. As mentioned above, spaces at the beginning and end of body lines are discarded, except for spaces between a { and text, or between text and a }.
| @foo{bar} | reads as | (foo "bar") |
| @foo{ bar } | reads as | (foo " bar ") |
A single newline that follows an open brace or precedes a closing brace is discarded, unless there are only newlines in the body; other newlines are read as a "\n" string.
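The newline rule can be checked with a few multi-line bodies written as strings. This is a sketch under the same assumptions as before: `read-at` is a hypothetical helper, and the reader's `read` is imported under an `at:` prefix.

```scheme
#lang scheme/base
;; Sketch only: `read-at` is a hypothetical helper, not part of Scribble.
(require (prefix-in at: scribble/reader))

(define (read-at str)
  (let ([in (open-input-string str)])
    (port-count-lines! in) ; column tracking, used for indentation handling
    (at:read in)))

;; The newlines next to the braces are dropped.
(read-at "@foo{\n  bar\n}")          ; (foo "bar")
;; Only one newline is dropped at each end; the others are kept.
(read-at "@foo{\n\n  bar\n\n}")      ; (foo "\n" "bar" "\n")
;; An empty line in the middle yields two consecutive "\n" strings.
(read-at "@foo{\n  bar\n\n  baz\n}") ; (foo "bar" "\n" "\n" "baz")
;; A body that contains only a newline keeps it.
(read-at "@foo{\n}")                 ; (foo "\n")
```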
In the parsed S-expression syntax, a single newline string is used for all newlines; you can use eq? to identify this string. This can be used to identify newlines in the original 〈text-body〉.
Spaces at the beginning of body lines do not appear in the resulting S-expressions, but the column of each line is noticed, and all-space indentation strings are added so the result has the same indentation. An indentation string is added to each line according to its distance from the leftmost syntax object (except for empty lines). (Note: if you try these examples on a mzscheme REPL, you should be aware that the reader does not know about the "> " prompt.)
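The indentation rule can be sketched by reading bodies whose lines start at different columns. `read-at` is again a hypothetical helper; it turns on line counting for the string port so that columns are tracked.

```scheme
#lang scheme/base
;; Sketch only: `read-at` is a hypothetical helper, not part of Scribble.
(require (prefix-in at: scribble/reader))

(define (read-at str)
  (let ([in (open-input-string str)])
    (port-count-lines! in) ; enable line and column accounting
    (at:read in)))

;; "begin" and "end" are leftmost; "x++;" is two columns further in,
;; so it is preceded by a separate two-space indentation string.
(read-at "@foo{\n  begin\n    x++;\n  end}")
;; (foo "begin" "\n" "  " "x++;" "\n" "end")

;; The leftmost item is "c", so "a" gets two spaces and "b" gets one.
(read-at "@foo{\n    a\n   b\n  c}")
;; (foo "  " "a" "\n" " " "b" "\n" "c")
```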
If the first string came from the opening { line, it is not prepended with an indentation (but it can affect the leftmost syntax object used for indentation). This makes sense when formatting structured code as well as text.
Note that each @-form is parsed to an S-expression that has its own indentation. This means that Scribble source can be indented like code, but if indentation matters then you may need to apply indentation of the outer item to all lines of the inner one. For example, in
@code{
  begin
    i = 1, r = 1
    @bold{while i < n do
            r *= i++
          done}
  end
}
a formatter will need to apply the 2-space indentation to the rendering of the bold body.
Note that to get a first-line text to be counted as a leftmost line, line and column accounting should be on for the input port (use-at-readtable turns them on for the current input port). Without this,
@foo{x1
       x2
       x3}
will not have 2-space indentations in the parsed S-expression if source accounting is not on, but
@foo{x1
       x2
     x3}
will (due to the last line). Pay attention to this, as it can be a problem with Scheme code, for example:
@code{(define (foo x)
        (+ x 1))}
For rare situations where spaces at the beginning (or end) of lines matter, you can begin (or end) a line with a @|| escape; as noted above, such escapes are never merged with the surrounding text.
3.2 Syntax Properties
The Scribble reader attaches properties to syntax objects. These properties might be useful in some rare situations.
Forms that Scribble reads are marked with a 'scribble property whose value is a list of three elements: the first is 'form, the second is the number of items that were read from the datum part, and the third is the number of items in the body part (strings, sub-forms, and escapes). In both cases, a 0 means an empty datum/body part, and #f means that the corresponding part was omitted. If the form has neither part, the property is not attached to the result. This property can be used to give different meanings to expressions from the datum and the body parts, for example, implicitly quoted keywords:
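(define-syntax (foo stx)
  (let ([p (syntax-property stx 'scribble)])
    (printf ">>> ~s\n" (syntax->datum stx))
    (syntax-case stx ()
      [(_ x ...)
       ;; only applies when the datum part holds key/value pairs
       (and (pair? p) (eq? (car p) 'form) (even? (cadr p)))
       (let loop ([n (/ (cadr p) 2)]
                  [as '()]
                  [xs (syntax->list #'(x ...))])
         (if (zero? n)
           ;; datum items consumed: the remaining xs are the body items
           (with-syntax ([attrs (reverse as)]
                         [(x ...) xs])
             #'(list 'foo `attrs x ...))
           (loop (sub1 n)
                 (cons (with-syntax ([key (car xs)]
                                     [val (cadr xs)])
                         #'(key ,val))
                       as)
                 (cddr xs))))])))
> @foo[x 1 y (* 2 3)]{blah}
>>> (foo x 1 y (* 2 3) "blah")
(foo ((x 1) (y 6)) "blah")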
In addition, the Scribble parser uses syntax properties to mark syntax items that are not physically in the original source – indentation spaces and newlines. Both of these will have a 'scribble property; an indentation string of spaces will have 'indentation as the value of the property, and a newline will have a '(newline S) value where S is the original newline string including spaces that precede and follow it (which includes the indentation for the following item). This can be used to implement a verbatim environment: drop indentation strings, and use the original source strings instead of the single-newline string. Here is an example of this.
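(define-syntax (verb stx)
  (syntax-case stx ()
    [(_ cmd item ...)
     #`(cmd
        #,@(let loop ([items (syntax->list #'(item ...))])
             (if (null? items)
               '()
               (let* ([fst  (car items)]
                      [prop (syntax-property fst 'scribble)]
                      [rst  (loop (cdr items))])
                 ;; drop indentation strings; restore original newline text
                 (cond [(eq? prop 'indentation) rst]
                       [(not (and (pair? prop)
                                  (eq? (car prop) 'newline)))
                        (cons fst rst)]
                       [else (cons (datum->syntax
                                    fst (cadr prop) fst)
                                   rst)])))))]))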
> @verb[string-append]{
    foo
      bar
  }
"foo\n      bar"
3.3 Interface
The scribble/reader module provides direct Scribble reader functionality for advanced needs.
(read [in]) → any
in : input-port? = (current-input-port)
(read-syntax [source-name in]) → (or/c syntax? eof-object?)
source-name : any/c = (object-name in)
in : input-port? = (current-input-port)
These procedures implement the Scribble reader. They do so by constructing a reader table based on the current one, and using that in reading.
(read-inside [in]) → any
in : input-port? = (current-input-port)
(read-syntax-inside [source-name in]) → (or/c syntax? eof-object?)
source-name : any/c = (object-name in)
in : input-port? = (current-input-port)
These -inside variants parse as if starting inside a @{...}, and they return a (syntactic) list. They are useful for implementing languages that are textual by default (see "docreader.ss" for example).
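A brief sketch of the difference from the plain reader: the input below is ordinary text with one embedded @-form, and no enclosing @{...} is written. The `at:` prefix and the name `inside` are choices made for this example only.

```scheme
#lang scheme/base
;; Sketch only: the `at:` prefix and `inside` are names chosen for this example.
(require (prefix-in at: scribble/reader))

;; The whole input is treated as a text body, so the result is a list of
;; strings and nested forms.
(define inside
  (at:read-inside (open-input-string "Some @bold{text} here.")))

inside ; ("Some " (bold "text") " here.")
```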
(make-at-readtable [#:readtable readtable
                    #:command-char command-char
                    #:datum-readtable datum-readtable
                    #:syntax-post-processor syntax-post-proc
                    #:start-inside? start-inside?]) → readtable?
readtable : readtable? = (current-readtable)
command-char : character? = #\@
start-inside? : any/c = #f
Constructs an @-readtable. The keyword arguments can customize the resulting reader in several ways:
readtable – a readtable to base the @-readtable on.
command-char – the character used for @-forms.
datum-readtable – determines the readtable used for reading the datum part. A #t value uses the @-readtable; otherwise it can be a readtable, or a readtable-to-readtable function that will construct one from the @-readtable. The idea is that you may want to have completely different uses for the datum part, for example, introducing a convenient key=val syntax for attributes.
syntax-post-proc – function that is applied on each resulting syntax value after it has been parsed (but before it is wrapped in quoting punctuation). You can use this to further control uses of @-forms, for example, making the command be the head of a list:
(use-at-readtable
  #:syntax-post-processor
  (lambda (stx)
    (syntax-case stx ()
      [(cmd rest ...) #'(list 'cmd rest ...)]
      [else (error "@ forms must have a body")])))
start-inside? – if true, creates a readtable for use starting in text mode, instead of S-expression mode.
(use-at-readtable ...) → void?
Passes all arguments to make-at-readtable, and installs the resulting readtable using current-readtable. It also enables line counting for the current input-port via port-count-lines!.
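When changing the current readtable globally is not desirable, a readtable built with make-at-readtable can be installed for a single read instead. The sketch below assumes a `$` command character purely for illustration; `dollar-readtable` and `read-dollar` are hypothetical names.

```scheme
#lang scheme/base
;; Sketch only: `dollar-readtable` and `read-dollar` are hypothetical names.
(require (only-in scribble/reader make-at-readtable))

;; A readtable in which $ plays the role that @ normally has.
(define dollar-readtable
  (make-at-readtable #:command-char #\$))

;; Reads one datum from a string with the custom readtable installed
;; only for the duration of the call.
(define (read-dollar str)
  (parameterize ([current-readtable dollar-readtable])
    (read (open-input-string str))))

(read-dollar "$foo{bar $baz[1]}") ; (foo "bar " (baz 1))
```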