1 Welcome to PLT Scheme
2 Scheme Essentials
3 Built-In Datatypes
4 Expressions and Definitions
5 Programmer-Defined Datatypes
6 Modules
7 Contracts
8 Input and Output
9 Regular Expressions
10 Exceptions and Control
11 Iterations and Comprehensions
12 Pattern Matching
13 Classes and Objects
14 Units (Components)
15 Reflection and Dynamic Evaluation
16 Macros
17 Performance
18 Running and Creating Executables
19 Compilation and Configuration
20 More Libraries
Bibliography
Index
Version: 4.0.2

 

3.5 Bytes and Byte Strings

A byte is an exact integer between 0 and 255, inclusive. The byte? predicate recognizes numbers that represent bytes.

Examples:

  > (byte? 0)

  #t

  > (byte? 256)

  #f

A byte string is similar to a string – see Strings (Unicode) – but its content is a sequence of bytes instead of characters. Byte strings can be used in applications that process pure ASCII instead of Unicode text. The printed form of a byte string supports such uses in particular, because a byte string prints like the ASCII decoding of the byte string, but prefixed with a #. Unprintable ASCII characters or non-ASCII bytes in the byte string are written with octal notation.

Reading Strings in Reference: PLT Scheme documents the fine points of the syntax of byte strings.

Examples:

  > #"Apple"

  #"Apple"

  > (bytes-ref #"Apple" 0)

  65

  > (make-bytes 3 65)

  #"AAA"

  > (define b (make-bytes 2 0))

  > b

  #"\0\0"

  > (bytes-set! b 0 1)

  > (bytes-set! b 1 255)

  > b

  #"\1\377"

The display form of a byte string writes its raw bytes to the current output port (see Input and Output). Technically, display of a normal (i.e,. character) string prints the UTF-8 encoding of the string to the current output port, since output is ultimately defined in terms of bytes; display of a byte string, however, writes the raw bytes with no encoding. Along the same lines, when this documentation shows output, it technically shows the UTF-8-decoded form of the output.

Examples:

  > (display #"Apple")

  Apple

  > (display "\316\273")  ; same as "λ"

  Î»

  > (display #"\316\273") ; UTF-8 encoding of λ

  λ

For explicitly converting between strings and byte strings, Scheme supports three kinds of encodings directly: UTF-8, Latin-1, and the current locale’s encoding. General facilities for byte-to-byte conversions (especially to and from UTF-8) fill the gap to support arbitrary string encodings.

Examples:

  > (bytes->string/utf-8 #"\316\273")

  "λ"

  > (bytes->string/latin-1 #"\316\273")

  "λ"

  > (parameterize ([current-locale "C"])  ; C locale supports ASCII,

      (bytes->string/locale #"\316\273")) ; only, so...

  bytes->string/locale: byte string is not a valid encoding

  for the current locale: #"\316\273"

  > (let ([cvt (bytes-open-converter "cp1253" ; Greek code page

                                     "UTF-8")]

          [dest (make-bytes 2)])

      (bytes-convert cvt #"\353" 0 1 dest)

      (bytes-close-converter cvt)

      (bytes->string/utf-8 dest))

  "λ"

Byte Strings in Reference: PLT Scheme provides more on byte strings and byte-string procedures.