SAS

OVERVIEW OF SAS RULES

This section provides a systematic summary of some of the features you have seen in the examples up to now.

SAS Syntax

The primary features of SAS syntax are --

Free form - Each SAS statement ends with a semicolon. Statements are otherwise free form - they can start in any column, end in any column, and span more than one line. Also, more than one statement may appear on a line.
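
The free-form rule can be illustrated with a minimal sketch. The data set name demo and the variables x, y, and total are hypothetical and chosen only for illustration.

```sas
* Three statements on one line *;
data demo; x = 1; y = 2;

   * One statement spread over several lines *;
   total =
       x
     + y;
run;

* A proc step that starts in an arbitrary column *;
        proc print data=demo; run;
```

Only the semicolons mark where each statement ends, so the layout is a matter of readability rather than correctness.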

Comments -- Statements that begin with an asterisk (*) and end with a semicolon are comments and are not executed by SAS. Comments can also be enclosed in /* ... */. Examples:

    input year     1-4
          wrkstat  9
          hrs1     10-11
          hrs2     12-13
          occ80    27-29  /* occupation--1980 census */
          prestg80 30-31  /* Occupational prestige */
          age      92-93
          educ     97-98
          degree   105 
          sex      109
          race     110
          rincom98 157-158;
    * Keep only recent years *;
     if year >= 1998;

Not case sensitive -- Unlike UNIX commands, SAS commands work whether you type them in lower case, upper case, or mixed case.

SAS names -- SAS names are defined by you, the user. They refer to variables, data sets, and other concepts. They must (1) start with a letter or underscore, (2) contain only letters, digits, or underscores, and, with a few exceptions, (3) contain 32 or fewer characters. Examples of valid SAS names are --
```
    income
    occ
    depression
    OccupationStatusScore
    occupation_status_score
    faminc
    pulse
    pressure
    rate
    interest_rate
    int_rate
```
The two occupation variables illustrate two ways to visually separate words contained in SAS names: Start each word with an upper-case letter, and/or separate words with an underscore. Internally, SAS does not distinguish between upper-case and lower-case letters in SAS names. But it does distinguish between upper-case and lower-case when it prints output. It prints each variable as it was first defined (most of the time).
Examples of invalid names are -
```
    1stjob                                            (a)
    occupation_when_respondent_started_high_school    (b)
    rtn rate                                          (c)
```
(a) Does not start with a letter or underscore
(b) Too many characters
(c) Contains an embedded blank
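
The naming and case rules above can be sketched as follows. The data set rates and the variable monthly are hypothetical names used only for this illustration.

```sas
data rates;
   Interest_Rate = 4.5;            /* first use sets the spelling shown in output */
   monthly = INTEREST_RATE / 12;   /* same variable, typed in a different case */
run;

PROC PRINT DATA=Rates;             /* keywords and data set names ignore case too */
RUN;
```

Both assignments refer to a single variable, so the data set contains only two variables: Interest_Rate and monthly.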

Data Step/Proc Step

A SAS program is divided into two distinct types of steps, called data steps and proc steps --

data step - programming step: Read data into a SAS data set, perform programming functions
proc step - analysis step: Perform many analysis and utility functions such as calculating means, correlations, regressions; sorting; copying data; listing the contents of SAS data sets; and data entry.

Data steps begin with a data statement, and proc steps begin with a proc statement. Both steps end with one of the following:
- a run statement (data step, most procedures)
- a quit statement (e.g., proc IML, proc SQL)
- another data statement (data ... )
- another proc statement (proc ... )
We strongly recommend ending each data step and each proc step explicitly: with a run statement in most cases, and with quit; for procedures that take a quit statement. The quit statement matters most when you use interactive SAS with the Display Manager System (DMS). Without it, a procedure such as proc gplot continues to run and sometimes freezes your session.
Following these rules helps to structure your programs so you can keep track of what you are doing.
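
As a sketch of these ending rules, the program below closes a data step and a proc means step with run, and a proc sql step with quit. The data set scores and its variables id and score are hypothetical.

```sas
data scores;
   input id score;
   datalines;
1 82
2 91
3 77
;
run;

proc means data=scores;
   var score;
run;                      /* most procedures end with run */

proc sql;
   select id, score
      from scores
      where score > 80;
quit;                     /* proc sql ends with quit */
```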

Here is an example of a small SAS program containing both a data step and a proc step --
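
A minimal sketch of such a program is shown below. The file name income.data and the data set name inc are the ones used in this section; the variables id, income, and income_k and the column positions are hypothetical, chosen only to make the sketch complete.

```sas
* Data step: read raw data and build the SAS data set inc *;
data inc;
   infile 'income.data';
   input id     1-4         /* hypothetical column layout */
         income 6-12;
   income_k = income / 1000;
run;

* Proc step: analyze the data set created above *;
proc means data=inc;
   var income income_k;
run;
```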

Note the use of indentation to structure the data step and the proc step.

Procedures cannot read raw data; only data steps read raw data. Data steps normally read data from some source (here the file named income.data) and write out a SAS data set (here called inc).

The data step has a built-in loop. SAS reads a case, executes the instructions in the data step, writes one case to the SAS data set, then returns to the beginning of the data step to repeat all instructions for the next case. This looping continues until all cases are read.
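
The built-in loop can be made visible with a small sketch. The data set loopdemo and the variables x and double are hypothetical; _n_ is the automatic variable that counts iterations of the data step.

```sas
data loopdemo;
   input x;                /* read one case */
   double = 2 * x;         /* executed once per case */
   put _n_= x= double=;    /* write the current values to the log */
   datalines;
10
20
30
;
run;
```

No explicit loop statement appears, yet the put statement writes one line to the log for each of the three cases, and loopdemo ends up with three observations.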

Files Used with SAS

Files serve six functions for SAS. The six types of files and their functions are listed below. When using interactive SAS (with the DMS), various windows substitute for some of the files. You must explicitly save the contents of a window to turn it into a file. The name of the window corresponding to each file is listed in parentheses after the "usual extension."

Input to SAS
- command file -- issues instructions to the SAS programs
  - usual extension: .sas (Window Name: Program Editor)
  - example: contents.sas
- data file - provides input data
  - usual extension: .data or .dat but, increasingly, varies (No window analog)
  - example: income.data
Output from SAS
- log file -- contains the instructions from the command file, error messages, warnings, and notes
  - usual extension: .log (Window name: Log)
  - example: contents.log
- listing file - lists output from SAS procedures
  - usual extension: .lst (Window name: Output)
  - example: contents.lst
Input & Output
- sas data set - provides data to SAS procedures in the required binary format
  - usual extension: .sas7bdat (No standard window analog, but Viewtable will display a data set)
  - example: ~larryh/Sasclass/gss04.sas7bdat
- sas catalog - generally contains metadata such as formats in the required binary format
  - usual extension: .sas7bcat (No standard window analog)
  - example: ~larryh/Sasclass/formats.sas7bcat

Unless you explicitly instruct otherwise, SAS creates a SAS data set each time a data step is run. If you do not provide for a permanent SAS data set, a temporary one is created and then deleted at the end of the SAS session --

 * Create a temporary SAS dataset. SAS assigns the name *;

  data;

 * Create a temporary SAS data set named work.xyz.
   It may be referred to by xyz or by work.xyz *;

  data xyz; /* Creates a temporary SAS data set named work.xyz */

 * Create permanent SAS dataset. It must be referred to with
   its two-part name.

   The name of the UNIX file to be created/overwritten in this example
   is earnings.sas7bdat.  It will be stored in the current working
   directory, because of the period in the libname statement. *;

  libname thesis '.'; /* Period refers to current working dir */
  data thesis.earnings;

 * Execute a data step without producing any SAS dataset *;

  data _null_;
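
The data statements above are fragments. A complete sketch that combines them might look like the following; the variable earnings and its value are hypothetical.

```sas
libname thesis '.';          /* '.' is the current working directory */

/* Temporary: stored as work.xyz and deleted when the session ends */
data xyz;
   earnings = 25000;
run;

/* Permanent: written to earnings.sas7bdat in the thesis library */
data thesis.earnings;
   set xyz;
run;

/* No data set is created; this step only writes to the log */
data _null_;
   set thesis.earnings;
   put earnings=;
run;
```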

When saving the contents of a window, use the file extension listed for that window above. Doing so helps you keep track of the contents of your files. For example, always save the contents of the Program Editor with a .sas extension, the contents of the Log window with a .log extension, and the contents of the Output window with a .lst extension.