Differences in the Collating Sequence of Character Data

Character data collate differently on the IBM 3090, which uses EBCDIC coding, than on Suns, which use ASCII coding. FORTRAN programs that depend on the EBCDIC sequence must be adjusted for this difference. The two orderings are shown below.

    EBCDIC:   lower case < upper case < numbers

    ASCII:    numbers < upper case < lower case

Note 1

There are also differences in the ranking of special characters. For example, under the EBCDIC collating sequence, & precedes % while the reverse is true under the ASCII collating sequence.
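
To see which sequence a given system uses, a short test program can print the result of a few native comparisons. The sketch below is illustrative only, and the program name colseq is hypothetical. On an ASCII system each comparison should print F; on an EBCDIC system each should print T.

```fortran
      program colseq
c     Print a few native comparisons whose results depend on the
c     collating sequence of the system.
c     ASCII system:  all three are expected to print F.
c     EBCDIC system: all three are expected to print T.
      print *, 'a .lt. A : ', 'a'.lt.'A'
      print *, 'A .lt. 3 : ', 'A'.lt.'3'
      print *, '& .lt. % : ', '&'.lt.'%'
      end
```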

Note 2

We are considering only character data in this document. If you have an older program that stores literals into REAL variables, the collating sequence for these variables is the same for both ASCII and EBCDIC and follows the collating sequence shown above for ASCII character data.

An alternative to reprogramming to account for these differences is to write FORTRAN function subprograms that compare character data according to the EBCDIC sequence. An example of such a function is shown below. Similar functions can be created for other relational operators.

It is important to note that for intensive character manipulation, such as sorting large data files, these FORTRAN functions use far more CPU time than the built-in comparisons. Roughly speaking, a program that sorts using these supplemental FORTRAN functions can take 10 times longer than a program using the built-in comparisons.


Example Function: lte(arg1,arg2)

The function lte is similar to the ".lt." relational operator, but lte compares the arguments according to the IBM EBCDIC collating sequence:

 

        logical function lte(arg1,arg2)
        character*92 ebcdic
        character*(*) arg1,arg2
        data ebcdic /' .<(+|&!$*);^-/,%_>?`:#@''="abcdefghijklmnopqr~stu
     1vwxyz{ABCDEFGHI}JKLMNOPQRSTUVWXYZ0123456789'/
        do 10 i = 1,min(len(arg1),len(arg2))

c     Test for equality..if equal go to the next character;
c     if not continue

          if(arg1(i:i).eq.arg2(i:i))then
            go to 10
          else
            do 15 j=1,92
              if (arg1(i:i).eq.ebcdic(j:j))then
                lte=.true.
                return
              else if (arg2(i:i).eq.ebcdic(j:j))then
                lte=.false.
                return
              end if
   15       continue
          end if
   10   continue

c     All characters up to the shorter of the two arguments are equal
c     and therefore the shorter of the two arguments is less than
c     the other.  If the lengths are equal, lte is .false.
        if(len(arg1).eq.len(arg2))then
          lte=.false.
        else if (len(arg1).lt.len(arg2))then
          lte=.true.
        else
          lte=.false.
        end if
        return
        end

Note 1

The first line of the data statement in the program above begins in column 9 and ends in column 72. If you begin your program lines in column 7, then the first line of the data statement must end with the characters "vw" and these two characters must be removed from the continuation line of the data statement.
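
One way to avoid this dependence on column 72 is to build the table from shorter constants joined with the // operator, so that no character constant is split across a continuation line. The sketch below is an assumption about how this could be done and is not part of the original function; the names chktab, mktab and ndupl are hypothetical. It also counts duplicate characters, since a misaligned or short constant typically shows up as extra blanks in the table.

```fortran
      program chktab
c     Hypothetical self-check for the EBCDIC-order table.
      character*92 ebcdic
      integer ndupl
      call mktab(ebcdic)
      print *, 'duplicate characters:   ', ndupl(ebcdic)
      print *, 'position of apostrophe: ', index(ebcdic, '''')
      end

      subroutine mktab(table)
c     Fill table from three short constants joined with //, so no
c     character constant is split across a continuation line.
      character*(*) table
      table = ' .<(+|&!$*);^-/,%_>?`:#@''="' //
     1  'abcdefghijklmnopqr~stuvwxyz{' //
     2  'ABCDEFGHI}JKLMNOPQRSTUVWXYZ0123456789'
      return
      end

      integer function ndupl(table)
c     Count characters that already occur earlier in table.  Extra
c     blanks from a misaligned or short constant are counted here.
      character*(*) table
      integer i
      ndupl = 0
      do 20 i = 2, len(table)
        if (index(table(1:i-1), table(i:i)) .ne. 0) ndupl = ndupl + 1
   20 continue
      return
      end
```

A correctly entered table should report no duplicate characters, with the apostrophe at position 25.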

Note 2

There are 92 characters in the data statement. The double ' (i.e., '') stands for the one character '.
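
Because each of the 92 characters appears in the table exactly once, a character's position can serve as its rank. The following sketch, which is not part of the original example, replaces the inner search loop with a rank array indexed by the character's native code and filled on the first call. The names lterk, rank and trylrk are hypothetical, the code assumes that ICHAR returns values from 0 to 255, and no timings have been measured for it. Characters missing from the table all receive rank 93, so, as in lte, a pair of unlisted characters is skipped and an unlisted character sorts after every listed one.

```fortran
      program trylrk
c     Hypothetical caller: compare two pairs in EBCDIC order.
      logical lterk
      print *, 'A before 3: ', lterk('A', '3')
      print *, 'a before A: ', lterk('a', 'A')
      end

      logical function lterk(arg1, arg2)
c     Hypothetical rank-table variant of an EBCDIC-order "less than".
c     Assumes ichar() returns a value in the range 0 to 255.
      character*(*) arg1, arg2
      character*92 ebcdic
      integer rank(0:255), i, j, r1, r2
      logical first
      save rank, first
      data first /.true./
      if (first) then
c       Build the table from short pieces so that no constant
c       depends on ending exactly in column 72.
        ebcdic = ' .<(+|&!$*);^-/,%_>?`:#@''="' //
     1    'abcdefghijklmnopqr~stuvwxyz{' //
     2    'ABCDEFGHI}JKLMNOPQRSTUVWXYZ0123456789'
c       Unlisted characters get rank 93, after every listed one.
        do 5 j = 0, 255
          rank(j) = 93
    5   continue
        do 6 j = 1, 92
          rank(ichar(ebcdic(j:j))) = j
    6   continue
        first = .false.
      end if
      do 10 i = 1, min(len(arg1), len(arg2))
        r1 = rank(ichar(arg1(i:i)))
        r2 = rank(ichar(arg2(i:i)))
        if (r1 .ne. r2) then
          lterk = r1 .lt. r2
          return
        end if
   10 continue
c     All compared characters rank equal: the shorter argument is
c     treated as the lesser, as in lte.
      lterk = len(arg1) .lt. len(arg2)
      return
      end
```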

Note 3

  ******************************************************************
  *                                                                *
  *                Correspondence to VS FORTRAN                    *
  *                                                                *
  ****************************************************************** 
  *                                                                *
  *        VS FORTRAN                           UNIX               *
  *        ==========                           ====               *
  *                                                                *
  *  if('A'.lt.'3')then                  if(lte('A','3'))then      *
  *    print*, 'EBCDIC'                     print*, 'EBCDIC'       *
  *  end if                              end if                    *
  *                                                                *
  *                                                                *
  ******************************************************************
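
On the UNIX side the calling program must also declare lte as LOGICAL, because implicit typing would otherwise give a name beginning with l the type INTEGER. A minimal sketch of a complete caller follows. It also shows one way the other relational operators might be derived from lte by swapping or negating its arguments; this is an assumption, not the original author's code, and the names drive, gte and lee are hypothetical. A copy of lte is included so that the example compiles on its own.

```fortran
        program drive
c     Hypothetical caller.  lte, gte and lee must be declared
c     logical; implicit typing would make them integer or real.
        logical lte, gte, lee
        if (lte('A', '3')) print *, 'A before 3 (EBCDIC order)'
        if (gte('3', 'A')) print *, '3 after A (EBCDIC order)'
        if (lee('A', 'A')) print *, 'A not after A'
        end

        logical function gte(arg1, arg2)
c     Hypothetical "greater than": arg1 > arg2 when arg2 < arg1.
        character*(*) arg1, arg2
        logical lte
        gte = lte(arg2, arg1)
        return
        end

        logical function lee(arg1, arg2)
c     Hypothetical "less than or equal": true unless arg2 < arg1.
        character*(*) arg1, arg2
        logical lte
        lee = .not. lte(arg2, arg1)
        return
        end

        logical function lte(arg1,arg2)
c     Copy of the lte function from this document.
        character*92 ebcdic
        character*(*) arg1,arg2
        data ebcdic /' .<(+|&!$*);^-/,%_>?`:#@''="abcdefghijklmnopqr~stu
     1vwxyz{ABCDEFGHI}JKLMNOPQRSTUVWXYZ0123456789'/
        do 10 i = 1,min(len(arg1),len(arg2))
          if(arg1(i:i).eq.arg2(i:i))then
            go to 10
          else
            do 15 j=1,92
              if (arg1(i:i).eq.ebcdic(j:j))then
                lte=.true.
                return
              else if (arg2(i:i).eq.ebcdic(j:j))then
                lte=.false.
                return
              end if
   15       continue
          end if
   10   continue
        lte = len(arg1).lt.len(arg2)
        return
        end
```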

University of Delaware
June 19, 1994