Fortran Help

Choosing the correct machine for scientific computing is very problem dependent. Speed and accuracy are both desirable characteristics of a machine, but they are frequently competing objectives.

Quad precision in Fortran (128 bit floating point)

Fortran 77 does not have a standard method of declaring high precision numbers. This leads to a few conversion problems, which we describe here. The usual way to declare a quad precision variable is with the "REAL*16" Fortran type declaration, and the usual way to write a quad precision literal is with the "q" exponent notation, e.g., 0.5q0. This is the method we will assume for these programs.

If the generic name of an intrinsic function is used, the compiler will choose the appropriate specific name to match the type of the argument. This makes the code both more readable and easier to convert to another precision. Some programmers, however, prefer to leave nothing to the compiler and explicitly use the specific name. For non-intrinsic functions you must call the correct function to match all the argument types.

	Single	Double	Quad
Declaration	REAL	REAL*8	REAL*16
Constant	0.1	0.1d0	0.1q0
Specific function name	alog	dlog	qlog
Generic function name	log	log	log
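As a minimal sketch of the table above, the following program declares one variable of each precision, assigns the matching literal, and calls the generic LOG on each. The program name and variable names are made up for illustration. It assumes a compiler that accepts the REAL*16 and "q" exponent extensions, and it uses "!" comments so that it can be compiled as either fixed-form or free-form source.

```fortran
      program precdemo
      implicit none
      ! One variable per column of the table.
      real s
      real*8 d
      real*16 q
      ! The exponent letter selects the precision of the literal.
      s = 0.1
      d = 0.1d0
      q = 0.1q0
      ! The generic name LOG resolves to the specific function
      ! that matches the type of each argument.
      print *, log(s)
      print *, log(d)
      print *, log(q)
      ! A double literal carries only double accuracy, so this
      ! difference is small but not zero.
      print *, q - 0.1d0
      end
```

The last line shows why the literal notation matters when converting a program to quad precision: a constant left as 0.1d0 is rounded to double precision before it is ever stored in a REAL*16 variable.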

Test machines

The four types of machine currently installed here at the University of Delaware as of June 1998 are:

strauss - Sun Ultra Enterprise 4000 (250 MHz UltraSPARC CPUs each with 4MB of cache): This machine has 128 bit IEEE compliant floating point numbers. This breaks down to 1 sign bit, 15 exponent bits, and 112 fraction bits. As in all IEEE floating point numbers, a normalized number does not store the leading bit, since it is always one, so the fraction is effectively 113 bits. The IEEE standards also specify how these numbers should be operated on, with "guard bits" and rounding rules. The intent is to make the results of floating point calculations identical across machines.
joplin - Cray Research J916/8-102 (200 MHz Cray-compatible CPU): The Cray does not have standard floating point hardware. It has 60 bit and 120 bit numbers. When a Fortran program asks for REAL*16 it gets the 120 bit numbers. The compiler does not recognize the q notation to indicate a quad precision literal.
mozart - IBM RS/6000, model 990 (64KB data cache): The RS/6000 series does have IEEE compliant 64 bit floating point numbers, but it handles quad precision by treating a REAL*16 number as two independently normalized 64 bit numbers. In some cases this trick can delay the need to normalize the number and squeeze more bits out of the calculations, but this usually does not make up for the fact that it does not use as many bits for the fraction. (This also means the range of the numbers is the same for double and quad precision on mozart.)
Mozart is scheduled to be decommissioned on June 30, 1998.
gershwin - SGI Power Challenge XL system (R10000 CPU with 2MB cache): The Power Challenge has quad precision in what is called "64 bit mode", and to get this mode you must use the "-64" option on the f77 compiler. If you try to compile a program in 32 bit mode which has REAL*16 or quad literals it warns you that it is not really doing what you asked for.
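Since the machines above store REAL*16 in different ways, it can be useful to measure how many significant bits a given compiler and machine actually provide. The following is a rough sketch, not one of the test programs described on this page: it halves a value until adding it to one no longer changes the sum. The names are hypothetical, and the program assumes the compiler accepts REAL*16 and the DO WHILE construct.

```fortran
      program qbits
      implicit none
      real*16 one, eps
      integer nbits
      one = 1
      eps = 1
      ! Start at 1 to count the leading bit of the fraction.
      nbits = 1
      ! Halve eps until adding it to one no longer changes the sum.
      do while (one + eps/2 .gt. one)
         eps = eps/2
         nbits = nbits + 1
      end do
      print *, 'significant bits in REAL*16:', nbits
      end
```

With a 128 bit IEEE format and round-to-nearest arithmetic this should report 113, counting the implicit leading bit. Formats built from a pair of doubles, or the Cray format, can be expected to give a different count.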

Accuracy test

To test the accuracy of the calculations we have a simple program that evaluates expressions which require high accuracy, for example adding a small number to a large number and then subtracting off the large number: ftest - for accuracy
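The kind of expression described above can be sketched as follows. This is not the ftest program itself, only an illustration with made-up values (1.0e20 and 1.0) and hypothetical variable names, assuming the compiler accepts REAL*16 and "q" literals.

```fortran
      program addsub
      implicit none
      real*8 dbig, dsmall, dres
      real*16 qbig, qsmall, qres
      dbig = 1.0d20
      dsmall = 1.0d0
      qbig = 1.0q20
      qsmall = 1.0q0
      ! Add a small number to a large one, then subtract the
      ! large one again. Exact arithmetic would give back 1.
      dres = (dbig + dsmall) - dbig
      qres = (qbig + qsmall) - qbig
      print *, 'double:', dres
      print *, 'quad:  ', qres
      end
```

In IEEE double precision the small number is lost entirely, because 1.0e20 needs more than 53 bits to hold an added 1, so the first result is 0. A 113 bit fraction holds the sum exactly, so on an IEEE quad machine the second result is 1.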

Speed Test

To test the speed of these machines for quad precision we add the terms of a geometric series. The sum of the first 20 million terms of a geometric series can be computed with 20 million multiplies and adds, or with just one call to the power function using the standard formula for the sum of a geometric series. In this test all terms are between 0.5 and 1.0, so the roundoff errors should all be of the same order of magnitude. We used Mathematica to find the "exact" answer, which is used to compute the number of bits of precision the answer has after 20 million floating point roundings: sumtest - for speed
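The two ways of computing the sum can be sketched as below. This is not the sumtest program: the first term (1.0), the ratio (chosen so that the last term is about 0.5), and all names are assumptions made for illustration, and the term count is scaled down to 2 million so the sketch runs quickly. It compares the loop against the closed form rather than against an independently computed exact answer.

```fortran
      program geosum
      implicit none
      integer n, k
      ! Scaled down; set n to 20000000 to match the text.
      parameter (n = 2000000)
      real*16 r, term, total, closed
      ! Assumed ratio: chosen so that r**n = 0.5, which keeps
      ! every term between 0.5 and 1.0.
      r = exp(-log(2.0q0)/n)
      term = 1.0q0
      total = 0.0q0
      ! One multiply and one add per term.
      do k = 1, n
         total = total + term
         term = term*r
      end do
      ! Closed form: (1 - r**n)/(1 - r), a single power.
      closed = (1.0q0 - r**n)/(1.0q0 - r)
      print *, 'loop sum:    ', total
      print *, 'closed form: ', closed
      print *, 'relative diff:', abs(total - closed)/closed
      end
```

The relative difference gives a rough idea of how much accuracy the accumulated roundings cost. Taking the negative base 2 logarithm of a relative error measured against an exact reference value gives a bits-of-precision figure like the one reported for sumtest.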

Here are the results of sumtest, with indri, the Sun UltraSPARC 1 on my desk, included for reference.

Summary of the sumtest results
Machine	Seconds to complete	Bits of precision
Mozart	13	92
Gershwin	20	92
Strauss	70	92
Joplin	69	75
Indri	1603	92

Conclusions

Strauss is very good at doing quad precision arithmetic, and is respectably fast.
Gershwin is the fastest machine we have for double precision, but does not do quite as well on quad precision. Make sure you use the -64 option when compiling to get quad precision.
Joplin is designed for vectorizable problems. The test code is not vectorizable, so Joplin does not fare as well. Joplin should not be used for this type of problem, since the Cray has a non-standard 120 bit floating point format and there is no speed advantage.
Mozart is the fastest machine we have for quad precision, but it is scheduled for decommissioning on June 30. It also has a non-standard quad precision floating point format.