
In this presentation, we limit our discussion to items 4, 5, and 6. (See "Problem Scope.") A discussion of all steps is presented in the section "Overall Process" within Effective Remediation Strategies by Michael Wheatley. Note that our presentation is to a considerable extent based on Michael Wheatley's presentation.
The scope of a Year 2000 problem ranges from
a stand-alone program that does not interact with other programs either by producing output data that serves as input to other programs or inputting data produced by other programs. In this case, changes can be made to the program and/or to the input files independently.
to
a complex system of interdependent programs. In this case, any change made to one of the programs or to the input/output data files can affect one or more of the other programs and necessitate changes in one or more of them.
The primary Year 2000 problem we are interested in consists of a single stand-alone program with/without I/O files, which is representative of the typical program used by students and faculty/researchers. The next level of complexity is a program that reads input data which is output from another program and/or produces output data which is used as input by another program. While we expect this situation to be rare, our discussion is applicable to this situation as well. What we shall not address is a highly complex system of interactive programs and I/O files. Therefore, items 1, 2, and 3 above are not relevant, and we will concentrate our discussion and present examples relevant to items 4, 5, and 6 in the process:
The Fortran program, samp1.f, reads the names and birth dates of two individuals and determines which of the two is older. The program reads in the dates using a 2-digit year representation with format MMDDYY. Try running the program with the sample data sets shown. Note that the result for the second data set is incorrect because this program is not Year 2000 compliant.
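To see why a 2-digit year can break this kind of comparison, consider the following minimal sketch in C. It is not the code of samp1.f; the function names are hypothetical, and it assumes for simplicity that the year field alone decides which person is older.

```c
#include <stdlib.h>

/* Hypothetical helper (not from samp1.f): return the 2-digit year
   stored in the last two characters of a 6-character MMDDYY string. */
int yy_of(const char *mmddyy)
{
    return atoi(mmddyy + 4);
}

/* Naive rule: the smaller 2-digit year is taken to be the earlier
   birth year. This is only correct when both dates fall in the same
   century. Returns 1 if the first date is judged older, else 0. */
int first_is_older_naive(const char *d1, const char *d2)
{
    return yy_of(d1) < yy_of(d2);
}
```

For the dates 122506 and 031199, first_is_older_naive returns 1, which claims that the first person is older. If 06 is meant as 2006, the opposite is true.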
While there are variations within these techniques (e.g., fixed windows, movable windows, and sliding windows), these are the basic methods. Encapsulation will not be discussed any further because generally it is not a recommended technique. This remediation technique requires that the application system be isolated and run with the system clock set to a date in the past. As the data comes in to the system, the dates are shifted back 28 years, and when the data leaves the system it is shifted forward 28 years. There are numerous situations in which this method does not work, which is the reason for not considering it as a remediation method. For details on these situations, see the section on encapsulation in "Effective Remediation Strategies and Techniques" by Michael Wheatley.
Below, we consider each of the remediation types and its impact on the Find/Analysis. This explanation is followed by the details of Find/Analysis.
Within the program itself, you will need to search for date variables. Such variables may be character variables defined with a length of 6 characters (i.e., 2 each for the month, day, and year). You will need to locate these variables and change their definition to accommodate 8 characters. Similarly, date constants within the program will need to be located and modified as well as I/O statements and formats. You will need to examine all functions (e.g., subprograms in Fortran) for date processing. While the logic of such functions may not need to be modified, the data types may need modification (e.g., 6 byte character variables changed to 8 byte). Arithmetic operations within your program may be tailored specifically to 2-digit years and, therefore, need modification when the variables and/or constants they deal with are converted to 4-digit variables/constants.
Expansion usually requires more time to implement because there are 2-digit year data within databases to find and convert as well as variables and literals within the program. For programs with small input files that contain little or no date data, however, expansion may be the easiest method to implement. As always, the right remediation choice is program dependent.
010191 102388 120487 .....
910101 881023 871204 .....
jan0191 oct2388 dec0487 .....
This format is relevant to expansion and compression.
      character*6 date
      date = '010198'
This format is relevant to expansion and compression.
      date = '010199'
If expansion is used, then "99" would go to "1999." If compression is used, you would generally need to modify the constant. Using hexadecimal, 99 would go to 63 resulting in
      date = '010163'
and if this date variable is to be modified in subsequent years, then, for example, in the year 2000, "00" would be replaced by hexadecimal "64".
      date = '010164'
In this case, if a subsequent internal read were done to extract the value of the year, using format i2, this format would need to be changed to z2.
Variables may be used to read in data that contains dates or to store the result of a computation.
      parameter (iycurr = 99)
      .
      .
      read (5,'(i2)')iyvar
      .
      .
      iypass = iycurr - iyvar
      .
If the program is to use expansion as remediation, then iycurr must be reset to 1999 and iyvar read in using i4 format. Also, if subsequent use of iypass depends on it being a 2-digit year value, that logic will need to be changed. If the program is to use compression as remediation, then iycurr, being set to the current year in the year 2000, would need to be defined using either iycurr = z'64' or iycurr = 100 (either form results in the base 10 value of 100). Also, the format i2 needs to change to z2 if iyvar is compressed. Thus, with either expansion or compression, statements of this type need to be detected in the Find/Analysis stage.
This format is relevant to expansion and compression.
      subroutine prtyr(date)
      character date*6
      print *,'The year is ','19'//date(5:6)
      return
      end
If expansion is used, the concatenation of "19" must be eliminated, "date*6" must be changed to "date*8", and "date(5:6)" must be changed to "date(5:8)."
If compression is used, then the last 2 characters of date must be converted to the correct value and added to 1900. Concatenation of the character representation will not work (e.g., the year 2000 represented in hexadecimal is 64, so the character representation of the year after concatenation would be "1964," not "2000"). An example is presented in the remediation section below. In any case, the subroutine will need to be revised. If windowing is used, then this routine will also need revision by calling the window routine. An example is presented in the remediation section below.
This format is relevant to expansion, compression, and windowing.
For additional situations that require Find/Analysis, refer to "Potential Problems in Data Files" and "Potential Problems in Program Logic" in "Are Your Programs and Data Files Year 2000 Compliant?"
The first line which defines the variables date1, date2, date1p, and date2p as character variables will require modification if we use expansion, but not if we use windowing or compression. The two internal read statements will require modification for expansion and compression as the variables y1 and y2 will need to be read in using i4 format in the former and using z2 in the latter. Similarly, the lines defining date1p and date2p will need to have the indices 5:6 changed to 5:8 for expansion. Finally, the print statements need modification by removing the constant 1900 in the cases of expansion and windowing. This explanation of what needs to be done as a function of which remediation method we use is shown in one file by color coding the various remediation schemes.
Our sample program can accept keyboard input or, via redirection, an input data file:
    samp1.exe < inputfile
Let's look at our input file for date sensitivity.
Earlier, we said that you could scan the input data file for dates, looking for strings conforming to the format MMDDYY, DDMMYY, etc., and/or examine the program for variables that might hold date data and internal constants that might be date values. For our example program, samp1.f, the obvious variables are date1, date2, date1p, date2p, y1, and y2. Finding date1 and date2 in read statements in samp1.f leads us to examine the input files for date data. And, conversely, finding the strings 122506 and 031199 would lead us to look for read statements in our program.
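As a rough illustration of scanning data for date strings, the following C sketch tests whether a single token could be a date in MMDDYY form. The function name looks_like_mmddyy is hypothetical, and the check covers only this one format. A match is just a candidate for manual inspection, because any 6-digit number with a plausible month and day passes.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if s is exactly six digits that could be read as MMDDYY
   (month 01-12, day 01-31), otherwise 0. The day is not checked
   against the length of the particular month. */
int looks_like_mmddyy(const char *s)
{
    int i, month, day;

    if (strlen(s) != 6)
        return 0;
    for (i = 0; i < 6; i++)
        if (!isdigit((unsigned char)s[i]))
            return 0;
    month = (s[0] - '0') * 10 + (s[1] - '0');
    day   = (s[2] - '0') * 10 + (s[3] - '0');
    return month >= 1 && month <= 12 && day >= 1 && day <= 31;
}
```

Both 122506 and 031199 pass this test, while 991231 (a YYMMDD string) does not, so a real scan would need one such check per format in use.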
Now that we have an idea of what to look for, how do we find it? Actually, for the problem type we are interested in (see "Problem Scope") this is really not an issue. In this case, the user knows if the program processes date sensitive data and where the data and procedures that process the data are located within the program. Nonetheless, there still may be some instances of a program that has been inherited, in which case the user may not know if there are date sensitive areas within the program. If this is your situation, you will need to search through your program and data to locate these date sensitive areas. You can scan through your program and data files with your favorite editor, searching for the types of date sensitive items described above.
You can perhaps speed up the process by using one of the available Y2K tools. There are tools for Find/Analysis and tools that automate the remediation process as well. Some of these tools combine both Find/Analysis and remediation. These tools can be further divided into those that operate on the source code and those that operate on the object code. While these tools may speed up the Find/Analysis process and/or the remediation process, they are not foolproof. That is, a scanning program is not guaranteed to catch every instance of date sensitivity in your program and/or input data files. Similarly, the auto-remediated program must be carefully checked for correctness. Also, whereas the vast majority of user programs at the University are Fortran, C, C++, and Pascal, the vast majority of scanning tools and remediation programs are designed for Cobol. Finally, while there may be freeware tools available, the vast majority of tools are available only for a fee. For these reasons, but particularly the last two, it was decided that it is not worthwhile to spend time investigating automated methods of scanning and remediation.
For a detailed description of Find/Analysis tools and remediation tools, see Effective Remediation Strategies by Michael Wheatley (Note: Here we are referencing the entire paper rather than sections because there are numerous discussions of Find/Analysis and remediation tools applied to source code and object code throughout Wheatley's paper).
Your first inclination might be to use expansion as the remediation method. Conceptually, this is the most straightforward method. It is also the method generally agreed to be the best solution to the Year 2000 problem. After all, you needn't introduce an artificial construct such as a window or go to a different number base such as hexadecimal, as remediation using compression requires. The same procedures that exist in the original program can remain in the remediated version, although some modifications may be required. However, expansion is a difficult solution because it involves changing all databases that contain date data as well as locating and changing every instance of date data in the program. Also, it will affect any other program that uses such a database and, thereby, require changes to those programs. Output formats will need to be changed, which can translate into considerable reprogramming. Procedural sections of the program may need to be modified if their logic is specific to 2-digit year data. In summary, depending on the program you are converting, expansion may or may not be the appropriate remediation.
Windowing introduces a limitation of a 100-year span over which the program can process dates. However, there is a significant advantage because data files need not be converted from 2-digit dates to 4-digit dates. Also, date literals and I/O format statements do not need to be changed. In general, windowing takes less time. On the negative side, you are introducing additional complexity to the logic of the program that would otherwise not be there if you used expansion. You should be beginning to appreciate now how the program to be remediated determines what remediation technique is most applicable. Careful consideration should be given to the technique to be used.
Finally, there is compression, a technique that enables you to surpass the limitation of the 100-year window, while maintaining 2-digit years. For example, a common choice for compression is to use hexadecimal data which expands the span of years to 256 years. At first glance, this would seem to be the optimal compromise. Compression does not necessitate the introduction of windowing logic into your program. It maintains the use of 2-digit years while expanding the coverage to 256 years. However, compression does necessitate changing input data files from a decimal base to a hexadecimal base. Like the use of expansion, if there are other programs that use the I/O files of the program, they will need to be modified also. Compression typically requires changes to the data definition and procedures in the program also. There are other disadvantages to compression and other advantages in addition to the ones mentioned above. For more details, see the section, Compression within Michael Wheatley's paper. Because of the disadvantages listed above, and those cited in Michael Wheatley's paper, compression is not commonly used as a source code remediation technique.
As you can see, each of these methods has its advantages and disadvantages. The method you choose should depend on your program in terms of its complexity, whether you have input databases and how large they are, whether or not you know where date data is located in the databases, how well you know the program in question, and how much time you have to perform the remediation.
Next we consider each remediation technique and apply it to our sample program.
Expansion remediation involves the expansion of dates in your program and input data from 2-digit to 4-digit year format. For example, an input file containing
John Smith 121445 245 S. Westmont Drive Phila Penna
Fred Williams 072078 1900 E. Rockport Av Chester Penna
. . . .
. . . .
will need to be modified to
John Smith 12141945 245 S. Westmont Drive Phila Penna
Fred Williams 07201978 1900 E. Rockport Av Chester Penna
. . . .
. . . .
And this, in turn, may necessitate that the associated format statement be modified as well. For example,
      read (8,'(a20,x,a6,x,a25,x,a15)')name,bdate,addr,citst
needs to be changed to
      read (8,'(a20,x,a8,x,a25,x,a15)')name,bdate,addr,citst
All date constants in your program must be converted from 2-digit year format to 4-digit year format. For example,
      .
      data dates/'010199','121897',......../
      ifinal = '012036'
      .
needs to be changed to
      .
      data dates/'01011999','12181997',..../
      ifinal = '01201936'
      .
Date variables may need to be redeclared (e.g., character*2 variables changed to character*4 variables). For example, the type statement for the previous example might be
      character * 6 dates(2000)
which must be changed to
      character * 8 dates(2000)
Procedures may need to be changed. For example,
subroutine maxyear(iyr)
data maxyr/98/
if (iyr.gt.maxyr) then
print *, 'Last year = ',1900+iyr
endif
return
end
needs to be changed to
subroutine maxyear(iyr)
data maxyr/1998/
if (iyr.gt.maxyr) then
print *, 'Last year = ',iyr
endif
return
end
Computations that deal with dates may need to be modified also. For example,
C ** Compute the number of years passed since the year, iyr where
C    iyr is a 2-digit year and icuryr is the current year computed
C    from a system routine and is a 4-digit year
      ipassed = icuryr-1900-iyr
will need to be changed to
      ipassed = icuryr - iyr
because iyr is now a 4-digit year. If your program has large input files that contain date data, remediation using expansion can take a considerable amount of time. This is especially true if you need to locate the date strings within input data files.
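The conversion of a single date field during expansion can be sketched as follows, here in C rather than Fortran. The function name expand_mmddyy is hypothetical, and the century is an assumption supplied by the caller (19 for the birth dates in the sample records); deciding the century for mixed data is a separate problem that this sketch does not solve.

```c
#include <stdio.h>
#include <string.h>

/* Expand a 6-character MMDDYY field to an 8-character MMDDYYYY field.
   century is chosen by the caller (for example 19). out must have room
   for 9 bytes. Returns 0 on success, -1 if the input is not exactly
   6 characters or the century is not in 0-99. */
int expand_mmddyy(const char *in, int century, char *out)
{
    if (strlen(in) != 6 || century < 0 || century > 99)
        return -1;
    /* MMDD, then the 2-digit century, then the original YY. */
    sprintf(out, "%.4s%02d%.2s", in, century, in + 4);
    return 0;
}
```

Called with "121445" and a century of 19, the function writes "12141945", which matches the expanded record for John Smith shown earlier.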
After a program is remediated using 4-digit years, the job is finished. Unlike windowing, this is almost a permanent solution. "Almost," because changes will again be necessary in the year 10,000! This method is conceptually straightforward. There are no special techniques that need to be introduced into the program's logic as is the case with windowing.
On the negative side, you will need to locate all program variables that contain dates and expand them from 2-digit year dates to 4-digit year dates. You need to convert all 2-digit year data values to 4-digit year values, which can be a large task. Depending on how much date data is in your input file(s), the expansion may result in somewhat slower processing time. If the output from your program contains date data, the conversion to 4-digit year data will require any program that takes this output as input to also convert from 2-digit year to 4-digit year format. As mentioned earlier, one way around this problem, at least on a temporary basis, is to delay the conversion of input data to 4-digit years by using either an internal or external bridge program. A bridge is typically used with expansion remediation. While the program in question may not require too many changes, remediation of the data files may be a long and difficult task. We look at the use of bridges again below in "Bridges".
The remediated version of our original program, samp1.f, using expansion is samp1expand.f. It will require that input data be 4-digit year data values. Try some example data on the remediated program.
A fixed window is such that it defines a 100-year period in which your program can function. Years outside this window are not valid. For example, let's suppose the range of data that you need to process is from January 1, 1960 to December 31, 2059. Then any 2-digit year would represent a year in this range. Note that such a window cannot be more than 100 years if a 2-digit year is to be unambiguously represented. For example, 61 represents 1961. (2061 is not in the fixed window's range.) For this example range, a pivot year is defined to be 60. That is,
if yy >= 60, the year is 1900+yy
if yy < 60, the year is 2000+yy
You must add the window logic to your program to convert 2-digit years to the 4-digit years corresponding to the fixed window. Aside from the window logic, there is generally very little modification, if any, that needs to be done to the program. The window logic makes up the bulk of the changes. For example, a subroutine that corresponds to a pivot value of 60 and which can be called to convert any 2-digit year within the window (1960-2059) is
      subroutine window (y,ipivot)
      integer y
      if (y.ge.ipivot) then
        y = y + 1900
      else
        y = y + 2000
      endif
      return
      end
Note that if you are going to use a window, one disadvantage of using a fixed window is that as the years progress, the number of years in the future decreases as the number of years in the past increases. For example, with the pivot year set to 60, in the year 2000, this window defines 60 years in the future including the present year and 40 years in the past. However, in the year 2020, the window defines 40 years in the future including the then present year and 60 years in the past. This problem can be overcome by either making the window movable or using a sliding window. These methods are discussed next.
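The fixed-window rule and the shrinking future span can be sketched in C as follows. The function names are hypothetical; window_year returns the converted year instead of modifying its argument as the Fortran subroutine does, and years_ahead assumes a window that ends in the 2000s (at year 2000 + pivot - 1).

```c
/* Fixed window: map a 2-digit year to a 4-digit year using a pivot.
   Years at or above the pivot are taken as 19yy, the rest as 20yy. */
int window_year(int yy, int pivot)
{
    return (yy >= pivot) ? 1900 + yy : 2000 + yy;
}

/* Number of years the window still covers from current_year forward,
   counting current_year itself. */
int years_ahead(int pivot, int current_year)
{
    int last_year = 2000 + pivot - 1;   /* pivot 60 gives 2059 */
    return last_year - current_year + 1;
}
```

With a pivot of 60, years_ahead returns 60 for the year 2000 and 40 for the year 2020, which is the decrease described above.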
Users of the program can continue to use 2-digit year values on input. Generally, the work involved in this type of remediation is much less than when using expansion. A major advantage is that you do not have to modify input files or expand date fields on output, which would mean modifying format statements.
On the negative side, such a program is constrained to a 100-year span. For example, when we remediate our sample program using a fixed window, we cannot process ages that exceed 100 years. In the year 2000, if someone were 101 years old, born in the year 1899, it is not possible to represent this birth date year because, depending on the pivot chosen, 99 will either represent 1999 or 2099 but not 1899. To incorporate the year 1899 in the time span covered, the range or window would have to begin at least in 1899, and, therefore, the window would cover the range 1899-1998, leaving out the then current year 2000. If you want to maintain 2-digit year data and cover more than 100 years, you will need to use compression. Depending on how much date data processing your program does, there will be a performance impact due to the conversion of 2-digit years to 4-digit years. If your program uses output from another program that uses windowing, you must use the same window as that program. Therefore, you probably do not want to use sliding window remediation where the pivot value is chosen automatically (see "Sliding Window" below). The advantages and disadvantages described above pertain to all window remediation methods, whether fixed, movable, or sliding window remediation. For fixed window remediation, one of the downsides is that each year the future span of years within the window becomes shorter. Eventually, you will need to modify the window by changing the pivot value. Some years in the past will no longer be within the window, which may also be a problem. Additionally, the program may run a little slower due to the extra processing necessary to translate 2-digit years into 4-digit years.
Here is samp1 using fixed window remediation. Comments are included explaining the changes made.
Try some example data and run the Test Program.
The difference between fixed window remediation and movable window remediation is simply that in the latter case, the value of the pivot is more readily available for modification. For example, if you examine the code for samp1fixedw.f and samp1movew.f, you will find the difference is that in the latter code the pivot variable, ipivot, is contained in a parameter statement at the top of the program. While the pivot value could be read in, it is not advisable.
All of the advantages and disadvantages listed above for the fixed window process apply here also, with the exception that each year the future span of years within the window need not become shorter since you can modify the pivot value. While the value of the pivot could be made an input value, it is generally not advisable (see, for example, Wheatley's discussion on movable windows). In the case where different users are going to use the same program, and where each user may want to perform the computations over various 100 year spans, letting the pivot be an input value would be useful.
Program samp1movew.f is samp1 remediated using a movable window. Comments are included explaining the changes made. Try the Test Program.
Like the two previous window methods, the sliding window method also references a 100-year window. The difference is that the sliding window keeps a constant number of years in the future and past by modifying its pivot value each year as time goes by. For example, assume that one wants a window such that there are 60 years in the future and 40 years in the past. If the year is 2000, this means that the pivot value will be 60 (the window of time covered is 1960-2059). In the next year, 2001, the pivot would be changed to 61, thereby providing the range (1961-2060).
All of the advantages and disadvantages listed under the fixed window remediation apply here also, with the exception that each year the future span of years within the window is automatically maintained. This advantage eliminates the need for manually making changes to the program in the future.
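A sliding window can be sketched in C as shown below. This is not the code of samp1slidew.f; the function names are hypothetical, and the current year is passed in as a parameter so the behavior is easy to check, whereas a real program would obtain it from the system clock.

```c
/* Pivot for a window that starts years_back years before current_year. */
int sliding_pivot(int current_year, int years_back)
{
    return (current_year - years_back) % 100;
}

/* Map a 2-digit year into the 100-year window that begins at
   current_year - years_back. */
int sliding_window_year(int yy, int current_year, int years_back)
{
    int start = current_year - years_back;    /* first year in window */
    int year  = start - (start % 100) + yy;   /* same century as start */

    if (year < start)
        year += 100;                          /* belongs to next century */
    return year;
}
```

With 40 years in the past, the pivot is 60 in the year 2000 and 61 in the year 2001. The 2-digit year 60 therefore maps to 1960 when the program runs in 2000 and to 2060 when it runs in 2001.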
Program samp1slidew.f is samp1 remediated using a sliding window. Comments are included explaining the changes made. Try the Test Program.
Compression allows one to attain a larger span than 100 years while preserving a 2-digit year. It requires that you select a reference year, typically 1900, and that you use a different base for the 2-digit years. Typically, the choice is hexadecimal, which gives you a 16 x 16 = 256 year window. With a reference year of 1900, the window is (1900-2155). For example, the year 1970 would be represented in hexadecimal as 46 instead of 70 in base 10, and the year 2000 as 64 in hexadecimal which is 100 in base 10.
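The hexadecimal encoding can be sketched in C with the standard sprintf and sscanf routines. The function names are hypothetical, and the reference year of 1900 is the typical choice mentioned above, not a requirement.

```c
#include <stdio.h>

/* Encode a 4-digit year as two hexadecimal digits counted from 1900.
   out needs room for 3 bytes. Returns 0 on success, -1 if the year is
   outside the 256-year window 1900-2155. */
int compress_year(int year, char *out)
{
    if (year < 1900 || year > 2155)
        return -1;
    sprintf(out, "%02X", (unsigned int)(year - 1900));
    return 0;
}

/* Decode a 2-digit hexadecimal year back to a 4-digit year.
   Returns -1 if the text does not start with a hexadecimal digit. */
int expand_hex_year(const char *hex2)
{
    unsigned int offset;

    if (sscanf(hex2, "%2x", &offset) != 1)
        return -1;
    return 1900 + (int)offset;
}
```

For example, compress_year turns 1970 into "46" and 2000 into "64", and expand_hex_year("64") returns 2000.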
The obvious advantage of compression is that 2-digit data fields need not be expanded, while the window expands to 256 years. On the negative side, hexadecimal years are not natural to anyone, and the method is likely to result in errors when large amounts of data are to be entered into a database by hand. For example, John Smith's date of birth, 01/20/1990, would have the year value entered as 5A. All existing data files will need to be modified to conform, and this process can take considerable time. However, program constants need not be changed. Generally, relative to other remediation schemes, there are very few program changes necessary. Still, the use of hexadecimal does not lend itself to producing a user-friendly program.
Program samp1comp.f is samp1 remediated using compression. Comments are included explaining the changes made. Try the Test Program.
A Year 2000 bridge is an interface between one part of a system that is compliant and another that is non-compliant. For example, suppose a program is remediated using compression, thereby allowing a span of 256 years while maintaining 2-digit year data. This program in turn may be dependent on a program, "dataprovider," that generates input data to the program and for some reason will not be remediated in time. We need a means to maintain compatibility between "dataprovider" and our program until "dataprovider" is remediated. The means to do this is called a bridge. The bridge can be an external bridge (i.e., a separate program) or an internal bridge (i.e., a procedure within our program that does whatever is necessary to maintain compatibility). So, a bridge buys you time.
Can you use the bridge on a permanent basis? Not usually. For example, if our bridge is one that bridges between 2-digit year data and a 4-digit expansion remediated program, the bridge must convert the 2-digit year data into usable 4-digit year data for our program. This requires the use of a window within our bridge program and that window defines a 100 year span and, as time goes by, this window may need to be changed (i.e., a new pivot).
Let's consider a specific example. Let us assume that you have a large database with 2-digit year data that covers a range of years beginning in the early 1900s and continuing to the present. This database continues to have entries added to it. In the year 2000, 2-digit year data will be added to the database with the year 2000 represented by 00. In the year 2001, data will be added with the year 2001 represented by 01, etc. If the program in question is to convert the 2-digit year data to the correct 4-digit year, the bridge program will need to use a window to determine which 2-digit years are converted to a 4-digit date with century value 1900 and which with century value of 2000. As time goes by, the yearly data will cover a span of time greater than 100 years unless the early data is deleted from the database. In the latter case, there will come a time when the bounds of the window must be changed unless a sliding window is used. In the former case, the 100-year window algorithm within the bridge program will simply not work. Thus, our bridge program bridges over that time when we do not have complete compatibility between our databases and our remediated program.
Let's consider our sample program, samp1.f, remediated by compression, producing samp1comp.f, where we assume that the input data is computed by a program which will not be remediated in time to avoid the millennium bug. We can proceed to remediate that program, but in the meantime, to run our remediated program, samp1comp.f, we create an external bridge. This bridge will take the output from the program still in the process of being remediated and convert the 2-digit base 10 year data to 2-digit hexadecimal year data, thereby making it compatible with samp1comp.f. The maximum year we expect in our input data going into year 2000, 2001, and 2002 is 2002. We, therefore, select a fixed window with a pivot value of 3. This defines a window in the 100-year interval (1903 - 2002). Because our sample program deals with ages, one could argue that it should handle centenarians and people older than that. However, remember that our original program was also deficient in that manner. So, we assume for simplicity that we never need to compute outside the 100-year window or, if such exceptions do exist, that they are processed outside the program as special cases. Test out the bridge program, bridge.f, together with our compression remediated program, samp1comp.f, in the following example of using an external bridge.
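The conversion such a bridge performs on each date field can be sketched in C as follows. This is not the code of bridge.f; the function name bridge_date is hypothetical, and the sketch assumes MMDDYY input, a fixed window, and a reference year of 1900 for the hexadecimal year.

```c
#include <stdio.h>
#include <string.h>

/* Convert a MMDDYY field with a decimal 2-digit year into a MMDDHH
   field whose last two characters are the year counted from 1900 in
   hexadecimal. pivot selects the fixed window: yy >= pivot means 19yy,
   otherwise 20yy. out needs room for 7 bytes.
   Returns 0 on success, -1 on malformed input. */
int bridge_date(const char *in, int pivot, char *out)
{
    int yy, year;

    if (strlen(in) != 6 || sscanf(in + 4, "%2d", &yy) != 1 || yy < 0)
        return -1;
    year = (yy >= pivot) ? 1900 + yy : 2000 + yy;
    sprintf(out, "%.4s%02X", in, (unsigned int)(year - 1900));
    return 0;
}
```

With the pivot of 3 chosen above, the field 031199 becomes 031163 (1999 is 99 years after 1900, or hexadecimal 63), and 010100 becomes 010164 (the year 2000).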
For example, in C, you can extract the components of a date string using the "getdate" routine. The entries in the string must conform to one of the formats within a template file (see the man pages for details on "getdate" and the template file). For example, your program may read data from an input file with records in one of the following forms:
    January 1 1988 10:45 pm
    1/1/1988 10:45 pm
    1/1/88 22:45:00
and extract the month, day, and year components using the "getdate" routine. There are restrictions on both the 2-digit years and 4-digit years because at the system level, time is computed in seconds from the year 1970 and stored in an integer variable. For example, on Copland, under Solaris 2.7, this restricts 2-digit years to the range 1970 to 2037 and 4-digit years to 1902 to 2037. The exact limits on this range may vary from system to system (e.g., under Solaris 2.5, 4-digit years are restricted to the range 1970 to 2037). If your program processes 2-digit year dates and uses a system routine to process this data, you need to verify that the program is producing yearly data as expected.
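The 1902 to 2037 limits are consistent with a signed 32-bit count of seconds from the start of 1970. That the counter is 32 bits wide is an assumption here; the text above says only that an integer variable is used. The following C sketch, with hypothetical function names, uses plain calendar arithmetic and no system time routines to find the calendar year in which a given second count falls.

```c
#include <stdint.h>

/* Gregorian leap year rule. */
static int is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Calendar year containing the instant that lies `seconds` after
   (or, if negative, before) 00:00:00 on January 1, 1970. */
int year_of_seconds(int64_t seconds)
{
    int64_t days = seconds / 86400;
    int year = 1970;

    if (seconds % 86400 < 0)
        days--;                       /* round toward the past */
    while (days < 0) {                /* walk back one year at a time */
        year--;
        days += is_leap(year) ? 366 : 365;
    }
    while (days >= (is_leap(year) ? 366 : 365)) {
        days -= is_leap(year) ? 366 : 365;
        year++;
    }
    return year;
}
```

The largest 32-bit value, INT32_MAX, falls in 2038 and the smallest, INT32_MIN, falls in 1901, so the years that are covered completely run from 1902 to 2037.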
For example, consider the program, test5.c. This program uses the "getdate" routine as described above. Try some test data including 2-digit years inside and outside the range of 1970 to 2037 as input to program test5.
For additional information on the Year 2000 problem, see the Year 2000 Links page.
YEAR 2000 READINESS DISCLOSURE
University of Delaware Year 2000
Compliance Home Page
URL of this document:
http://www.udel.edu/topics/software/general/y2k/y2ksoftware.html
Last modified: 3/3/1999
Copyright
1998, University of Delaware.