From mboxrd@z Thu Jan 1 00:00:00 1970
From: mfb@mbunix.mitre.org (Michael F Brenner)
Subject: Effort to Port TSG to Ada-95 and How Fast it Ran in Gnat
Date: 1997/02/19
Message-ID: <5efjta$jlm@top.mitre.org>
Sender: Mike Brenner
Summary: It was easy to port and it ran faster.
Organization: The MITRE Corporation, Bedford Mass.
Keywords: Ada-95, gnat, port, conversion, optimization, program_error
Newsgroups: comp.lang.ada

This is a report on the conversion of a 50,000 line Ada-83 software application to Ada-95. Included are a list of the work that had to be done, the amount of time it took, and the timings of the optimizations.

The computer program, called tsg, manipulates a large number of integers and small character strings (24 characters at the most). Some of the source code is on the PAL, some of it has been posted to this group, and all of it is Free. It was originally written on a Janus compiler, but rapidly exceeded the maximum size of the symbol table and object code for Janus 16 bit. It works on the following Ada-83 compilers: Meridian (pc), Alsys (pc), Dec (vaxen), and Sun/Verdix (sparc). It was ported to Ada-95 (it now works on gnat) while remaining backwards compatible with Ada-83. Part of this compatibility is achieved by selecting different package bodies for the same package, which is also how tsg maintains compatibility amongst the Ada-83 compilers.

The work that had to be done was the following:

(1) A generic package had to be duplicated for unsigned (modular) numbers.

(2) The package body for the bit manipulation routines had to use PACKAGE INTERFACES. (The bit packing bugs in gnat will soon be, or already have been, fixed, depending on which release you have, so the old bit code will work again starting at that release of gnat.) There is nothing wrong with using package INTERFACES, except that it does not exist in Ada-83. During porting it would be preferable to do this as a second, optimizing pass rather than in the initial porting pass.

(3) In one place, a string had to be identified as type string using a qualified expression, because Ada-95 thought it MIGHT be a wide_string.

(4) Redundant subtype marks were needed around some expressions controlling a CASE statement, e.g.

    x: my_subtype;
    case my_subtype (x) is

(5) Certain packages and variable names had to be renamed to permit gnat chopping, because the gnat naming rule gave them the same name (for example, try gnatchopping the serial port interrupt handler that comes in the EXAMPLES directory of the Alsys compiler).

(6) Some variables and nested functions had to be renamed when they had the same names as the variables, functions, packages, and types in which they were nested.

(7) Operating system interface functions had to be downloaded from the Net because gnat os_lib does not have the same functions in dos and unix; get_first and get_next directory are examples. The command line access had to be written in assembly code because neither Ada-95 (for example ADA.COMMAND_LINE) nor the gnat libraries give you access to the dos or unix command line. Rather, they give you an array of arguments, which are not constructed the same way in dos and unix. Ada-95 does not come out of the box with a method of deriving the original command line from the args.

(8) The tiny bit of floating point computation (used for computing statistics at the very end) had to have the TRUNCATE function reprogrammed (actually simplified to RETURN FLOATES'TRUNCATION (X)).
This was because Ada-95 does rounding more simply than, and therefore differently from, Ada-83. Again this brings up the suggestion of passing packages to generic packages, so we can manage multiple (selectable) package bodies WITHIN the Ada 2005 language.

(9) A floating point assertion failed (an impossible error) in which a particular number was both > and <= another number at the same time. This only happened with gnat on the Sparc, not with the same code on a pc, nor with the same code on the Verdix/Sparc Ada-83 compiler. It required an extra epsilon comparison to be made instead of using the built-in <= operator.

(10) The packages had internal regression tests, run at elaboration time. Many of these called other packages. The Ada-83 compilers figured out a valid elaboration order, which was easy because there were no complications like recursion, looping, inputs from the user, generic instantiations with varying subtypes, or anything else which would require run-time determination of elaboration order. The following algorithm would work: if the regression test invoked at the bottom of a package calls a procedure in a different package, then elaborate that package body before the calling package body. However, Ada-95 permits a compiler (and gnat seems to take the opportunity) to pick an elaboration order that is consistent with the WITH order but inconsistent with the regression test calling order. GDB was used to identify each program_error, and those regression tests were made visible and called externally to the package. This solved the problem. The tests giving the program_error were the ones in packages elaborated between the visible part and the body of a different package. In some cases, we could still call the regression test in the package body by marking everything it called as either pragma pure, pragma preelaborate, or pragma elaborate_body. Pragma elaborate_all did not help, because it gave the message "cannot elaborate package X before itself".
The beautiful testing philosophy of always elaborating a regression test inside each package body is dead forever, because it is possible to write software whose elaboration order is extremely difficult or impossible to determine. The new philosophy has to be to invoke tests externally from the packages being tested.

(11) An extra seek had to be put into the stream_io open for append on the PC because it did not automatically seek to the end of the file, though it did overwrite the existing file rather than open a new file.

(12) Temporarily (because of an early bug in gnat, since fixed) a procedure had to be written to get and set the elements of a packed array of bytes. This temporary package is now gone, and ordinary array manipulation of packed arrays with more than 64K bytes now works.

(13) Major changes had to be made in the line of code that reopened an output file for input, to accommodate stream_io.

(14) Our methods of getting an immediate key did not work in Ada-95, and neither did get_immediate. We downloaded some assembly code that does the job from the Net. It is strongly recommended that the basic operations of getting key strokes (is_available and get_immediate) be semantically defined in Ada 2005 so that they work correctly whenever a keyboard is supported. Having a get_immediate that waits for the user to press a key is worse than not having it at all, because it hides from you the fact that you will need to program it yourself.

TIME IT TOOK TO PORT

All of the above was done on one Sunday (about 4 hours on my part and about 4 hours on the part of the person who downloaded the get_first directory code).

Because I knew you would ask: I counted the number of semicolons in the code, using the most abused unix sequence ever invented:

    cat *.ad? | grep -c \;

This gave me a value of 51743 semicolons (strictly speaking, grep -c counts the lines that contain a semicolon). To count the number of units (which equals the number of files under gnat), I used the command

    ls -1 *.ad? > \tmp\tmp.tmp; vi \tmp\tmp.tmp

This displayed 766 lines and 12541 characters for the listing, that is, 766 units.

41 units were changed (including 2 brand new packages which were 4 units). One of those 41 units, the one providing get_next_directory under unix, took half of the porting time. The unit providing get_next was 7 lines of visible part and 131 lines of body, and it works by generating a unix ls command, spawning it, then using text_io to read the directory listing back in, which is horrible, but it works. The other new package was 199 semicolons in a visible part and no body (although it instantiated a body consisting of 1872 semicolons).

636 lines of code were changed, many using global search and replace commands in a text editor. This was counted by adding up the total number of lines produced by running DIFF like this:

    diff -e x.adb x.bak

Actually more than 636 lines were changed: some lines were changed several times, and some were changed and then changed back after learning Ada-95 and gnat better. However, 636 insertions and deletions would be needed to take the old baseline into the new baseline.

The code was originally an object oriented (bottom-up) design with one package per data structure, which made the porting go much faster than it would for, say, a traditional, monolithic top-down program. Other things that saved time in the porting: Ada-95 is very close to being a superset of Ada-83; the code used no pointers and almost no unchecked_conversions; and the machine dependencies were strongly isolated (keyboard, command line, append files). I think the most important thing that made the port go fast was that it had already been ported to several Ada-83 compilers, so it was VERY compiler independent.

SPEED OF EXECUTION

The following charts show the execution times on Unix Sparc Ada-83, Unix Sparc gnat 3.08, DOS Alsys Ada-83, and DOS gnat 3.07.
All of these compilers have recently been upgraded to later versions, so results should improve if you repeat these timings today. The 90 MHz Pentium had 64 megabytes RAM. The 75 MHz Sparc-20 had [figure missing in the original post].

The program runs in two modes called regular and speedy. Regular mode has the inner simulation loop coded in an easily readable, easily testable manner. Speedy mode is hand optimized by lifting certain operations out of the inner loops; a compiler would need domain knowledge to remove them itself.

In addition, the internal heap storage had a debugging_heap flag. When this was turned on, each reference to a data object was first looked up on the buddy list and then, once found, looked up linearly across the entire heap to verify that the buddy list software was working. This made the program take a lot longer.

Multiple runs on both the Pentium and the Sparc varied by up to 2 seconds.

THE 75 MHz SPARC 20

                                            REGULAR        SPEEDY
a. With debugging heap flag turned on.
   gnatmake -f -g -gnato                    5 min 32 sec   3 min 32 sec
b. With debugging flag turned off.
   gnatmake -f -g -gnato                    4 min 18 sec   1 min 13 sec
c. gnatmake -f -g -gnato -O1                2 min 24 sec   0 min 37 sec
d. gnatmake -f -g -gnato -O2                2 min 07 sec   0 min 34 sec
e. gnatmake -f -g -gnato -O3                1 min 58 sec   0 min 33 sec
f. gnatmake -f -g -O3                       1 min 56 sec   0 min 31 sec
g. gnatmake -f -O3                          1 min 57 sec   0 min 31 sec
h. gnatmake -f -O3 -gnatn                   1 min 54 sec   0 min 33 sec
i. Sparc Ada-83                             1 min 44 sec   0 min 51 sec

THE 90 MHz PENTIUM

                                            REGULAR        SPEEDY
a. With debugging heap flag turned on.
   gnatmake -f -g -gnato                    2 min 54 sec   2 min 01 sec
b. With debugging flag turned off.
   gnatmake -f -g -gnato                    2 min 43 sec   0 min 43 sec
c. gnatmake -f -g -gnato -O1                1 min 21 sec   0 min 23 sec
d. gnatmake -f -g -gnato -O2                1 min 07 sec   0 min 23 sec
e. gnatmake -f -g -gnato -O3                1 min 58 sec   0 min 22 sec
f. gnatmake -f -g -O3                       1 min 50 sec   0 min 23 sec
g. gnatmake -f -O3                          1 min 57 sec   0 min 21 sec
h. gnatmake -f -O3 -gnatn                   1 min 54 sec   0 min 21 sec
i. Alsys-83                                 1 min 18 sec   0 min 56 sec

CONCLUSIONS:

1. Before porting an Ada-83 application to Ada-95, you might wish to port it to one or two other Ada-83 compilers to get rid of the compiler dependencies. That way you can separate the two sources of effort: (a) eliminating compiler dependencies, and (b) porting from compiler independent Ada-83 to compiler independent Ada-95.

2. When you see program_error, immediately suspect that something you are now calling from the bottom of a package at elaboration time has to be called from outside the package instead.

3. Wherever possible put pragma pure, pragma preelaborate, or pragma elaborate_body in your visible parts. Where needed, put pragma elaborate in your package bodies.

4. Before even running a test, convert all roundings and truncations to the Ada-95 rounding and truncation attributes.

5. Although there is a very large space advantage in not using the -g option on your compile and bind, the run time penalty for using it appears insignificant.

6. Optimization levels O2, O3, and gnatn may not save as much execution time as level O1 did.

7. Although the documentation states that overflow checking (the -gnato option) generates a lot of code, in this example there was not much of an execution time penalty, so for safety you can probably always compile with both -g and -gnato.

8. When gnatmake says you are up to date and you know you just changed the source code (possibly to an earlier dated source code, in order to switch back ends), just delete the appropriate ALI files and reissue the gnatmake.

9. Learn the b, run, and bt commands of gdb and use them extensively. Gnat out of the box without gdb does not print stack traces when you get an unhandled exception, such as a constraint_error.

10. When you get messages like "variable has not been initialized", take them with a grain of salt, because gnat does not always see initializations across generics, separately compiled packages, and some other forms of aliasing.
However, investigate those occurrences, as well as all other warning messages. When the warning is accurate, fix the problem.

Happy porting.
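
As a postscript to item (10) and conclusion 2, here is a minimal sketch of what "invoke tests externally" might look like. It is not taken from tsg: the package Counters, its operations, the exception Self_Test_Failed, and the procedure Run_Tests are all hypothetical names made up for illustration. The package is nested inside the main procedure only so that the sketch fits in one file; in a real program it would be a library-level package with its own spec and body.

```ada
with Ada.Text_IO;

procedure Run_Tests is

   --  Hypothetical stand-in for one data-structure package.
   --  In a real program this would be a library-level unit.
   package Counters is
      type Counter is private;

      procedure Increment (C : in out Counter);
      function Value (C : Counter) return Natural;

      --  The regression test is exported so that it can be called
      --  from outside, instead of running when the body elaborates.
      Self_Test_Failed : exception;
      procedure Self_Test;
   private
      type Counter is record
         Count : Natural := 0;
      end record;
   end Counters;

   package body Counters is

      procedure Increment (C : in out Counter) is
      begin
         C.Count := C.Count + 1;
      end Increment;

      function Value (C : Counter) return Natural is
      begin
         return C.Count;
      end Value;

      procedure Self_Test is
         C : Counter;
      begin
         Increment (C);
         Increment (C);
         if Value (C) /= 2 then
            raise Self_Test_Failed;
         end if;
      end Self_Test;

      --  Deliberately no "begin Self_Test;" part here, so nothing
      --  is called while this body is being elaborated.
   end Counters;

begin
   --  By the time the main subprogram runs, every unit it depends
   --  on has been elaborated, so the tests can be called in any order.
   Counters.Self_Test;
   Ada.Text_IO.Put_Line ("Counters self-test passed");
end Run_Tests;
```

Saved as run_tests.adb, this should build with "gnatmake run_tests". The point of the design is only that the test is an ordinary exported procedure: the elaboration order chosen by the binder no longer matters to it, at the cost of having to remember to call each test from a driver.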