An Experiment in Reverse Engineering Using Algebraic Specifications*

Nicky Williams Preston
Ecole Normale Superieure de Cachan

Poster Abstract

Much of the software currently in use at Electricite de France, though giving entirely satisfactory results, was written 15 or 20 years ago. However, the replacement of these existing systems by new software re-developed from scratch can be expensive, risky and time-consuming.

As a solution to this problem, we propose a methodological framework for reverse engineering Fortran source code using formal methods. The aim is a description of the software which is very precise but easier to understand than the source code itself and which brings out the aspects which are most important when studying potential re-use. In particular, we wish to discover any assumptions about the domain of application of particular pieces of source code.

We propose the construction of at least two algebraic specifications of the Fortran code. First of all, the low-level ``implementation specification'' is drafted. This must be sufficiently close to the source code to inspire complete confidence that it is a faithful representation. However, by virtue of its structure, by making explicit certain coding conventions, by representing convoluted control structures by equivalent ones which are easier to understand, the implementation specification should be easier to read and understand than the source code itself.

The ``requirements specification'' is then drafted from scratch to describe the code in a far more abstract way. It should be comprehensible to an electrical engineer who is not a software specialist.

The resulting formal description is precise and rigorous. Moreover, the implementation specification can be enriched to show how it corresponds to the requirements specification and it can then be formally proved that the implementation specification satisfies the requirements specification. This enrichment allows us to locate in the source code the implementation of any aspect of the requirements specification which may interest us. Any desired or suspected property of the specifications at any level can also be formally proved.

The construction of specifications at two levels which are then linked by a formal proof is in contrast to other approaches to reverse engineering, in which transformations are carried out on a formal low-level description in order to create the more abstract description. We use the algebraic specification formalism as a rigorous and expressive means to describe and prove what we have discovered in other ways about the code.

We propose to use software tools to restructure and analyze the source code in order to aid the construction of the implementation specification. However, we believe that some human input is essential in the reverse engineering of the trickier parts of Fortran code. It is in studying these parts of the code at least partially by hand that the reverse engineer acquires the understanding which enables him or her to draft a more abstract specification.

As an example, we take a Fortran procedure from a real program used in the control of the French very high voltage electrical power system. Successive transformations of the flowchart of this procedure are first carried out informally. These transformations help us understand that the procedure is in fact a highly specialized implementation of a classic software operation and thus to draft the formal specifications at two levels. Our approach succeeds in highlighting the condition which the application domain must satisfy in order for the procedure to be a correct implementation of the more general operation.

*This work was financed by Electricite de France through the Contrat d'etude R32/1K2924/ER292, No. OT: R32L02

[WCRE Poster Page]

This page generated automatically on Mon Sep 22 09:20:45 HST 1997