From Software to Hardware and Back

Paul Feautrier, Ecole Normale Supérieure de Lyon

Abstract

One of the techniques for the formal design of embedded systems is -- or should be -- Behavioral Synthesis. Among its many advantages, one can quote easier testing and more complete architecture exploration.

Behavioral Synthesis has many aspects in common with another field, Automatic Parallelization. The reason is that, since von Neuman, software is inherently sequential, while hardware, which belongs to the real world, is parallel. The aim in Automatic Parallelization is to transform a sequential program into an equivalent program, suitable for efficient execution on a parallel high performance system. A specification being given, the aim of Behavioral Synthesis is to generate a VLSI circuit which conforms to the specifications. Most often, specifications come in the form of high-level algorithmic languages, like Matlab, C or Fortran. Hence, the outline of a BS system is: - find parallelism in the specification; - while in AP this paralellism is expressed in a form suitable for execution on a parallel computer (Open MP, MPI, fork-join), here it has to be expressed in a form suitable for synthesis (multiple combinatorial circuits, registers, control automata).

It is striking to notice that the two fields share many concepts, sometime under different names. Ressources, schedules and allocations are common objects, but sometime dependences become races or hazards. If we compare the state of the art in the two fields, one observe that Behavioral Synthesis lags behind Automatic Parallelization in the handling of loops and arrays. This is probably due to technological restrictions. It has not be possible to implement large amounts of memory on a chip until the advent of submicronics technologies.

Classical parallelization aimed only at detecting parallel loops, without changing much of the structure of the original program. It was soon noticed that the amount of parallelism that can be found in this way is limited, and that more aggressive methods are needed. Several authors simultaneously noticed, around 1990, that regular programs can be represented as geometric objects (simple set of points in n-space), and that most methods for finding parallelism are just changes of basis in this space. This new outlook allowed one to extend familiar techniques beyond basic blocks, and gave rise to powerful new methods for scheduling, allocation, memory management and code generation. These methods can be directly applied to Behavioral Synthesis, to replace such makeshift techniques as loop unrolling, loop fusion and loop splitting, strip-mining and loop coalescing. However, there are many unsolved problems, which would greatly benefit Behavioral Synthesis. One of them is scheduling under resource constraints, which is known to be already NP-complete for basic blocs. There are several heuristics which can be applied to loop scheduling, but the results are not satisfactory at present. Similarly, scheduling for a given amount of memory, or, conversely, finding the minimum amount of memory (or registers) to support a given schedule are important problems in synthesis.

Will a good system for Behavioral Synthesis influence software design and implementation? One may notice that general purpose processors are uniformly mediocre for all applications, while specific architectures like vector processors are highly efficient for restricted applications. With the advent of reconfigurable systems (e.g. FPGA) one may be tempted to redesign the architecture according to the needs of each application. The only problem with this idea is that synthesis, especially low level synthesis, is very slow compared to ordinary compilation. Hence, the idea will be limited to stable, high usage programs or to small pieces of big programs.