Research - Rocket


The Rocket compiler is a highly-optimizing compiler retargetable for a broad class of ILP architectures. Rocket is implemented in C++. Current targets include three commercial superscalar architectures (the Alpha 21164 Microprocessor, the IBM RS6000 computer and a computer based on an Intel i860 chip) as well as several hypothetical LIW processors, and a family of hypothetical superscalar processors. Rocket was designed and initially implemented as a C compiler. However, due to the small emphasis placed on transformations within the parser, it is relatively easy to add an additional front-end for a different language. In fact this has been done, and Rocket currently includes both C and FORTRAN parsers.

To generate excellent code, an optimizing compiler normally uses several phases, typically representing the C program in different forms during the each phase. When compiling for ILP targets, all the phases necessary when compiling for traditional architectures are still required. In addition, we need phases for parallelization of intermediate code and for instruction scheduling.

During translation into highly-optimized code for ILP architectures, Rocket executes a number of distinct phases. First, parsing produces an abstract representation of the input C or FORTRAN program. Rocket uses the lcc compiler as a C parser, building an intermediate form from lcc's code generator. Rocket's current FORTRAN parser comes from the ParaScope programming environment.

Next, Global Analysis and Optimization is performed on the abstract intermediate representation built by the parser. To adequately do both traditional optimization and instruction scheduling, a compiler must perform extensive dataflow analysis of the intermediate form(s) the compiler generates. Rocket includes analysis that computes local and global dataflow, symbolic cover analysis based upon Reif and Tarjan's fast covers algorithm, live variable analysis required by register assignment, and data dependence analysis necessary in the parallelization process included in code selection.

In addition, to more accurately determine the program's actual data dependences, Rocket includes memory reference disambiguation for array and pointer references. This disambiguation method (based upon the analysis of diophantine linear equations) is well summarized in John Ellis' 1995 dissertation.

Rocket includes traditional optimizations described in any modern compiler book, including:

  • common subexpression elimination,
  • copy propagation, constant folding,
  • constant propagation,
  • algebraic simplification,
  • reduction in strength, and
  • dead code removal.

Further, Rocket actually makes use of two abstract intermediate forms. Code Selection replaces the abstract representations of an intermediate form with collections of machine operations. In compilers for traditional, non-ILP architectures, code selection can generate the final form of the compilation–either assembly language or binary code, depending upon the compiler. In an ILP compiler such as Rocket, however, we require more. The product of code selection in Rocket is a Data Dependence DAG (DDD) for each basic block of the control flow graph for the function being compiled.

Instruction Scheduling assigns machine operations to instructions to the satisfy data-dependence and machine-resource constraints of the DDD built by code selection. It is the need for such an instruction scheduling phase that distinguishes compilers for ILP architectures from those for more traditional computers.

Rocket uses graph-coloring register assignment. Rocket performs register assignment in one of three places during compilation,

  • immediately before instruction scheduling,
  • immediately after instruction scheduling, or
  • in an iterative manner within the compiler’s instruction scheduling phase.

In addition to "standard" graph coloring, Rocket supports register assignment in an environment with multiple register files, as in TI's C6000 series of DSP chips.

While Rocket proved to be an effective tool for research in register assignment and instruction scheduling (see link to publications), neither Rocket's parser or intermediate optimizations proved robust enough for experimentation in today's research community. To overcome these limitations, our next-generation ILP compiler, Arpeggio, will combine the Scale compiler's parsing and intermediate optimization and analysis with Rocket's register assignment, instruction scheduling and retargetability.