The Safe Project - Compiler

Home – Publications – Compiler – Download – Examples – Try it

The Safe compiler

The certifying compiler for Safe is still under development. Although its front-end has been already implemented and may be considered as stable, its back-ends are still in early stages.

The currently implemented phases of the compiler's front-end are shown in this figure:

Compiler
phases

Brief description for each phase

Lexer / Preprocessor / Parser

It generates an abstract syntax tree (AST) representation from a source program, written in Full-Safe. The preprocessing phase expands the include directives with the corresponding files. This phase is implemented by using standard lexer and parser generating tools, like Alex and Happy.

Renamer / Contextual constraints checker

Several checks related to the scope of identifiers are done at this level. Variable identifers are also renamed in order to avoid name clashes in the subsequent compiler phases.

Hindley-Milner type and region inference

It performs a Hindley-Milner type inference, without regard to explicit destruction. At the same time, algebraic data types are also decorated with regions and region annotations are inserted into the program.

Related papers: WFLP'09

Core-Safe transformer

The decorated AST corresponding to the Full-Safe program is transformed into a Core-Safe representation, preserving type and region annotations.

Sharing analysis

For each subexpression, a set of variables that may point to a (non)recursive descendant of the result is computed. The AST is decorated with the necessary sharing information in order to perform the safe type inference.

Related papers: TFP'06

Destruction analysis

It checks whether the program accepts a valid safe typing and it infers such type. If a type is successfully inferred, then the program is guaranteed not to access dangling pointers at runtime.

Related papers: PPDP'08 LOPSTR'08

Space consumption analysis

It computes an upper bound to the heap and stack memory needs for each function definition. The result is given as a function on the input sizes. Unlike the rest of the compiler, which is mostly implemented in Haskell, this phase has been implemented with the help of the Maple computer algebra system. The Haskell front-end just flattens the expressions of the program into sets of sequences of basic expressions. Then, each of these sequences is translated into an intermediate representation that can be processed by Maple.

Related papers: FOPARA'09

SVM/JVM transformation

This phase performs the actual translation into Java^TM bytecode. It runs in two phases. Firstly, the function definitions are translated into instructions of an imperative abstract machine (SVM - Safe Virtual Machine). Then, these instructions are further translated into bytecode of the JVM.

The correctness of these two phases has been formally certified in Isabelle/HOL. The corresponding theories can be found here:

Related papers: WFLP'08 FMICS'09 TPHOL'09

Security certifier

It generates and Isabelle/HOL script that proves the absence of dangling pointers. In a Proof-carrying Code scenario, the code consumer would receive this proof with the object code in order to verify these safety properties.

Isabelle/HOL theory: Absence of dangling pointers

Related papers: IFM'10

TRS transformer

This phase transforms the translation of Core-Safe code into a Term Rewriting System, so that it can be analysed by some state-of-the-art tools (such as μ-term and AProVE) for proving termination properties of TRS.

Related papers: LOPSTR'08

Bounded memory certification

This phase generates first-order formulae in Tarski's theory of real closed fields, so that it can be verified by the QEPCAD system.

Related papers: FM'11