[bitc-dev] Polymorphism by polyinstantiation -- take 1
Swaroop Sridhar
swaroop at cs.jhu.edu
Thu Jul 27 20:29:28 EDT 2006
Currently, the compiler performs the following (major) passes:
1. Parse
2. Symbol Resolution
3. Type Inference
4. Location semantics Checking
5. Polyinstatiation
6. Closure Conversion and lambda hoisting
7. SSA transformation
8. Backend (currently C) emission
The passes 1-4 shall hereinafter be collectively termed as "frontend
passes." All passes upto the SSA pass (including intermediate
simplification passes) are Bitc AST -> Bitc AST transformations. Even
the output of the SSA pass is a mostly-bitc AST, with some extra
annotations. This has the advantage that we can re-run the
symbol-resolver and type-inference after each pass that does some AST
transformation, and be sure that the compiler did type-correct
transformations. This will also reduce the burden on developers to build
correct type records whenever an AST-change is introduced. This ability
to re-run the resolver and type-checker (hereinafter called the R&T
step) is also heavily used in the implementation of polyinstantiation.
This mail describes how I first implemented an algorithm for
polyinstantiation. In this note, I will handle only the
polyinstantiation of top-level definitions. For example, specializing:
(define (id x) x)
to (fn (bool) bool) and (fn (int32) int32) as:
(define id#fn_bool_bool:(fn (bool) bool)
(lambda (x) x))
(define id#fn_bool_bool:(fn (bool) bool)
(lambda (x) x))
The polyinstantiation of things defined locally using let/letrec is a
little more tricky (given the constraint that we want to always generate
valid Bitc-ASTs, be able to R&T it), and we will take this up as a
separate step.
Here is the algorithm:
1. After all the front-end passes have been run on all interfaces and
source modules (successfully), perform a `link' step. Here, build a new
source module (called big-AST) by appending all top-level forms from all
interfaces and source modules (in that order). While building this
big-AST, the names of all identifiers being defined is replaced with
their fully-qualified names (FQNs) to ensure non-collision.
2. R&T the big-AST. If there are any link-time errors, abort. Else, we
have a properly typed big-AST. We also have an _environment_ where all
the definitions in the entire program exist. This can be used in further
resolution / typing of new definitions introduced during
polyinstantiation. We have now also seen all (typeclass) instance in the
entire program. The instances are required to be globally
non-interfering wrt the entire program.
3. We maintain a `todo' list of things that we want to emit into the
backend. We start from this list and must emit these definitions as well
as those required by them -- transitively -- into the backend.
Initially, this list is something like bitc.main.main, any external
definitions and exceptions, etc. All of the things in initially in the
todo list must have concrete type. Call the procedure `Specialize' (see
below) on each item in the todo list (on its own type) till it becomes
empty.
[We start out by instantiating things in the todo list to its current
type. As we are processing these definitions, we will see specific
usages of polymorphic definitions or typeclass methods, and will
specialize them accordingly.]
[It is probably a good idea now to skip over to the algorithm of the
specializer and return to the main loop of the polyinstantiator after
reading that].
4. After all specializations is done, remove all _polymorphic_ ASTs that
are unreachable. (Polymorphic definitions cannot induce state
changes].
5. Rearrange the big-AST in dependency order so that the use of a type
or definition comes after its definition. Optionally, redundant
declarations may be removed at this step.
6. R&T the big-AST once again to ensure validity and we are done.
The algorithm for Specializer: Specializer takes 2 arguments a (possibly
polymorphic) definition, and a type to which desired instantiation is
desired. [Note that the definition that is being specialized is already
in the big-AST]
i) If the desired definition has been instantiated for the desired
type, just return.
ii) If specialization is requested on a declaration, and if we have a
definition, first specialize that, then emit a declaration based on
the name of the specialized definition.
iii) If the requested specialization is on a union constructor,
specialize the entire union.
iv) If the specialization is on a type-class method, look up the list of
available instances. get the instance and specialize that instead.
Note that this step handles the case where the instance of one
method is satisfied by another method as well.
v) Now that we have a definition to specialize, make a copy of the
definition's AST. Mangle its name based on the name plus the target
type desired (there is a unique mangle for every type).
vi) Clamp the type of this AST to the desired type by introducing a
qualifier.
ex, to instantiate id to (fn (bool) bool), we write:
(define id#FN_4bool_4bool:(fn (bool) bool) (lambda (x) x))
In cases of specializing a type, replace all type arguments with
concrete types.
vii) R&T this definition in the big-AST environment. This will build new
type records for the body of the definition which we can later use
for further instantiations.
viii) Recursively walk the body of this definition and add any other
definitions needed by this body to the todo list. Note that this
must add all definitions used regardless of whether this involves
specialization or not, because they may contain other concrete
uses of other polymorphic definitions that are not exposed by the
outer type.
ix) Finally after the entire AST is specialized again R&T it to make
sure we are fine and add it to the big-AST. After the definition is
processed, fix the use to of this definition (because of which we
started specialization) to refer to the new definition.
Swaroop.
More information about the bitc-dev
mailing list