[bitc-dev] Compilation models
Jonathan S. Shapiro
shap at eros-os.com
Tue Jul 8 16:03:06 CDT 2008
Now that bitc is getting active again, I am re-wrapping my head around
the compiler and re-acquiring issues. One of the major ones is what to
do about the run-time library.
There are two issues impacting this. The first concerns the scope of
typeclass methods. That one appears to have several answers, none of
which are perfect but several of which are probably acceptable in
The larger issue concerns separate compilation. It was our strong desire
to have separate compilation in BitC, and of course that is difficult in
the presence of abstract procedures, because it is impossible to know
all of the concrete instantiations that will be necessary without
looking at the whole program.
We currently do this incrementally at "link" time from the
polyinstantiator, which performs a demand-driven recursive instantiation
of everything that will be necessary to call bitc.main. We can similarly
do this for top-level interactive expressions in an interactive
implementation of BitC. The approach has the implicit effect that a
certain amount of "tree shaking" is performed -- procedures that aren't
actually called never have code generated for them.
All of this makes a bit of a mess for libraries. Basically, we end up
with two types of procedures coming out of a unit of compilation that is
part of a library:
1. Fully concrete procedures. These are procedures where all of the
concrete types are fully known, and if we are willing to live with
the performance of procedure-at-a-time (including inlining)
compilation, we can compile these directly to .o form eagerly.
One catch: a subset of these procedures will be specializations of
abstract procedures. These may end up getting generated
independently by multiple units of compilation, so if we generate
object code eagerly for these we need to be able to rely on some
form of "link once" behavior from the system linker. Or we can
use static emission at a higher code size cost.
2. Procedures that still have abstract types. Code generation for
these must be deferred to link time (either static or dynamic).
We can type check such procedures and run various simplifying
transforms on them, but we can't generate code for them eagerly.
If we are adopting a whole-program compilation model, none of this is a
problem, and we can specialize all of the remaining procedures at whole
program compile (a.k.a. link) time.
If we want to use dynamic linking facilities, however (e.g. for code
reuse), we need the ability to generate code at run time (more
precisely: at dynamic link time). It is desirable to develop this
capability anyway, since that capability can be used to support an
interactive BitC environment.
If we go the bytecode route, we may have a middle position, because we
can generate meta-procedures for each of the non-concrete functions.
This is actually an interesting position, because the only things we can
be abstracted over at the code level are the sizes of types and the
resolution of labels (all of which are either abstract literals or
unresolved procedure references).
Any of these three techniques -- or even a mix of them -- can be made to
work. The essential point is that we either need to do whole-program
compilation or accept some form of run-time code generation.
More information about the bitc-dev