[bitc-dev] Libraries and separate compilation
Jonathan S. Shapiro
shap at eros-os.com
Thu Mar 27 15:23:25 EDT 2008
I have been thinking about separate compilation, trying to decide what
to do about it. The current compiler supports a fully static compile,
and it provides a form of "tree shaking" as a side effect of its compile
methodology. But that is a whole-program strategy, and it will not
really work for dynamic libraries.
BitC does not really support separate compilation. We can check a single
unit of compilation up to the point where we know that the rest of the
compile will proceed without error, but there are two problems with
this:

  1. Multiple units of compilation may instantiate the same procedure.
     If the linker does not support link-once semantics, and the whole
     program is not code generated at once, this can lead to code space
     bloat.

     It MAY be possible to handle this using gnu.linkonce sections in
     binutils ld, but support for link-once behavior seems to be
     incomplete even in that linker.

  2. At link time, we need to cross-check that incompatible typeclass
     instantiations are not present.

     Note that violations of this rule are the only errors that can
     arise at link time. All other errors are generated when the unit
     of compilation is compiled.
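
The link-time cross-check in item 2 might look roughly like the sketch below. It is only an illustration: the `Instance` record, its `origin` field, and the idea that each unit exports a flat list of instances are assumptions made for this example, not anything the BitC compiler currently emits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical record describing one typeclass instance in a unit."""
    typeclass: str   # e.g. "Ord"
    type_name: str   # e.g. "int32"
    origin: str      # assumed: identifies the definition the instance came from


class IncoherentInstances(Exception):
    """Raised when two units carry conflicting instances for the same pair."""


def check_instances(units):
    """Cross-check instances across units of compilation.

    `units` maps a unit name to the list of Instance records it exports.
    Returns a dict from (typeclass, type_name) to (origin, first unit seen).
    """
    seen = {}
    for unit_name, instances in units.items():
        for inst in instances:
            key = (inst.typeclass, inst.type_name)
            if key in seen and seen[key][0] != inst.origin:
                first_origin, first_unit = seen[key]
                raise IncoherentInstances(
                    f"{inst.typeclass} {inst.type_name}: {first_unit} uses "
                    f"{first_origin}, {unit_name} uses {inst.origin}"
                )
            # The same instance appearing in two units is fine; keep the first.
            seen.setdefault(key, (inst.origin, unit_name))
    return seen
```

Two units that both carry the same instance pass the check; two units that carry different definitions for the same (typeclass, type) pair are rejected, which is the only class of error that should remain at link time.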
For dynamic libraries, the problem is a bit more complicated. Any
polymorphic procedure or type that can (transitively) be instantiated
through the dynamic library interface may be instantiated by the user
with types that were not visible when the dynamic library was compiled.
In some cases we may actually know enough to do all possible
instantiations, but in others we can't.
There seem to be four possible resolutions to this:

  1. Include the ASTs (or equivalent) for that stuff in the
     dynamic library. Generate the code at dynamic load time
     or at run time.

     Issue: run-time subsystem complexity.

     Issue: LLVM is written in C++, so it would be difficult to ensure
            that the runtime is safe. On the other hand, there really
            doesn't appear to be any better alternative.

  2. Include the ASTs (or equivalent) in the static archive
     library that accompanies the dynamic library. Generate
     any missing code at static link time or (if the linker
     supports link-once semantics) at module compile time.

     Issue: This has an unintended consequence: any instance that
            is pre-instantiated by the library now becomes part of
            the link layer interface contract, because the runtime
            system cannot instantiate it at run time. If a future
            version of the dynamic library fails to instantiate that
            instance, dynamic-link-time symbol resolution will fail.

     Issue: This approach leads to problems of version drift. If the
            implementation in the library changes, the application
            binary will contain instantiations generated from a
            version of the library that differs from the current
            one. Depending on the rationale for the change, this can
            be a source of bugs.

     For this reason, I think that this is not a good approach.

  3. Use a compilation strategy in which all argument sizes are
     fully parameterized. This is likely to yield low performance.

  4. Compile down to a native-code "template" in which sizes and
     external references remain unresolved. Implement a low-level
     run-time instantiator for these templates.
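
To make option 3 concrete, here is a minimal sketch of a procedure whose element size is passed as an ordinary argument, so that one body serves every instantiation. The function name `swap_elements` and the use of a byte buffer to stand in for untyped storage are assumptions for illustration only; nothing here reflects actual BitC code generation.

```python
import struct


def swap_elements(buf: bytearray, i: int, j: int, elem_size: int) -> None:
    """Swap elements i and j of a packed array without knowing the type.

    The element size arrives at run time, so every access goes through
    computed offsets and a byte-wise copy instead of a fixed-width move.
    """
    a = i * elem_size
    b = j * elem_size
    tmp = bytes(buf[a:a + elem_size])
    buf[a:a + elem_size] = buf[b:b + elem_size]
    buf[b:b + elem_size] = tmp


# The same body works for 4-byte integers and 8-byte doubles.
ints = bytearray(struct.pack("<3i", 1, 2, 3))
swap_elements(ints, 0, 2, 4)

doubles = bytearray(struct.pack("<2d", 1.5, 2.5))
swap_elements(doubles, 0, 1, 8)
```

The extra offset arithmetic and size-dependent copying on every call is the kind of overhead that makes this strategy likely to perform poorly. Option 4 would presumably avoid it by fixing the size once, when the template is instantiated.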
My current inclination is to say that we should run all of the compiler
correctness checks when we compile each unit of compilation, but we
should then emit an "object file" that contains the resulting AST rather
than object code. We should then perform code generation at link time.
More precisely: we should emit object code at compile time wherever we know
that it will later be needed at link time, but we should also emit into
a separate section any ASTs that might require further expansion at
static or dynamic link time. We should then implement a bitc linker
that "thins" the emitted code (implementing link-once) before passing
the result on to the native linker.
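
A rough sketch of the "thinning" step, under the assumption that each object file can be viewed as a map from a mangled instantiation name to its generated code. The `thin` function and the symbol names in the example are hypothetical.

```python
def thin(objects):
    """Keep one copy of each instantiated symbol across object files.

    `objects` is a list of (unit_name, {symbol: code_bytes}) pairs in link
    order. Returns (merged, dropped): `merged` maps each symbol to the first
    copy encountered, and `dropped` lists the (unit_name, symbol) duplicates
    that were discarded.
    """
    merged = {}
    dropped = []
    for unit_name, symbols in objects:
        for symbol, code in symbols.items():
            if symbol in merged:
                # Link-once: a later copy of the same instantiation is dropped.
                dropped.append((unit_name, symbol))
            else:
                merged[symbol] = code
    return merged, dropped


# Two units that both instantiated the same procedure at int32.
objects = [
    ("a.o", {"swap#int32": b"\x01", "main": b"\x02"}),
    ("b.o", {"swap#int32": b"\x01", "sort#int32": b"\x03"}),
]
merged, dropped = thin(objects)
```

Only the merged set would be handed to the native linker, so the native linker itself would not need link-once support.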