[bitc-dev] Retrospective Thoughts on BitC

Jonathan S. Shapiro shap at eros-os.org
Fri Mar 23 15:06:41 PDT 2012


By now it will be obvious to everyone that I have stopped work on BitC. An
explanation of *why* seems long overdue.

One answer is that work on Coyotos stopped when I joined Microsoft, and the
work that I am focused on *now* doesn't really require (or seem to benefit
from) BitC. As we all know, there is only so much time to go around in our
lives. But that alone wouldn't have stopped me entirely.

A second answer is that BitC isn't going to work in its current form. I had
hit a short list of issues that required a complete re-design of the
language and type system followed by a ground-up new
implementation. Experience with the first implementation suggested that
this would take quite a while, and it was simply more than I could afford
to take on without external support and funding. Programming language work
is not easy to fund.

But the third answer may be of greatest interest, which is that I no longer
believe that type classes "work" in their current form from the standpoint
of language design. That's the only important science lesson here.

In the large, there were four sticking points for the current design:

   1. The compilation model.
   2. The insufficiency of the current type system w.r.t. by-reference and
   reference types.
   3. The absence of some form of inheritance.
   4. The instance coherence problem.

The first two issues are in my opinion solvable, though the second
requires a nearly complete re-implementation of the compiler. The last
(instance coherence) does not appear to admit any general solution, and it
raises conceptual concerns in my mind about the use of type classes for
method overloading. It's sufficiently important that I'm going to deal
with the first three topics here and take up the last as a separate note.

Inheritance is something that people on the BitC list might (and sometimes
do) argue about strongly. So a few brief words on the subject may be
relevant.

*Prefacing Comments on Objects, Inheritance, and Purity*

BitC was initially designed as an [imperative] functional language because
of our focus on software verification. Specification of the typing and
semantics of functional languages is an area that has a *lot* of people
working on it. We (as a field) *kind* of know how to do it, and it was an
area where our group at Hopkins didn't know very much when we started.
Software verification is a known-hard problem, doing it over an imperative
language was already a challenge, and this didn't seem like a good place
for a group of novice language researchers to buck the current trends in
the field. Better, it seemed, to choose our battles. We knew that there
were interactions between inheritance and inference, and it *appeared* that
type classes with clever compilation could achieve much of the same
operational results. I therefore decided early *not* to include inheritance
in the language.

To me, as a programmer, the removal of inheritance and objects was a very
reluctant decision, because it sacrificed any possibility of transcoding
the large body of existing C++ code into a safer language. And as it turns
out, you can't really remove the underlying semantic challenges from a
successful systems language. A systems language *requires* some mechanism
for existential encapsulation. The *mechanism* which embodies that
encapsulation isn't really the issue; once you introduce that sort of
encapsulation, you bring into play most of the verification issues that
objects with subtyping bring into play, and once you do that, you might as
well gain the benefit of objects. The remaining issue, in essence, is the
modeling of the Self type, and for a range of reasons it's fairly essential
to have a Self type in a systems language once you introduce encapsulation.
So you end up pushed into an object type system at some point in *any* case.
With the benefit of eight years of hindsight, I can now say that this is
perfectly obvious!

I'm strongly of the opinion that multiple inheritance is a mess. The
argument pro or con about single inheritance still seems to me to be
largely a matter of religion. Inheritance and virtual methods certainly
aren't the *only* way to do encapsulation, and they may or may not be the
best primitive mechanism. I have always been more interested in getting a
large body of software into a safe, high-performance language than I am in
innovating in this area of language design. If transcoding current code is
any sort of goal, we need something very similar to inheritance.

The last reason we left objects out of BitC initially was purity. I wanted
to preserve a powerful, pure subset language - again to ease verification.
The object languages that I knew about at the time were heavily stateful,
and I couldn't envision how to do a non-imperative object-oriented
language. Actually, I'm *still* not sure I can see how to do that
practically for the kinds of applications that are of interest for BitC.
But as our faith in the value of verification declined, my personal
willingness to remain restricted by purity for the sake of verification
decayed quickly.

The *other* argument for a pure subset language has to do with advancing
concurrency, but as I really started to dig in to concurrency support in
BitC, I came increasingly to the view that this approach to concurrency
isn't a good match for the type of concurrent problems that people are
actually trying to solve, and that the needs and uses for non-mutable state
in practice are a lot more nuanced than the pure programming approach can
address. Pure subprograms clearly play an important role, but they aren't
enough.

And I *still* don't believe in monads. :-)


*Compilation Model*

One of the objectives for BitC was to obtain acceptable performance under a
conventional, static separate compilation scheme. It may be short-sighted
on my part, but complex optimizations at run-time make me very nervous from
the standpoint of robustness and assurance. I understand that bytecode
virtual machines today do very aggressive optimizations with considerable
success, but there are a number of concerns with this:

   - For a robust language, we want to *minimize* the size and complexity
   of the code that is exempted from type checking and [eventual]
   verification. Run-time code is exempted in this fashion. The garbage
   collector taken alone is already large enough to justify assurance
   concerns. Adding a large and complex optimizer to the pile drags the
   credibility of the assurance story down immeasurably.
   - Run-time optimization has very negative consequences for startup times
   - especially in the context of transaction processing. There is lots of
   hard data on this from IBM (in DB2) and others. It is one of the reasons
   that "Java in the database" never took hold. As the frequency of process
   and component instantiation in a system rises, startup delays become more
   and more of a concern. Robust systems don't recycle subsystems.
   - Run-time optimization adds a *huge* amount of space overhead to the
   run-time environment of the application. While the *code* of the
   run-time compiler can be shared, the *state* of the run-time compiler
   cannot, and there is quite a lot of that state.
   - Run-time optimization - especially when it is done "on demand" -
   introduces both variance and unpredictability into performance numbers. For
   some of the applications that are of interest to me, I need "steady state"
   performance. If the code is getting optimized on the fly such that it
   improves by even a modest constant factor, real-time scheduling starts to
   be a very puzzling challenge.
   - Code that is *produced* by a run-time optimizer is difficult to share
   across address spaces, though this probably isn't solved very well by
   *other* compilation models either.
   - If run-time optimization is present, applications will come to rely on
   it for performance. That is: for social reasons, "optional" run-time
   optimization tends to quickly become required.

To be clear, I'm *not* opposed to continuous compilation. I actually think
it's a good idea, and I think that there are some fairly compelling
use-cases. I *do* think that the run-time optimizer should be implemented
in a strongly typed, safe language. I *also* think that it took an awfully
long time for the HotSpot technology to stabilize, and that needs to be
taken as a cautionary tale. It's also likely that many of the
problems/concerns that I have enumerated can be solved - but probably not
*soon*. For the applications that are most important to me, the concerns
about assurance are primary. So from a language design standpoint, I'm
delighted to exploit continuous compilation, but I don't want to design a
language that *requires* continuous compilation in order to achieve
reasonable baseline performance.

The optimizer complexity issue, of course, can be raised just as seriously
for conventional compilers. You are going to optimize *somewhere*. But my
experience with dynamic translation tells me that it's a lot easier to do
(and to reason about) one thing at a time. Once we have a high-confidence
optimizer in a safe language, *then* it may make sense to talk about
integrating it into the run-time in a high-confidence system. Until then,
separation of concerns should be the watch-word of the day.

Now strictly speaking, it should be said that run-time compilation actually
isn't necessary for BitC, or for any other bytecode language. Run-time
compilation doesn't become necessary until you combine run-time loading
with compiler-abstracted representations (see below) and allow types having
abstracted representation to appear in the signatures of run-time loaded
libraries. Until then it is possible to maintain a proper phase separation
between code generation and execution. Read on - I'll explain some of that
below.

In any case, I knew going in that strongly abstracted types would raise
concerns on this issue, and I initially adopted the following view:

   - Things like kernels can be whole-program compiled. This effectively
   eliminates the run-time optimizer requirement.
   - Things like critical system components want to be statically linked
   anyway, so they can *also* be dealt with as whole-program compilation
   problems.
   - For everything else, I hoped to adopt a kind of "template expansion"
   approach to run-time compilation. This wouldn't undertake the full
   complexity of an optimizer; it would merely extend run-time linking and
   loading to incorporate span and offset resolution. It's still a lot of
   code, but it's not horribly *complex* code, and it's the kind of thing
   that lends itself to rigorous - or even formal - specification.

It took several years for me to realize that the template expansion idea
wasn't going to produce acceptable baseline performance. The problem lies
in the interaction between abstract types, operator overloading, and
inlining.

*Compiler-Abstracted Representations vs. Optimization*

Types have representations. This sometimes seems to make certain members of
the PL community a bit uncomfortable. A thing to be held at arm's length.
Very much like a zip-lock bag full of dog poo (insert cartoon here). From
the perspective of a systems person, I regret to report that where the bits
are placed, how big they are, and their assemblage actually *does* matter.
If you happen to be a dog owner, you'll note that the "bits as dog poo"
analogy is holding up well here. It seems to be the lot of us systems
people to wade daily through the plumbing of computational systems, so
perhaps that shouldn't be a surprise. Ahem.

In any case, the PL community set representation issues aside in order to
study type issues first. I don't think that pragmatics was forgotten, but I
think it's fair to say that representation issues are not a focus in
current, mainstream PL research. There is even a school of thought that
views representation as a fairly yucky matter that should be handled in the
compiler "by magic", and that imperative operations should be handled that
way too. For systems code that approach doesn't work, because a lot of the
representations and layouts we need to deal with are dictated to us by the
hardware.

In any case, types *do* have representations, and knowledge of those
representations is utterly essential for even the simplest compiler
optimizations. So we need to be a bit careful not to abstract types *too*
successfully, lest we manage to break the compilation model.


In C, the "+" operator is primitive, and the compiler can always select the
appropriate opcode directly. Similarly for other "core" arithmetic
operations. Now try a thought experiment: suppose we take every use of such
core operations in a program and replace each one with a functionally
equivalent procedure call to a runtime-implemented intrinsic. You only have
to do this for *user* operations - addition introduced by the compiler to
perform things like address arithmetic is always done on concrete types, so
those can still be generated efficiently. But even though it is only done
for user operations, this would clearly harm the performance of the program
quite a lot. You *can* recover that performance with a run-time optimizer,
but it's complicated.
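
To see the shape of the problem concretely, here is a minimal sketch in
Haskell (GHC's behavior; the names are mine), which exhibits the same
split: the polymorphic version is naively compiled as a call through an
implicit dictionary argument, while the monomorphic version can be
emitted as a single machine add.

  -- Polymorphic: "+" arrives as a field of a hidden Num dictionary,
  -- so each use is an indirect call unless the compiler specializes.
  sumPoly :: Num a => a -> a -> a
  sumPoly a b = a + b

  -- Monomorphic: the representation is known, so the opcode can be
  -- selected directly.
  sumInt :: Int -> Int -> Int
  sumInt a b = a + b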

In C++, the "+" operator can be overloaded. But (1) the bindings for
primitive types cannot be replaced, (2) we know, statically, what the
bindings and representations *are* for the other types, and (3) we can
control, by means of inlining, which of those operations entail a procedure
call at run time. I'm not trying to suggest that we want to be forced to
control that manually. The key point is that the compiler has enough
visibility into the implementation of the operation that it is possible to
inline the primitive operators (and many others) at static compile time.

Why is this possible in C++, but not in BitC?

In C++, the instantiation of an abstract type (a template) occurs in an
environment where complete knowledge of the representations involved is
visible to the compiler. That information may not all be in scope to the
programmer, but the compiler can chase across the scopes, find all of the
pieces, assemble them together, and understand their shapes. This is what
induces the "explicit instantiation" model of C++. It also causes a lot of
"internal" type declarations and implementation code to migrate into header
files, which tends to constrain the use of templates and increase the
number of header file lines processed for each compilation unit - we
measured this at one point on a very early (pre-templates) C++ product and
found that we processed more than 150 header lines for each "source" line.
The ratio has grown since then by at least a factor of ten, and (because of
templates) quite likely 20.

It's all rather a pain in the ass, but it's what makes static-compile-time
template expansion possible. From the *compiler* perspective, the types
involved (and more importantly, the representations) aren't abstracted at
all. In BitC, *both* of these things *are* abstracted at static compile
time. It isn't until link time that all of the representations are in hand.
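
GHC wrestles with the same trade-off and, when asked, resolves it the
C++ way. The pragmas below (a sketch continuing the earlier example)
expose the source of a polymorphic definition across module boundaries
so that call sites can be instantiated at concrete types - exactly the
"exposed source-level implementation" requirement that reappears in the
conclusion further down.

  -- INLINABLE records sumPoly's source in the module interface so
  -- other modules can instantiate it at concrete types; SPECIALIZE
  -- pre-generates an Int copy whose "+" is a plain machine add.
  {-# INLINABLE sumPoly #-}
  {-# SPECIALIZE sumPoly :: Int -> Int -> Int #-}
  sumPoly :: Num a => a -> a -> a
  sumPoly a b = a + b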

Now as I said above, we can imagine extending the linkage model to deal
with this. All of that header file information is supplied to deal with *
representation* issues, not type checking. Representation, in the end,
comes down to sizes, alignments, and offsets. Even if we don't know the
concrete values, we *do* know that all of those are compile-time constants,
and that the results we need to compute at compile time are entirely formed
by sums and multiples of these constants. We could imagine dealing with
these as *opaque* constants at static compile time, and filling in the
blanks at link time. Which is more or less what I had in mind by link-time
template expansion. Conceptually: leave all the offsets and sizes "blank",
and rely on the linker to fill them in, much in the way that it handles
relocation.
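
As a hypothetical sketch (in Haskell, with names of my own invention),
"leaving the blanks" amounts to making layout quantities symbolic
expressions over opaque constants that the linker later evaluates:

  import qualified Data.Map as M

  data Layout
    = Lit Int            -- known at static compile time
    | SizeOf String      -- opaque: size of an abstracted type
    | Add Layout Layout  -- offsets are sums...
    | Mul Layout Layout  -- ...and multiples of such constants

  -- At link time the representations are in hand, so the blanks can
  -- be filled in, much as relocation fills in addresses.
  resolve :: M.Map String Int -> Layout -> Maybe Int
  resolve _   (Lit n)    = Just n
  resolve env (SizeOf t) = M.lookup t env
  resolve env (Add a b)  = (+) <$> resolve env a <*> resolve env b
  resolve env (Mul a b)  = (*) <$> resolve env a <*> resolve env b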

The problem with this approach is that it removes key information that is
needed for optimization and registerization, and it doesn't support
inlining. In BitC, we can *and do* extend this kind of instantiation all
the way down to the primitive operators! And perhaps more importantly, to
primitive accessors and mutators. The reason is that we want to be able to
write expressions like "a + b" and say "that expression is well-typed
provided there is an appropriate resolution for +:('a,'a)->'a". Which is a
fine way to *type* the operation, but it leaves the representation of 'a
fully abstracted. Which means that we cannot see when they are primitive
types. Which means that we are *exactly* (or all too often, in any case)
left in the position of generating *all* user-originated "+" operations as
procedure calls. Now surprisingly, that's actually not the end of the
world. We can imagine inventing some form of "high-level assembler" that
our static code generator knows how to translate into machine code. If the
static code generator does this, the run-time loader can be handed
responsibility for emitting procedure calls, and can substitute intrinsic
calls at appropriate points. Which would cause us to lose code sharing, but
that might be tolerable on non-embedded targets.

Unfortunately, this kind of high-level assembler has some fairly nasty
implications for optimization. First, we no longer have any idea what the
*cost* of the "+" operator is for optimization purposes. We don't know how
many cycles that particular use of + will take, but more importantly, we
don't know how many bytes of code it will emit. And without that
information there is a very long list of optimization decisions that we can
no longer make at static compile time. Second, we no longer have enough
information at static code generation time to perform a long list of
*basic* register and storage optimizations, because we don't know which
procedure calls are actually going to use registers.

That creaking and groaning noise that you are hearing is the run-time code
generator gaining weight and losing reliability as it grows. While the
impact of this mechanism actually wouldn't be as bad as I am sketching -
because a lot of user types *aren't* abstract - the *complexity* of the
mechanism really is as bad as I am proposing. In effect we end up deferring
code generation and optimization to link time. That's an idea that goes
back (at least) to David Wall's work on link time register optimization in
the mid-1980s. It's been explored in many variants since then. It's a
compelling idea, but it has pros and cons.

What is going on here is that types in BitC are *too* successfully
abstracted for static compilation. The result is a rather *large* bag of
poo, so perhaps the PL people are on to something. :-)

*Two Solutions*

The most obvious solution - adopted by C++ - is to redesign the language so
that representation issues are not hidden from the compiler. That's
actually a solution that is worth considering. The problem in C++ isn't so
much the number of header file lines per source line as it is the fact that
the C preprocessor requires us to process those lines *de novo* for each
compilation unit. BitC lacks (intentionally) anything comparable to the C
preprocessor.

The other possibility is to shift to what might be labeled "install time
compilation". Ship some form of byte code, and do a static compilation at
install time. This gets you back all of the code sharing and optimization
that you might reasonably have expected from the classical compilation
approach, it opens up some interesting design point options from a systems
perspective, and (with care) it can be retrofitted to existing systems.
There are platforms today (notably cell phones) where we basically do this
already.

The design point that you don't want to cross here is dynamic *loading* where
the loaded interface carries a type with an abstracted representation. At
that point you are effectively committing yourself to run-time code
generation, though I do have some ideas on how to mitigate that.

*Conclusion Concerning Compilation Model*

*If static, separate compilation is a requirement, it becomes necessary for
the compiler to see into the source code across module boundaries whenever
an abstract type is used. That is: any procedure having abstract type must
have an exposed source-level implementation.*

*The practical alternative is a high-level intermediate form coupled with
install-time or run-time code generation. That is certainly feasible, but
it's more than I felt I could undertake.*


That's all manageable and doable. Unfortunately, it isn't the path we had
taken, so it basically meant starting over.


*Insufficiency of the Type System*

At a certain point we had enough of BitC working to start building library
code. It may not surprise you that the first thing we set out to do in the
library was IO. We found that we couldn't handle typed input within the
type system. Why not?

Even if you are prepared to do dynamic allocation within the IO library,
there is a level of abstraction at which you need to implement an operation
that amounts to "inputStream.read(someObject: ByRef mutable 'a)". There are
a couple of variations on this, but the point is that you want the ability
at some point to move the incoming bytes into previously allocated storage.
So far so good.
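
Haskell's low-level IO layer has an operation of exactly this shape -
hGetBuf fills storage the caller has already allocated - so a short
sketch may make the requirement concrete:

  import System.IO (Handle, hGetBuf)
  import Foreign.Marshal.Alloc (alloca)
  import Foreign.Storable (peek, sizeOf)

  -- Read one machine Int into preallocated storage; Nothing on a
  -- short read.
  readInt :: Handle -> IO (Maybe Int)
  readInt h =
    alloca $ \buf -> do                    -- caller-owned storage
      let size = sizeOf (0 :: Int)
      n <- hGetBuf h buf size              -- move bytes *into* it
      if n == size
        then Just <$> (peek buf :: IO Int)
        else return Nothing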

Unfortunately, in an effort to limit creeping featurism in the type system,
I had declared (unwisely, as it turned out) that the only place we needed
to deal with ByRef types was at parameters. Swaroop took this statement a
bit more literally than I intended. He noticed that if this is *really* the
only place where ByRef needs to be handled, then you can internally treat
"ByRef 'a" as 'a, merely keeping a marker on the parameter's identifier
record to indicate that an extra dereference is required at code generation
time. Which is actually quite clever, except that it doesn't extend well to
signature matching between type classes and their instances. Since the
argument type for *read* is *ByRef 'a*, InputStream is such a type class.

So now we were faced with a couple of issues. The first was that we needed
to make ByRef 'a a first-class type within the compiler so that we could
unify it, and the second was that we needed to deal with the implicit
coercion issues that this would entail. That is: conversion back and forth
between ByRef 'a and 'a at copy boundaries. The coercion part wasn't so
bad; ByRef is never inferred, and the type coercions associated with ByRef
happen in exactly the same places that const/mutable coercions happen. We
already had a cleanly isolated place in the type checker to deal with that.

But even if ByRef isn't inferred, it can propagate through the code by
unification. And *that* causes safety violations! The fact that ByRef was
syntactically restricted to appear only at parameters had the (intentional)
consequence of ensuring that safety restrictions associated with the
lifespan of references into the stack were honored - that was why I had
originally imposed the restriction that ByRef could appear only at
parameters. Once the ByRef type can unify, the syntactic restriction no
longer guarantees the enforcement of the lifespan restriction. To see why,
consider what happens in:

  define byrefID(x:ByRef 'a) { return x; }
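  // x escapes: the by-reference location is returned up the stack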

Something that is *supposed* to be a downward-only reference ends up
getting returned up the stack. Swaroop's solution was clever, in part,
because it silently prevented this propagation problem. In some sense, his
implementation doesn't really treat ByRef as a type, so it can't propagate.
But *because *he didn't treat it as a type, we also couldn't do the
necessary matching check between instances and type classes.

It turns out that being able to do this is *useful*. The essential
requirement of an abstract mutable "property" (in the C# sense) is that we
have the ability within the language to construct a function that returns
the *location* of the thing to be mutated. That location will often be on
the stack, so returning the location is *exactly* like the example above.
The "ByRef only at parameters" restriction is actually very conservative,
and we knew that it was preventing certain kinds of things that we
eventually wanted to do. We had a vague notion that we would come back and
fix that at a later time by introducing region types.
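
In Haskell terms (a sketch; IORef stands in for a BitC by-reference
location), a mutable property is precisely an accessor that returns the
location rather than a copy of the value. Haskell gets away without
regions here only because IORefs are heap-allocated:

  import Data.IORef

  -- The accessor xLoc returns the location itself.
  data Point = Point { xLoc :: IORef Int, yLoc :: IORef Int }

  demo :: IO Int
  demo = do
    p <- Point <$> newIORef 3 <*> newIORef 4
    writeIORef (xLoc p) 10    -- mutate through the returned location
    readIORef (xLoc p)        -- yields 10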

As it turned out, "later" had to be "now", because region types are the
right way to re-instate lifetime safety when ByRef types become first
class. But *adding* region types presented two problems (which is why we
had hoped to defer them):

   - Adding region types meant rewriting the type checker and re-verifying
   the soundness and completeness of the inference algorithm, *and*
   - It wasn't just a re-write. Regions introduce subtyping. Subtyping and
   polymorphism don't get along, so we would need to go back and do a lot of
   study.

Region polymorphism with region subtyping had certainly been done before,
but we were looking at subtyping in another case too (below). That was
pushing us toward a kinding system and a different type system.
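
Haskell's ST monad is the textbook version of that fix: the region is a
type parameter, and the rank-2 type of runST makes the analogue of
byrefID's escaping reference ill-typed. A minimal sketch:

  import Control.Monad.ST
  import Data.STRef

  -- Fine: the reference never leaves its region 's'.
  sumTwo :: Int -> Int -> Int
  sumTwo a b = runST $ do
    r <- newSTRef a
    modifySTRef r (+ b)
    readSTRef r

  -- Rejected by the type checker: 's' would escape runST's scope.
  -- escape = runST (newSTRef 0)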

So to fix the ByRef problem, we very nearly needed to re-design both the
type system and the compiler from scratch. Given the accumulation of cruft
in the compiler, that might have been a good thing in any case, but Swaroop
was now full-time at Microsoft, and I didn't have the time or the resources
to tackle this by myself.

*Conclusion Concerning the Type System*

*In retrospect, it's hard to imagine a strongly typed imperative language
that doesn't type locations in a first-class way. If the language
simultaneously supports explicit unboxing, it is effectively forced to deal
with location lifespan and escape issues, which makes memory region typing
of some form almost unavoidable.*

*For this reason alone, even if for no other, the type system of an
imperative language with unboxing must incorporate some form of subtyping.
To ensure termination, this places some constraints on the use of type
inference. On the bright side, once you introduce subtyping you are able to
do quite a number of useful things in the language that are hard to do
without it.*



*Inheritance and Encapsulation*

Our first run-in with inheritance actually showed up in the compiler
itself. In spite of our best efforts, the C++ implementation of the BitC
compiler had not entirely avoided inheritance, so it didn't have a direct
translation into BitC. And even if we changed the code of the compiler,
there are a large number of third-party libraries that we would like to be
able to transcode. A good many of those rely on [single] inheritance.
Without having at least some form of interface (type) inheritance, we can't
really even do a good job of interfacing to those libraries as foreign objects.

The compiler aside, we also needed a mechanism for encapsulation. I had
been playing with "capsules", but it soon became clear that capsules were
really a degenerate form of subclassing, and that trying to duck that issue
wasn't going to get me anywhere.

I could nearly imagine getting what I needed by adding "ThisType" and
inherited *interfaces*. But the combination of those two features
introduces subtyping. In fact, the combination is equivalent (from a type
system perspective) to single-inheritance subclassing.

And the more I stared at interfaces, the more I started to ask myself why
an interface wasn't just a type class. *That* brought me up against the
instance coherence problem from a new direction, which was already making
my head hurt. It also brought me to the realization that interfaces work,
in part, because they are always parameterized over a single type (the
ThisType) - once you know that one, the bindings for all of the others are
determined by type constructors or by explicit specification.

And introducing SelfType was an even bigger issue than introducing
subtypes. It means moving out of System F<: entirely, and into the object
type system of Cardelli *et al*. That wasn't just a matter of
re-implementing the type checker to support a variant of the type system we
already had. It meant re-formalizing the type system entirely, and learning
how to think in a different model.

Doable, but not within the framework or the compiler that we had
built. At this point, I decided that I needed to start over. We had learned
a lot from the various parts of the BitC effort, but sometimes you have to
take a step back before you can take more steps forward.


*Instance Coherence and Operator Overloading*

BitC largely borrows its type classes from Haskell. Type classes aren't
just a basis for type qualifiers; they provide the mechanism for *ad hoc*
polymorphism - a feature which, language purists notwithstanding, real
languages actually do need.

The problem is that there can be multiple type class instances for a given
type class at a given type. So it is possible to end up with a function
like:

define f(x : 'x) {
  ...
  a:int32 + b  // typing fully resolved at static compile time
  return x + x  // typing not resolvable until instantiation
}


Problem: we don't know which instance of "+" to use when 'x instantiates to
*int32*. In order for "+" to be meaningful in a+b, we need a
static-compile-time resolution for +:(int32, int32)->int32. And we get that
from Arith(int32). So far so good. But if 'x is instantiated to *int32*, we
will get a type class instance supplied by the caller. The problem is that
there is no way to guarantee that this is the *same* instance of
Arith(int32) that we saw before.
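
The hazard is easiest to see if the implicit dictionaries are made
explicit. In this Haskell sketch (hypothetical names; this is the usual
desugaring, not BitC's), nothing forces the caller's dictionary for Int
to agree with the one resolved statically elsewhere:

  data Arith a = Arith { plus :: a -> a -> a }

  arithInt :: Arith Int            -- the instance seen statically
  arithInt = Arith (+)

  arithWrap :: Arith Int           -- a second instance at Int
  arithWrap = Arith (\a b -> (a + b) `mod` 256)

  -- f uses whichever dictionary the caller supplies.
  f :: Arith a -> a -> a
  f dict x = plus dict x x

With these definitions, f arithInt 130 is 260 while f arithWrap 130 is 4:
the same source expression, two meanings, and no type error anywhere.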

The solution in Haskell is to impose the *ad hoc* rule that you can only
instantiate a type class once for each unique type tuple in a given
application. This is similar to what is done in C++: you can only have one
overload of a given global operator at a particular type. If there is more
than one overload at that type, you get a link-time failure. This
restriction is tolerable in C++ largely because operator overloading is so
limited:

   1. The set of overloadable operators is small and non-extensible.
   2. Most of them can be handled satisfactorily as methods, which makes
   their resolution unambiguous.
   3. Most of the ones that *can't* be handled as methods are arithmetic
   operations, and there are practical limits to how much people want to
   extend those.
   4. The remaining highly overloaded global operators are associated with
   I/O. These *could* be methods in a suitably polymorphic language.

In languages (like BitC) that enable richer use of operator overloading, it
seems unlikely that these properties would suffice.

But in Haskell and BitC, overloading is extended to *type properties* as
well. For example, there is a type class "Ord 'a", which states whether a
type 'a admits an ordering. Problem: most types that admit ordering admit
more than one! The fact that we know an ordering *exists* really isn't
enough to tell us which ordering to *use*. And we can't introduce *two*
orderings for 'a in Haskell or BitC without creating an instance
coherence problem. And in the end, the instance coherence problem exists
because the language design performs method resolution in what amounts
to a non-scoped way.
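
Haskell's standard library works around exactly this: since a second Ord
instance at the same type is forbidden, Data.Ord.Down wraps the type in
a newtype whose one blessed instance is the reversed ordering.

  import Data.List (sort, sortBy)
  import Data.Ord (Down (..), comparing)

  ascending, descending :: [Int]
  ascending  = sort [3, 1, 2]                    -- the one Ord Int
  descending = sortBy (comparing Down) [3, 1, 2] -- a second ordering,
                                                 -- via a newtype instance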

But if nothing else, you can hopefully see that the heavier use of
overloading in BitC and Haskell places much higher pressure on the "single
instance" rule. Enough so, in my opinion, to make that rule untenable. And
coming from the capability world, I have a strong allergy to things that
smell like ambient authority.

Now we can get past this issue, up to a point, by imposing an arbitrary
restriction on where (which compilation unit) an instance can legally be
defined. But as with the "excessively abstract types" issue, we seemed to
keep tripping on type class issues. There are other problems as well when
multi-variable type classes get into the picture.

At the end of the day, type classes just don't seem to work out very well
as a mechanism for overload resolution without some other form of support.

A second problem with type classes is that you can't resolve operators at
static compile time. And if instances are explicitly named, references to
instances have a way of turning into first-class values. At that point the
operator reference can no longer be statically resolved at all, and we have
effectively re-invented operator methods!
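
Carried to its conclusion (a sketch reusing the hypothetical Arith
record from above), a first-class instance can be stored next to the
data, and the result is indistinguishable from an object carrying a
method table:

  {-# LANGUAGE ExistentialQuantification #-}

  -- The "type class instance" has become a per-object method table.
  data Obj = forall a. Obj a (Arith a)

  double :: Obj -> Obj
  double (Obj x dict) = Obj (plus dict x x) dict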

*Conclusion about Type Classes and Overloading*:

*The type class notion (more precisely: qualified types) is seductive, but
absent a reasonable approach for instance coherence and lexical resolution
it provides an unsatisfactory basis for operator overloading. There is a
disturbingly close relationship between type class instances and object
instances that needs further exploration by the PL community. The important
distinction may be pragmatic rather than conceptual: type class instances
are compile-time constants while object instances are run-time values. This
has no major consequences for typing, but it leads to significant
differences w.r.t. naming, binding, and [human] conceptualization.*

*There are unresolved formal issues that remain with multi-parameter type
classes. Many of these appear to have natural practical solutions in a
polymorphic object type system, but concerns of implementation motivate
kinding distinctions between boxed and unboxed types that are fairly
unsatisfactory.*



*Wrapping Up*

The current outcome is *extremely* frustrating. While the blind spots here
were real, we were driven by the requirements of the academic research
community to spend nearly three years finding a way to do complete
inference over mutability. That was an enormous effort, and it delayed our
recognition that we were sitting on the wrong kind of underlying type
system entirely. While I continue to think that there is some value in
mutability inference, I think it's a shame that a fairly insignificant wart
in the original inference mechanism managed to prevent larger-scale success
in the overall project for what amount to *political* reasons. If not for
that distraction, I think we would probably have learned enough about the
I/O and the instance coherency issues to have moved to a different type
system while we still had a group to do it with, and we would have a
working and useful language today.

The distractions of academia aside, it is fair to ask why we weren't
building small "concept test" programs as a sanity check of our design.
There are a number of answers, none very satisfactory:

   - Research languages can adopt simplifications on primitive types
   (notably integers) that systems languages cannot. That's what pushed us
   into type classes in the first place. We knew that polymorphism over
   unboxed types hadn't seen a lot of attention in the literature, and we
   knew that mutability inference had never been done. We had limited
   manpower, so we chose to focus on those issues first.
   - We knew that parametric polymorphism and subtyping didn't get along,
   so we wanted to avoid that combination. Unfortunately, we avoided subtypes
   too well for too long, and they turned out to be unavoidable.
   - For the first several years, we were very concerned with software
   verification, which *also* drove us strongly away from object-based
   languages and subtyping. That blinded us.
   - Coming to language design as "systems" people, working in a department
   that lacked deep expertise and interest in type systems, there was an
   enormous amount of subject matter that we needed to learn. Some of the
   reasons for our failure are "obvious" to people in the PL community, but
   others are not. Our desire for a "systems" language drove us to explore the
   space in a different way and with different priorities than are common in
   the PL community.

I think we did make some interesting contributions. We now know how to do
(that is: to implement) polymorphism over unboxed types with significant
code sharing, and we understand how to deal with inferred mutability. Both
of those are going to be very useful down the road. We have also learned a
great deal about advanced type systems.

In any case, BitC in its current form clearly needs to be set aside and
re-worked. I have a fairly clear notion about how I would approach
*continuing* this work, but that's going to have to wait until someone is
willing to pay for all this.




Jonathan