[bitc-dev] Type Inference in BitC
swaroop at cs.jhu.edu
Fri Oct 26 15:36:03 EDT 2007
Currently, the type inference algorithm implemented in BitC is a
heuristic one. Due to certain complexities introduced by the
co-existence of unboxed mutability and inferred polymorphism, the
inference algorithm does not infer principal types, but makes a best
effort inference through certain `hinting strategies.'
(for details see http://bitc-lang.org/docs/bitc/PLOS2006.html)
For example, in the expression (let ((p (lambda (x) x))) ... )
the possible sound types of `p' could either be
(1) the polymorphic type (forall ('a) (fn ('a) 'a)), or
(2) the monomorphic type (mutable (fn ('a) 'a)), but not
(3) (forall ('a) (mutable (fn ('a) 'a))).
Due to the lack of precise syntactic/type information, the inference
algorithm will (deterministically) guess the type of p to be either
(1) or (2) based on certain heuristics.
Recently, a new algorithm that infers principal types for BitC programs
was proposed. The idea is to infer types that are polymorphic
_over_ polymorphism and mutability. For example, in the case of the
above expression (let ((p (lambda (x) x))) ... )
The type inferred or p will be:
p: (forall ((copy-compat 'b (fn ('a) 'a))
(PM? k 'b 'b)) 'b)
The predicate (copy-compat 'b (fn ('a) 'a)) means that the type 'b is
constrained to be copy-compatible with (fn ('a) 'a). That is, in order
for this constraint to be satisfied, 'b can (only) unify with
(fn ('a) 'a) or (mutable (fn ('a) 'a)).
The predicate (PM? k 'b 'b) ensures that p can only be resolved to the
types (1) or (2) above, but not (3). The constraint (PM? k t1 t2)
maintains an exclusive-OR between polymorphism and mutability.
Here, k can range over two atoms MUT and IMM. In the body of the let,
when the variable p is used, the forall-elimination rule will give (the
instantiation of) p the type
'd \ (copy-compat 'd (fn ('c) ('c)) (PM? k 'b 'd)
If p is later mutated in the body, the unifier will infer the
constraints: (k = MUT) and ('b = 'd). These constraints will ensure that
two different usages of p cannot instantiate 'b differently (that is, p
cannot be polymorphic).
Similarly, if we accumulate two constraints (PM? k 'b t1) and
(PM? k 'b t2) such that t1 != t2, the unifier will infer the
constraints (k = IMM). If p is later used as a mutable type, we will
obtain the constraint (PM? IMM 'b (mutable t')). This will lead to a
type error since the constraint (PM? IMM t (mutable t')) cannot be
satisfied for any two types t and t'. Therefore, p is here polymorphic
but cannot be mutable.
Intuitively, (PM? k t t') can be thought of as a type-class constraint
that has one mutable instance (PM? MUT t (mutable t)), and an infinite
number of immutable instances (PM? IMM t t1) ... (PM? IMM t tn), where
t, t1, ... tn are not mutable. The actual algorithm requires some more
machinery to get the formalization aligned, but this is the basic idea.
While the new algorithm is complete, the inferred types are more
complicated than the original heuristic algorithm. There is a plan to
integrate this algorithm into the BitC compiler, along with the existing
one. It will be interesting to note how these two algorithms perform in
practice, from a usability point of view.
More information about the bitc-dev