[bitc-dev] Locations vs Terms
Jonathan S. Shapiro
shap at eros-os.org
Tue Jul 25 02:59:55 EDT 2006
On Tue, 2006-07-25 at 01:53 -0400, Jonathan S. Shapiro wrote:
> The definition of "location" is simultaneously straightforward and
> complicated, and I will come back to that in a subsequent note.
Okay, time to expand on that.
Type classes for literals keep leading us to interesting sources of
confusion. Suppose we treat literals as "expansive" expressions, so that
they are forced to have a particular type. Then the following

  (let ((x 1))
    (+ x 0))
will not type check, because no particular type for "x" can be
determined. We need to insert a type qualifier somewhere.
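As a point of comparison (not part of the original discussion), Rust faces
the same question and answers it by defaulting rather than rejecting: an
unconstrained integer literal defaults to i32, and an annotation plays the
role of the "type qualifier" above. A minimal sketch; function names are
mine, not BitC's:

```rust
// An unconstrained literal: where expansive BitC would reject this as
// ambiguous, Rust's inference falls back to the i32 default.
fn add_default() -> i32 {
    let x = 1; // no constraint from use; defaulted to i32
    x + 0
}

// The explicit "type qualifier" resolves the ambiguity directly.
fn add_annotated() -> i16 {
    let x: i16 = 1;
    x + 0
}

fn main() {
    println!("{} {}", add_default(), add_annotated());
}
```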
Suppose we treat literals under the "compile time constant" rule used by
SML, so that they are polymorphic. Then we get the following surprising
result:

  (let ((x 1))
    (set! x 2:int32)
    x:int32) => 1
Why? "x" got assigned the provisional type (IntLit 'a) at the let
binding boundary. The instantiation at the SET! turned into

  (IntLit (mutable int32))

[which is okay] and the instantiation at the return value turned into

  (IntLit int32)

[which is okay], but notice that we got the wrong value back! What is
going on here is a (very) strange interaction between polymorphism and
mutation.
To avoid this, we originally made integer and floating point literals
expansive (so that a particular type for X must be chosen). This tended
to reduce polymorphism quite a lot, and we later switched to the
"literals are polymorphic" view -- the surprise above is mostly "solved"
by the resolver hack that checks for (set! ID EXPR) [and therefore
notices that X is mutated, and therefore imposes the value restriction].
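For contrast, here is what the value restriction buys you, sketched in Rust
(which sidesteps the issue entirely by having no let-polymorphism: a plain
binding is always one location with one type, so a mutation is necessarily
visible at every later read). The function name is mine:

```rust
// One mutable location, one type: the literal's type is pinned by the
// assignment, so the read observes the mutation -- no polymorphic
// re-instantiation can resurrect the initial value.
fn mutated() -> i32 {
    let mut x = 1; // provisional literal, constrained by the set below
    x = 2;         // analogous to (set! x 2:int32)
    x              // reads back 2, not 1
}

fn main() {
    println!("{}", mutated());
}
```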
In SML, initializers are restricted to "compile time constants". The SML
"compile time constant" rule amounts to: anything that is bound to the
result of an application must get a single type, and anything that is
mutable must get a single type. The pragmatic result is that only type
constructions can be polymorphic, and then only if they are pure
constants. SML imposed the "compile time constant" rule for two reasons:
1. It was simple, and
2. It handled most cases well in practice.
In BitC we would like to permit richer initializers. This is mainly
because the alternative requires us to make many more things mutable,
and we don't want to encourage mutability. I have a set of rules for
what initializers are legal that I should get out in the next day or so.
The short form is that they must be observably terminating logical
terms:

  "term": the initializer must not (transitively) perform mutation.

  "terminating": the initializer must not run forever. For the moment
  I am implementing this by prohibiting recursion.

  "observably": the definition of any function called in an
  initializer must have been lexically observed (through import)
  before the point of use.
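As a loose modern analogy (mine, not from this thread): Rust's `const fn`
carves out a similar class of initializers -- compile-time evaluable
expressions that may only call functions themselves marked `const`, i.e.
visibly admitted into the class at their definitions, echoing the
"observably" condition. Rust's rules differ in detail (local mutation and
loops are permitted), so this is a sketch of the flavor, not of the rule:

```rust
// A compile-time evaluable "term"; the name is hypothetical.
const fn two() -> i32 {
    1 + 1
}

// A legal static initializer: it calls only `const` functions.
static P: i32 = two();

fn main() {
    println!("{}", P);
}
```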
But this now raises a puzzle. In general, an observably terminating
logical term is a compile-time computable expression, and such an
expression should be polymorphic (unless somehow restricted). This means
that in the form
  (let ((p (+ 1 1)))
there really is no "location" for p, because p can take on several
possible types through polymorphism. It is as though we had written:
  (let ((p0 (+ 1:int32 1))
        (p1 (+ 1:int16 1)))
Speaking as a systems programmer, I find this behavior very surprising.
I understand why polyinstantiation is occurring here, but as a reader I
keep looking at the "p" and saying to myself "I declared one p and I got
two". That is: my intuitions as a C programmer are all oriented around
locations, and my understanding of the storage behavior is being
violated by the (otherwise desirable) polymorphism.
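The p0/p1 split above is essentially what Rust's monomorphization does for
generic functions, so a Rust sketch may make the behavior concrete (the
name `p` is carried over from the example; the rest is my framing):

```rust
use std::ops::Add;

// One generic "p", compiled into a separate instance per use site --
// the declared-one-p, got-two situation described above.
fn p<T: Add<Output = T> + From<u8>>() -> T {
    T::from(1u8) + T::from(1u8)
}

fn main() {
    let a: i32 = p(); // first instantiation ("p0")
    let b: i16 = p(); // second, distinct instantiation ("p1")
    println!("{} {}", a, b);
}
```

The difference, of course, is that in Rust `p` is syntactically a function,
so the reader expects multiple instances; the surprise here is that a plain
let-bound value behaves the same way.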
1. Do other people find this confusing?
2. Some versions of Haskell require that the "p" be explicitly decorated
by a type variable to indicate that it can be polymorphic. Otherwise
it is treated as monomorphic. Does this seem like a good thing? What
should the default be?
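For what it's worth on question 2, Rust later took exactly the explicit
route: a plain binding is always monomorphic, and polymorphism exists only
where a type parameter is declared. A sketch of that default (the function
name is hypothetical):

```rust
// Polymorphism must be requested with an explicit type parameter.
fn poly<T: From<u8>>() -> T {
    T::from(2u8)
}

fn main() {
    let m = 2;           // plain binding: monomorphic, defaulted to i32
    let a: i64 = poly(); // polymorphic only via the declared <T>
    let b: i32 = poly();
    println!("{} {} {}", m, a, b);
}
```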