[bitc-dev] Value types and the Value restriction

Swaroop Sridhar swaroop at cs.jhu.edu
Sun Aug 6 17:15:26 EDT 2006

Jonathan S. Shapiro wrote:
> It would be useful if you could send out a followup note that follows
> the following outline, so that everybody shares a coherent picture:
>   I. The Problem
>      Give an example of something that is screwed up under the
>      current type rules. Explain why it's a problem.

Here is the problem. Consider the following value type:

(defunion (optional 'b):val  none (some it:'b))

and the following definition:

(define p:(mutable bool, 'a) (#t, none))

According to the current scheme (Jacques Garrigue's relaxed value 
restriction for Ocaml), the type of p is:

p: (mutable bool, (optional 'a))
;; note that this is actually the type
;; (forall ('a) (mutable bool, optional 'a)))

However, now if we write:

(let ((r1:(optional int8) p.snd)
        (r2:(optional double) p.snd)) ... )

this will type-check fine, which is clearly not OK because r1 and r2 
have different sizes and this requires the polyinstantiation p.snd and 
thus its container p, which has other mutable components.

Thus, we need additional restrictions to ensure that no mutable location 
  is ever polyinstantiated.

>   II. The Proposal
>      Say what the plan/solution is. What are the new rules?

We will revise Garrigue's restriction to take value types into account. 
We will have a function V2 which takes a type and a mode parameter. The 
mode can be one of `remove' or `keep' depending on whether we come from 
a value type or a reference type context.

GEN(sigma) = FTVS(t) \ FTVS(gamma) \ V2(t, remove)

V2('a, keep) = {}
V2('a, remove) = {'a}
V2((mutable t), mode) =  FTVS(t)   ;; any mode
V2((fn (t1) t2), mode) =  FTVS(t1) U V2(t2, mode)
V2((ref-struct-union t1 ... tn), mode) =
                 V2(t1, keep) U V2(t2, keep) U ... V2(tn, keep)
V2((val-struct-union t1 ... tn)) =
                  V2(t1, remove) U V2(t2, remove) U ... V2(tn, remove)
V2((array t), mode) =  V2(t1, remove)
V2((vector t), mode) = V2(t1, keep)
V2((ref t), mode) = V2(t1, mode)
V2(_, mode) = {}

>   III. Consequences
>      Give some examples that will produce different answers
>      under the old and new rules, so that people can see them.

Let us examine the example we just saw:
(define p:(mutable bool, 'a) (#t, none))

In the OLD scheme, p would get the type:
p: (mutable bool, (optional 'a))
;; that is, (forall ('a) (mutable bool, optional 'a)))

In the NEW scheme, p gets the types:
p: (mutable bool, (optional #X123))
;; p first gets the type for-some-monomorphic 'a,
;;                         (mutable bool, optional 'a)
;; and the 'a is then force to a dummy type as we are at the top-level

Type-variables wrapped in a value type must be removed even if they lie
within an encompassing reference type.

(define p2:(mutable bool, 'b) (#t, (vector none)))

In the OLD scheme, p2 would get the type:
p2: (mutable bool, (vector (optional 'a)))

and we could then write:
(let ((r1:(optional int8) (vector-nth p2.snd 0))
        (r2:(optional double) (vector-nth p2.snd 0))) ... )
which is not acceptable.

In the NEW scheme, p2 gets the type:
p2: (mutable bool, (vector (optional #X345)))

and we are fine.

Type-variables wrapped in a reference type are OK even if they lie
within an encompassing value type. Recall that list is a reference type.

(define p3:(mutable bool, 'c) (#t, (nil, nil)))

In the OLD scheme, and the NEW scheme, p3 gets the type:
p3: (mutable bool, ((list 'a), (list 'b)))

In this sense, we will preserve the useful cases in the Garrigue 
restriction for reference types.

> Finally, can you either define what you meant by "contrapositive
> position" or perhaps use "contravariant position" (which is what I think
> you meant). 

Yes, I meant "contravariant position," sorry.

Note that contravariance has a weird odd/even inversion for
> function parameters, so if you mean contravatiant that's great, but you
> should check whether what you mean is "argument position". For example,
> are there also issues in covariant position?

I think this rule gets this right (as in the original Garrigue rule). In 
the first argument(contravariant) position of a function, we remove all 
type variables without looking further into what the type of the 
argument is:

V2((fn (t1) t2), mode) =  FTVS(t1) U V2(t2, mode)

And, if there is a function type in the in the return position, we will 
again remove any type variables in its (new function's) argument 
position in the next recursive call to V2. These are also contravariant 
positions, because they lie in a contravariant positions within a 
covariant position.

We can call them `argument positions' if you prefer. The intuition is 
that if there is a type-variable within a reference type that is 
unaffected by things that are passed inwards, all possible 
instantiations of this type-variable have the same representation and it 
can thus be safely given a polymorphic type.

This algorithm seems to do the right thing for literals.
(define p:(mutable bool, 'b) (#t, 25))

p: (mutable bool, #X234)
;; and not (forall ((IntLit 'a)) (mutable bool, 'a))

The literal is not given a polymorphic type because we start out with
the remove mode. If this is not done, inconsistencies similar to the
above example will arise.
(see: http://www.coyotos.org/pipermail/bitc-dev/2006-July/000722.html)

However it is customary that the literals should create extra problems.
If we have:
(define q:(mutable bool, 'b) (#t, (vector 25)))

the above algorithm will not do the right thing. The fix is to say that
the type of such literals is understood to be wrapped by a (Integer 'a)
constructor, and add the additional rules:

V2((Integer t), mode) = V2(t1, remove)
V2((Float t), mode) = V2(t1, remove)

This just expressively indicates the unboxed-ness of primitive
integer/floating point types.


More information about the bitc-dev mailing list