[bitc-dev] box/unbox take two
Jonathan S. Shapiro
shap at eros-os.org
Tue Feb 15 19:29:12 EST 2005
In discussions over the past two days, Scott has expressed some
discomfort with how box/unbox work. I have come to share some of this
discomfort. In particular, there is a sense in which the relationship
between BOX (the type constructor) and BOX (the value constructor) is
somehow uncomfortably different than the relationship between TUPLE (the
type constructor) and TUPLE (the value constructor).
Part of the confusion is simply that BOX/UNBOX were lousy names -- we
keep shifting between adjective and verb and noun. Part of the confusion
centers on reference types and their relationship to boxing, and part of
the confusion derives from the fact that (box (box x)) is legal while
(unbox (unbox x)) is not.
Given which, I would like to try to re-state this part of the type
system in a different way, and see if the restatement is somehow more
comfortable, and then see if we have merely done a term rotation or
there is some significant difference that has gotten introduced.
In the restated type and value system, the keywords are:
REF STACK-REF DEREF and DUP
Interpretation as Type Constructors:
REF is a type constructor that may only be applied to
a value type, and constructs the corresponding reference
STACK-REF is a type constructor that is only legal when
it appears in a parameter type specification (equivalently:
in the type of a let-bound value). It can only be applied
to a value type and constructs the corresponding reference
type. Values of stack-ref type may not escape and may not
be closed over.
DEREF and DUP are not type constructors. DEREF is not needed
for reasons that will become apparent below.
Interpretation as value constructors:
Given an unboxed value, STACK-REF returns a restricted pointer to
that value. The nature of the restriction is that the (a) restricted
pointer may not be closed over by an escaping lambda, and (b) any
procedure accepting a restricted pointer as an argument type may not
be implemented as a tail recursive call.
This can be further optimized, but I want a simple starting
REF is not a value constructor (at all).
Given either a stack-ref or a ref, DEREF returns the
object pointed to. DEREF may only be applied to a value of
Given a value of value type, DUP returns a reference to a
duplicate of the value in the heap.
(deref (ref X)) === X ; identically-equals
(deref (stack-ref X)) === X
If we reframe the type system and value constructors in this way, then
we also need to look at composite types. For example, given:
(defstruct S a:int32 b:char)
S is a value type (which we have previously called an unboxed type), but
the corresponding value constructor S has type
S: (tuple int32 char) -> (ref S)
that is, it constructs its value in the heap. All I am doing here is
saying the S is the *unboxed* type, because it eliminates some confusion
about the behavior of DEREF. Note that because S is no longer a
reference type, we never need to state a type as being (DEREF s), which
is why DEREF no longer needs to be a type constructor.
The big change in all of this is that
(ref (ref x))
is illegal because ref can only be applied to a value type. This is not
as big a problem as it appears, because we can define
(defstruct (newbox 'a) ptr : 'a)
and then write
(newbox (ref int32)) : (ref (newbox (ref int32)))
this means that we can build deep pointer chains even though the
application of REF is restricted.
This all comes out with a slightly different flavor from our current
system, and it may be cleaner. I will be curious how people react.
More information about the bitc-dev