[bitc-dev] New closure conversion
Jonathan S. Shapiro
shap at eros-os.org
Sun Jun 25 12:35:20 EDT 2006
On Sun, 2006-06-25 at 00:57 -0400, Swaroop Sridhar wrote:
> Also, for all letrec variables, did you mean that the identifier being
> defined must be heapified, or that it must always live in the closure?
I meant heapified, but I had it wrong, because that will not help. We
*do* need CLOSEALL, but we do not need to heapify letrec-bound
identifiers unless they are closed, and then only if the type requires
conversion. Let me back up and explain the rest of what is going on in
my head. I'm coming to think that we need to introduce a closure type
after all.
Let me talk about a couple of issues in sequence. All of this will be
obvious to anyone who has done this type of thing before, but I have
not.
THE LETREC ISSUE
Let me deal with the LETREC recursion problem first, ignoring the
question of whether we need to closure convert everything. First, note
that the letrec initializer problem only arises for letrec-bound
variables that are actually captured. Currently, all of these are of
(fn...) type. The problem is that we need the closure record allocated
early so that the lambda can close over its own closure record. Here is
the general idea of how to solve the letrec problem:
(letrec ((f (lambda (x:t1 y: t2) returning e:t3)))
body)
becomes (during closure conversion)
(let ((f (allocate-closure cl-struct-type fn-type)))
(set-closure f (cl-struct-type construction)
(lambda (clarg:cl-struct-type x:t1 y:t2)
..body returning e:t3)))
where the type of ALLOCATE_CLOSURE is the type of the original lambda
type, and where fn-type is the original lambda type modified by
injecting the closure argument at position 0.
Basically, the MAKE-CLOSURE internal form that I introduced wasn't the
right thing, and (if we were staying with this design) it would need to
be replaced by ALLOCATE_CLOSURE and SET-CLOSURE.
Note, however, that this doesn't solve the whole problem, and we have
some more work to do here. I will come back to this in my next note.
THE SPARC/MIPS/AMD64 ISSUE
Unfortunately, using this approach isn't going to work. The problem is
that injecting the closure pointer by appending it to the stack does not
work on architectures that use a register-based calling convention.
Actually, I can generally see ways to make it work on all of these, but
not if I have to generate C code for it.
Therefore, a week ago, I concluded that we need to closure convert
everything because of these architectures. For procedures that do not
actually take a closure, we either pass 0 or a dummy argument. In a more
sophisticated compiler, a lot of this could be optimized back out, but
that would require some non-trivial analysis and introduction of some
more types, and I don't think that is a good use of our time right now.
So I think that the right thing to do is to introduce procedure objects.
The real question then becomes: do we need a new type for this? I think
that the answer (regrettably) is "yes", mainly because we are calling
RandT after most passes. Unless we introduce a new type, I don't think
that we are going to be able to check the result closure conversion
adequately.
So: what I was going to propose here is that *before* the closure
conversion pass, everything is typed as ty_fn, but *after* the closure
conversion pass the situation is that:
- The rewritten thunks (the lambdas that now have closure arguments)
are typed as ty_fn. ALL of these are function labels, *not* function
pointers.
- All functions that were introduced by the user and all fields and
variables of function type, are now typed as ty_closure. That is,
wherever the original type was (fn ('a ... 'b) 'c) the new type is
now (closure T (fn (T 'a ... 'b) 'c)).
- Before closure conversion, APPLY forms took the form
(apply fn arg0 .. argN)
- After closure conversion, these take the form:
(cl-apply closure arg0 ... argN)
MULTIFUNCTION CLOSURES
However, if we are going to go to the trouble of introducing closure
types, I would like to take the step of introducing closure records that
might take more than one function. Let me introduce a piece of notation
first. If T is the type (fn ('a ... 'b) 'c) then (cl-rewrite CE T) is
the type (fn (CE 'a ... 'b) 'c), with the intended interpretation that
CE is the type of the closure environment.
With this bit of notation, I can now describe what I think we should do:
1. The type constructor
(__closure fn-type0 ... fn-typeN)
introduces a reference type. It describes a record whose element types
are, respectively
( T (cl-rewrite T fn-type0) ... (cl-rewrite T fn-typeN)) )
Where T is an unspecified reference type that does not escape the
closure record.
You can assign gensym field names if you like.
2. Value constructors for closure types are very weird, because they
take types as arguments:
__make_closure(T (fn (T 'a ... 'b) 'c) ... (fn (T 'd ... 'e) 'f))
allocates, BUT DOES NOT INITIALIZE the appropriate closure structure,
which is of type (__closure fn-type0 ... fn-typeN). Note that this is
only generated from within the compiler. There is no user-available
version of this form.
3. There is a new form SET-CLOSURE:
(set-closure! target:(__closure fn-type0 ... fn-typeN)
cl-arg:T
fn0-arg:(cl-rewrite fn-type0)
.. fnN-arg:(cl-rewrite fn-typeN))
Note that this form is a compiler-internal form, and that it DOES NOT
respect field mutability within the closure, though if it makes the
typing easier we could make the element types of the closure be mutable.
4. During closure conversion, expressions of the form:
(apply fn arg0 .. argN)
Get turned into:
(cl-apply closure 0 arg0 ... argN)
where the '0' is a literal index into the function list associated
with the closure record.
5. During closure conversion, all user-originated field, value, and
procedures of function type (fn ('a ... 'b) 'c) get rewritten into the
corresponding closure type (__closure (fn ('a ... 'b) 'c)).
And now note that my earlier comment about how to fix the letrec problem
for functions will continue to work here.
The reason that I want multiple function arguments here is so that we
can introduce DEFOBJECT in the future -- it's the same basic mechanism.
shap
More information about the bitc-dev
mailing list