[bitc-dev] Rust, GC, and language politics

Jonathan S. Shapiro shap at eros-os.org
Sat Jul 27 09:59:17 PDT 2013


So re-reading the "Removing Garbage Collection From the Rust Language"
<http://pcwalton.github.io/blog/2013/06/02/removing-garbage-collection-from-the-rust-language/>
blog entry, I don't see anything in the entry to suggest that there is an
intrinsic objection to GC. Let's go through the objections one at a time:

1. *Distinction between heap and stack is hard to grasp*. Umm. That's a
hell of an indictment of computer science programs, but not a criterion for
language design. I (obviously) agree that prescriptive stack allocation is
required, but that's completely orthogonal to the method of heap storage
management. This statement is useful context, but not directly on point.

2. *Sigils make the code unfamiliar before the concepts are learned*. I'm
not a Rust user, so I can't say from personal experience whether the sigils
would be an impediment to me. In a general sort of way, I'd say that sigils
(by which they seem to mean punctuation keywords) are harder to understand
than identifier-style keywords, and that it's hard to distinguish between
the complexity induced by an awkward presentation and the inherent
complexity of the pointer types. To make this concrete:

   let owned_box : ~Point       vs       let owned_box : owned Point

I am inclined to think, with no supporting evidence whatsoever, that simply
shifting to English keywords would make a world of difference.

But more importantly, shifting most of this burden to the compiler would
make an even bigger difference, and in most cases the compiler is where it
belongs.
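
For readers on a current Rust toolchain, here is a minimal sketch of the
same declaration with names instead of sigils. It assumes the later spelling
in which `~Point` is written `Box<Point>`, and it uses `Rc<Point>` as a
stand-in for `@Point` (reference counted rather than traced, so only a rough
analogue). The `Point` fields and the `make_owned`/`make_shared` helpers are
illustrative names, not taken from the blog entry.

```rust
use std::rc::Rc;

// Illustrative type; the field names are an assumption.
#[derive(Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// 2013 spelling: `let owned_box: ~Point = ~Point { x: 1.0, y: 2.0 };`
fn make_owned() -> Box<Point> {
    Box::new(Point { x: 1.0, y: 2.0 })
}

// 2013 spelling: `let managed_box: @Point = @Point { x: 1.0, y: 2.0 };`
// `Rc` is only a rough analogue of `@`: it counts references, it does not trace.
fn make_shared() -> Rc<Point> {
    Rc::new(Point { x: 1.0, y: 2.0 })
}

fn main() {
    let owned_box: Box<Point> = make_owned();
    let shared_box: Rc<Point> = make_shared();
    // A second handle to the same allocation; an owned box cannot be aliased this way.
    let alias = Rc::clone(&shared_box);
    println!("{:?} {:?} {:?}", owned_box, shared_box, alias);
}
```

Whether the named form is actually easier to learn is exactly the open
question above; the sketch only shows that the two spellings carry the same
information.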

3. *There are two heaps, not just one, so beginners are confused as to
which one to allocate into.* Well, yes. That really *is* confusing. And
it's compounded by the fact that (a) while you want a keyword for
prescription, discovery of unique pointers can be entirely automatic,
(b) Rust advocates the wrong default (the owned pointer), and (c) for most
purposes the distinction is completely unnecessary. The owned pointer is the
wrong default because we teach general-purpose programming, and the
semantics of an owned pointer aren't general purpose.

Ultimately, I think this reflects a core misunderstanding of what a
"systems language" *is*. I'll say more about that below.

4. *Programmers don’t know which to use, since some operations are
available with ~ and some operations are available with @*. Well hmm. It's
not clear to me what operations aren't supported on both pointers, unless
the missing operation is pointer escape. And this is a great example of why
defaulting to a non-general type is almost always a bad idea. It's also a
great example of why a language should *support* prescription without
*requiring* prescription.

I think that the value of borrowed (non-escaping) pointers in libraries is
underrated.  That said, I think that a lot of the problem is that they
introduced a half-assed region system and therefore were stuck trying to
explain a memory model that consists entirely of corner cases rather than
an integrative whole. Corner cases are much harder to explain.
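
To make the library point concrete, here is a minimal sketch in current
Rust syntax (an assumption: `Box` and `Rc` stand in for the `~` and `@`
pointers of the time, and `Point` and `norm_squared` are hypothetical names).
A function that takes a borrowed, non-escaping pointer does not care which
kind of box the caller chose.

```rust
use std::rc::Rc;

// Illustrative type; not from the blog entry.
struct Point {
    x: f64,
    y: f64,
}

// Library-style function: it only borrows `p` for the duration of the call.
// The signature gives it no way to stash the pointer anywhere that outlives
// the call, so the borrow cannot escape.
fn norm_squared(p: &Point) -> f64 {
    p.x * p.x + p.y * p.y
}

fn main() {
    let on_stack = Point { x: 3.0, y: 4.0 };
    let owned = Box::new(Point { x: 3.0, y: 4.0 });
    let shared = Rc::new(Point { x: 3.0, y: 4.0 });

    // The same function serves all three storage choices.
    println!("{}", norm_squared(&on_stack));
    println!("{}", norm_squared(&owned));
    println!("{}", norm_squared(&shared));
}
```

The caller decides how the value is stored; the library only states that it
will not keep the pointer.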

Within point 4, there is a key mis-statement:

   "The key insight that was missing is that *the owning pointer ~ is just
   the Rust equivalent of malloc and free*."


This statement is flatly wrong, because the two aren't equivalent at all.
Rust's owning pointer is a degenerate region allocation (degenerate because
they don't - or at least didn't - have a proper region system). That's not
the semantics of malloc/free at all.
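
One way to see the difference, sketched in current Rust syntax (an
assumption: `Box` stands in for `~`, and `Tracked` and `scoped_release` are
hypothetical names). There is no free call anywhere in the code below; each
box is released when the scope of its owner ends, in nesting order, which
looks more like leaving a region than like a malloc/free pair.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper type that records when it is released.
struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("free {}", self.name));
    }
}

// Returns the ordered list of events observed while two nested scopes exit.
fn scoped_release() -> Vec<String> {
    let log: Rc<RefCell<Vec<String>>> = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Box::new(Tracked { name: "outer", log: Rc::clone(&log) });
        {
            let _inner = Box::new(Tracked { name: "inner", log: Rc::clone(&log) });
            log.borrow_mut().push("leaving inner scope".to_string());
        } // `_inner` is released here, with no explicit call.
        log.borrow_mut().push("leaving outer scope".to_string());
    } // `_outer` is released here.
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in scoped_release() {
        println!("{}", event);
    }
}
```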



So with those four points examined, the thing to notice is that *not a
single statement is made here explaining why GC is undesirable*. *Why* should
owned pointers be "the go-to solution" in Rust? Especially when we know
that such pointers lead to false liveness! I'm having trouble finding the
reference at the moment, but there was a paper or a tech report by Andrew
Appel examining the impact of false liveness (pointers you are done with
but are still in scope) in precise collectors. From memory, I think the *
typical* false retention overhead was around 17%-20%, and in some cases
where event loop programs retain a top-level pointer it was effectively
100%. Owned pointers should be expected to have similar retention problems,
though at least without the consequent collector overheads.
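
A minimal sketch of that retention effect with an owned pointer, in current
Rust syntax (an assumption: `Box` stands in for `~`; `Buffer`, its live
counter, and `live_during_later_work` are hypothetical names). The buffer is
last used early in the function, but unless it is released explicitly it
stays allocated until the scope ends.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical buffer that maintains a count of live instances.
struct Buffer {
    live: Rc<Cell<usize>>,
    data: Vec<u8>,
}

impl Buffer {
    fn new(live: &Rc<Cell<usize>>) -> Box<Buffer> {
        live.set(live.get() + 1);
        Box::new(Buffer { live: Rc::clone(live), data: vec![0; 1024] })
    }
}

impl Drop for Buffer {
    fn drop(&mut self) {
        self.live.set(self.live.get() - 1);
    }
}

// Returns how many buffers are still allocated after the last use of `buf`.
fn live_during_later_work(release_early: bool) -> usize {
    let live: Rc<Cell<usize>> = Rc::new(Cell::new(0));
    let buf = Buffer::new(&live);
    let _checksum: u32 = buf.data.iter().map(|b| *b as u32).sum(); // last use of `buf`
    if release_early {
        drop(buf); // explicit early release
    }
    // Without the explicit drop, `buf` is still allocated at this point,
    // even though nothing below ever touches it again.
    live.get()
}

fn main() {
    println!("retained: {}", live_during_later_work(false));
    println!("released early: {}", live_during_later_work(true));
}
```

In an event-loop program the "later work" is the rest of the program, which
is the effectively-100% case described above.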

What isn't being said here is that they are being bitten in the ass by GC
performance. Given that they are committed to the LLVM back end, which has
*horrible* support for GC, and the GC implementation was done by a beginner
as a summer project, that isn't really all that surprising. The absence of
GC support in LLVM was the main reason I didn't adopt LLVM for BitC.