[bitc-dev] Retrospective Thoughts on BitC

David Jeske davidj at gmail.com
Mon Apr 9 17:54:20 PDT 2012


On Mon, Apr 9, 2012 at 4:11 PM, Jonathan S. Shapiro <shap at eros-os.org> wrote:

> On Mon, Apr 9, 2012 at 3:32 AM, David Jeske <davidj at gmail.com> wrote:
>
>> The reason I sought out BitC is that I'm looking for solutions not to
>> program-expression, but to software modularity....
>>
>
> The two are not entirely separable. One requirement for modularity is that
> the language does not enforce overspecialization through insufficient
> expressiveness.
>

Agreed.


> Some observations...
>>
>> 2) type-inference doesn't fix this... type inference systems are merely a
>> technique for software brevity
>>
>
> I don't entirely agree. Type inference is relevant because programmers are
> notoriously bad at specifying "most general" types.
>

True. Sather's (non-inference-based) notion that one can always implement a
type-compatible object without implementation inheritance (every type is an
interface) seems to go further than rigid implementation-inheritance class
models. However, it does seem we can go further still with structural
compatibility inference (as Google Go is attempting) rather than named-type
compatibility. I don't know whether they have a model for separate
compilation and upgradability, or whether one is possible.
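
As a rough illustration of the structural-versus-named distinction above,
here is a minimal sketch in Python rather than Go, using typing.Protocol as
a stand-in for structural compatibility. The names Drawable, NamedDrawable,
Circle and render are hypothetical and exist only for this example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    """Structural interface: anything with a draw() method qualifies."""

    def draw(self) -> str: ...


class NamedDrawable:
    """Nominal base class: compatibility requires explicit inheritance."""

    def draw(self) -> str:
        raise NotImplementedError


class Circle:
    # Circle neither inherits from Drawable nor mentions it anywhere.
    def draw(self) -> str:
        return "circle"


def render(item: Drawable) -> str:
    # Accepts any object whose shape matches the Drawable protocol.
    return item.draw()


structurally_ok = isinstance(Circle(), Drawable)     # matched by shape
nominally_ok = isinstance(Circle(), NamedDrawable)   # no declared relationship
```

Circle is accepted wherever a Drawable is expected although it was written
without knowledge of that interface. The named-type check fails because no
inheritance relationship was ever declared. The sketch says nothing about
separate compilation or upgradability, which is the open question.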


>> IMO, if we want to move past C-shared-libraries, the solution we need is a
>> model for type-safe high-performance loadable software components with
>> upgradability (manageable forward compatibility).
>>
>
> I'd certainly be very happy to see that, but I think we need to take one
> step at a time. We still don't know how to do modularity in the *absence* of
> versioning issues.
>

This I'm surprised at.

In the absence of versioning issues, don't most dynamic language runtimes
handle modularity? Java/JVM, C#/CIL, Python, Ruby, Javascript... all can
have interfaces specified (whether checked by a compiler or not) and load
compatible classes at runtime.

As far as I can see, C-shlibs and CIL both have (to some degree) some
capability to handle multi-versioned dependencies, because both of them
record the required dependency version in a way that is visible to the
runtime loader/linker. Further, they can both load multiple versions of the
same shlib/DLL. Compare this with Java, Python, Javascript, etc., which
don't have any record of such information to make available to the runtime,
and whose runtimes cannot keep more than one version of a module/class
loaded because of name clashes.
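
The name-clash point can be sketched for Python specifically. The example
below assumes nothing beyond the standard library: sys.modules is a
dictionary keyed by module name, so two versions registered under the same
name cannot coexist there. The library name hypothetical_codec and the
VERSION attribute are made up for illustration.

```python
import importlib
import sys
import types


def fake_module(name: str, version: str) -> types.ModuleType:
    """Build an in-memory module standing in for one version of a library."""
    module = types.ModuleType(name)
    module.VERSION = version
    return module


NAME = "hypothetical_codec"  # hypothetical library name, not a real package

v1 = fake_module(NAME, "1.0")
v2 = fake_module(NAME, "2.0")

# A client that wants version 1.0 registers and imports it.
sys.modules[NAME] = v1
seen_by_first_client = importlib.import_module(NAME).VERSION

# Registering version 2.0 under the same name evicts version 1.0.
sys.modules[NAME] = v2
seen_by_second_client = importlib.import_module(NAME).VERSION
seen_by_first_client_later = importlib.import_module(NAME).VERSION
```

After the second registration, every importer that asks for the name gets
version 2.0, including the client that originally wanted 1.0. Nothing in
the import statement records which version was required.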

Do we have a different definition of modularity? Ignoring the limits of
program expression, and simply thinking about how to get past the
C-warzone.. I'd be satisfied if I could have badly overspecified interfaces
between DLLs and actually load two modules without version typeclash. I
think C-shlibs and CIL both give me this. Do they not?


> There are two problems with this claim:  [regarding Microsoft Singularity
> 'tree shaking']
>
>    1. It disregards the fact that the two optimizations are orthogonal.
>    The ability to remove unreached code does not reduce the value of gathering
>    *reused* code in a common place.
>    2. The metric of interest isn't the size reduction in a single
>    program, but the total code footprint across the system as a whole (that
>    is: across *many* programs). The tree shaking approach results in a
>    system where *most* programs will duplicate a certain body of code
>    that is commonly used. That's the code that belongs in shared libraries.
>

I see and agree fully with your view here. I wasn't aware of this 'tree
shaking'. When I referenced Singularity I was thinking of the attempt to
provide isolation between modules via a mechanism which resembles the type
of brute-force message-passing we do today (between kernel and user or on
socket interfaces), but without the copying (i.e. Singularity's
immutable object transfer model). I admit I read this paper many years
ago, so I may be remembering it incorrectly.

>> A3) software-virtual-machines provide some combination of features (JVM:
>> a,b,e,f,h), (MSIL: a,b,c,d,e,f,h,i,j), but both are still missing a
>> critical link to replace C-shared libraries... "e" (i.e.
>> deterministic soft-real-time performance), making them unsuitable for
>> layered subsystems. (because worst-case GC pauses are unacceptably large
>> both in large-heaps and layered small-heaps)
>>
>
> I'm not sure why you say that for layered small heaps, and I'm fairly
> convinced that it is wrong for large heaps *provided* concurrent
> collection is used. Unfortunately, concurrent collector technology hasn't
> been widely deployed.
>

I am unaware of a so-named "concurrent" collector which does not have an
unacceptable stop-the-world pause of the old generation (I believe for
initial-mark). Certainly those in modern Java and C# have unacceptable and
unpredictable worst-case pause times. The only research I'm aware of which
attempts to solve this problem is Microsoft STOPLESS, CHICKEN
<http://www.cs.technion.ac.il/~erez/Papers/real-time-pldi.pdf>, and
hardware-assisted Pauseless GC
<http://static.usenix.org/events/vee05/full_papers/p46-click.pdf> by Azul.
None of these are available in generally useful x86 implementations.

In real systems we expect to be able to respond to user actions within
10-20ms. This is not possible to do reliably with today's GC systems. In C
with manual management, or in ref-counted systems, expensive work can be
"hidden" from the user by either manually amortizing it or performing it
when users are not waiting. This is simply impossible in a GC system which
unpredictably stops the world. This in turn makes stop-the-world GC
impossible to use for many systems.
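
A minimal sketch of the "manually amortizing" idea, written in Python only
to show the control flow: cleanup work is queued and then run in bounded
slices at moments the program chooses. The DeferredReleaser class and its
method names are hypothetical and are not taken from any particular
runtime or library.

```python
from collections import deque
from typing import Callable, Deque, List


class DeferredReleaser:
    """Queue cleanup work and run it in bounded slices chosen by the caller."""

    def __init__(self) -> None:
        self._pending: Deque[Callable[[], None]] = deque()

    def defer(self, cleanup: Callable[[], None]) -> None:
        """Record cleanup work instead of paying for it immediately."""
        self._pending.append(cleanup)

    def drain(self, max_items: int) -> int:
        """Run at most max_items cleanups; return how many actually ran."""
        ran = 0
        while self._pending and ran < max_items:
            self._pending.popleft()()
            ran += 1
        return ran

    def __len__(self) -> int:
        return len(self._pending)


released: List[int] = []
releaser = DeferredReleaser()
for block_id in range(5):
    # Bind block_id as a default argument so each closure keeps its own value.
    releaser.defer(lambda block_id=block_id: released.append(block_id))

first_slice = releaser.drain(max_items=2)    # e.g. between two input events
second_slice = releaser.drain(max_items=10)  # e.g. while the user is idle
```

The caller decides when the cost is paid and how much of it is paid at
once. A collector that stops the world at a time of its own choosing takes
that decision away from the program.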

The problem of stop-the-world pauses is exacerbated by threading and
layered systems. Much like cumulative MTBF, when many subsystems may have
unpredictable worst-case pauses overlaid, the systemic worst-case pause
is additive. In a threaded heap, every thread is delayed by the world-stop.
We simply have to stop stopping the world.
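
The additive argument can be written out as simple arithmetic. The
per-layer pause figures below are hypothetical and are not measurements of
any collector. The 20ms budget is the upper end of the 10-20ms response
target mentioned above.

```python
def systemic_worst_case_ms(layer_pauses_ms):
    """Worst case when every layer's longest pause lands on one request."""
    return sum(layer_pauses_ms)


def meets_budget(layer_pauses_ms, budget_ms):
    """True if the summed worst-case pauses still fit in the response budget."""
    return systemic_worst_case_ms(layer_pauses_ms) <= budget_ms


# Hypothetical per-layer worst-case pauses, in milliseconds (not measurements).
layers = [8, 8, 8]
budget_ms = 20

total = systemic_worst_case_ms(layers)
one_layer_ok = meets_budget(layers[:1], budget_ms)
all_layers_ok = meets_budget(layers, budget_ms)
```

Under these assumed numbers each layer fits the budget on its own, but
three stacked layers give a 24ms worst case and miss it.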

It's all too easy to dismiss these pauses as insignificant, but in real
situations in real systems, we are avoiding GC every day because of this
issue. It must be fixed for GC-based systems to replace C-shlibs.