[coyotos-dev] First Class Activations, Second Class IPC: a UNIXoid and inaccurate summary

Jonathan S. Shapiro shap at eros-os.org
Sat Jan 21 11:51:02 EST 2006


On Fri, 2006-01-20 at 11:45 +0100, Dominique Quatravaux wrote:

> >The FCRB design is trying to address the
> >interaction of two requirements:
> >
> >  1. The need for preemptive event delivery
> >  2. The need for non-blocking sends, which implies the need for
> >     non-preemptible receives.
> >
> So the point of the discussion is to revise the domain state machine
> (section 4.1 of http://www.eros-os.org/devel/intro/ProgrmrIntro.html).
> In the new system, a domain (very similar to a UNIX process) can now be
> available *and* running.
> 
> Precisely a domain can now have three states, scheduler-wise:
> 
>    1. running and busy ("activation flag" set);
>    2. running and available ("activation flag" cleared);
>    3. blocked (waiting for the kernel to awaken it).

I can see that I have created great confusion. Good! It really IS
confusing!

PREFACE COMMENTS

First, a reminder: the FCRB idea is provisional!

Second: there is a pending question about scheduling in Coyotos/FCRB, so
it's not entirely clear how to answer your question concerning process
states. What I am going to give below is my current view, but it may get
altered as we develop this idea.

RESPONSE:

In the Coyotos/FCRB design, the entire state model changes radically.
The available and waiting states in EROS and Coyotos/EP are types of
receive state. In Coyotos/FCRB, ALL receives are asynchronous, so there
is nothing comparable to a receive state at all. A process simulates a
blocking receive by initiating an asynchronous receive and voluntarily
giving up the processor. It simply chooses not to run the next
instruction after the SendAndReceive kernel operation.
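
A toy model may make the "simulated blocking receive" concrete. This is
only a sketch, not Coyotos code: the class and method names below
(ToyProcess, post_receive, go_idle, kernel_deliver) are invented for
illustration, and the real SendAndReceive interface is not modelled.

```python
from collections import deque


class ToyProcess:
    """Hypothetical model of a process whose receives are all asynchronous."""

    def __init__(self):
        self.running = True          # True: running, False: idle
        self.receive_posted = False  # an asynchronous receive is outstanding
        self.delivered = deque()     # messages the toy kernel has delivered

    def post_receive(self):
        """Asynchronous receive: returns immediately, never blocks."""
        self.receive_posted = True

    def go_idle(self):
        """Voluntarily give up the processor."""
        self.running = False

    def simulated_blocking_receive(self):
        """A 'blocking' receive is an asynchronous receive plus a voluntary idle."""
        self.post_receive()
        self.go_idle()

    def kernel_deliver(self, msg):
        """Toy kernel side: complete the receive and resume the process."""
        if not self.receive_posted:
            raise RuntimeError("no receive outstanding")
        self.receive_posted = False
        self.delivered.append(msg)
        self.running = True
```

The point of the sketch is that nothing in the model is a "receive
state": the process is simply idle until the toy kernel resumes it.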

Let me try to define how the equivalents to the classical "running",
"ready", and "blocked" states work. The correspondence is not perfect,
and we are going to need to introduce a new state called "idle".

Here are the "classical" process states that you will find in any modern
OS textbook. These are the states from the *kernel* point of view:

  running: Process is actually initiating instructions on some CPU.
  ready:   Process is prepared to run instructions, but is waiting
           to be scheduled to a CPU.
  blocked: Process is NOT executing instructions, because some condition
           must be satisfied first. The process is said to be blocked
           on this condition, and will go to the ready state when the
           condition is satisfied.
  idle:    Process is not attempting to initiate instructions. You
           will not find this state in the classic state diagram,
           because the classic diagram was not conceived with asynchrony
           in mind. The PC of an idle process always points to a
           user-mode instruction.

Note that the "ready" state is really a specialized "blocked" state; the
process is blocked on the CPU resource. The EROS implementation actually
treats it this way.

In Coyotos/FCRB, we distinguish two kinds of activation state:

  activated: process is running within the activation handler.
  normal:    process is running normal instructions.

The activation state is cooperatively managed by user mode code and
kernel mode code. The running/idle state is purely a kernel state
transition.

In Coyotos/FCRB, there is no blocked state, and really there is no
"ready" state either. A process can be in one of four states:

  running/activated: Process is executing instructions in the
                     activation handler on some CPU.
  running/normal:    Process is executing normal instructions.
  idle/activated:    Process is not initiating instructions. At the
                     time that it paused, it was running in the
                     activated state.
  idle/normal:       Process is not initiating instructions. At the
                     time that it paused, it was running in the normal
                     state.

A process goes from running to idle in two ways:

  1. Its current scheduler slice expires, or
  2. It voluntarily enters the idle state, giving up its slice.
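
The four states are the product of two independent two-valued
components, which can be sketched as follows. The type names (Sched,
Activation, ProcState) and the to_idle helper are assumptions made for
this illustration only; they do not come from any Coyotos source.

```python
from dataclasses import dataclass
from enum import Enum


class Sched(Enum):
    """Kernel-managed component of the state."""
    RUNNING = "running"
    IDLE = "idle"


class Activation(Enum):
    """Component managed cooperatively by user-mode and kernel-mode code."""
    ACTIVATED = "activated"
    NORMAL = "normal"


@dataclass(frozen=True)
class ProcState:
    sched: Sched
    activation: Activation

    def __str__(self):
        return f"{self.sched.value}/{self.activation.value}"


def to_idle(state):
    """Slice expiry or voluntary idle: the activation component is untouched."""
    return ProcState(Sched.IDLE, state.activation)


# Every combination is a legal state; there is no blocked or ready state.
ALL_STATES = [ProcState(s, a) for s in Sched for a in Activation]
```

Both ways of leaving the running state map onto the same to_idle
transition here, since the description above does not distinguish their
effect on the process state.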


CONSEQUENCES OF EVENT/FCRB ARRIVAL:

The following description is conceptual. I'm sure the implementation
requires many optimizations.

When an FCRB bound to a given process "fires", it is enqueued on a
per-process pending event queue, and the following state transition
occurs:

  PROCESS WAS:       PROCESS IS:

  */normal           running/activated. Registers have been partially
                     saved and incoming FCRB state is delivered to
                     process.
  running/activated  No change in state. Delivery deferred.
  idle/activated     running/activated. Delivery deferred.
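
The table can be written out as a small function. This is a conceptual
sketch only: the function name fcrb_fires, the string state names, and
the use of a deque for the per-process pending queue are assumptions,
and register saving is not modelled.

```python
from collections import deque


def fcrb_fires(sched, activation, pending, fcrb):
    """Apply the transition table when `fcrb` fires.

    Returns (new_sched, new_activation, delivered_now), where
    delivered_now is None when delivery is deferred.
    """
    if sched not in ("running", "idle"):
        raise ValueError(f"unknown scheduler state: {sched!r}")
    if activation not in ("activated", "normal"):
        raise ValueError(f"unknown activation state: {activation!r}")

    pending.append(fcrb)  # the FCRB is always enqueued first

    if activation == "normal":
        # */normal -> running/activated, with immediate delivery
        return "running", "activated", pending.popleft()

    # running/activated or idle/activated: the process ends up
    # running/activated and the FCRB stays queued for later delivery
    return "running", "activated", None
```

For example, with an empty queue, fcrb_fires("idle", "normal", q, "a")
delivers "a" at once, while a second firing against the now-activated
process leaves its FCRB on the queue.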

If I have understood the Nemesis/Psyche work, this is the state diagram
of those systems as well (I am sure the names are different).

The reason that the scheduler activations design does not require a
"ready" state is that process dispatch is handled by an event posted by
the scheduler. If the process is idle/activated, this simply causes
execution to resume. If the process is idle/normal, this causes a
transition to running/activated, which quickly transitions back to
running/normal.

In Coyotos/FCRB, we may not do scheduler dispatch this way, because
there is an open security issue about disclosure of precise timing
information (another note on this is coming). If we choose not to
disclose scheduler run-in events, then that event type will instead
cause an idle->running transition *without* any change in the activation
state.

Note that the kernel never initiates a transition from activated to
normal. Either the kernel or the process may transition from normal to
activated.
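
The two dispatch alternatives can be compared in a short sketch. The
function scheduler_run_in and its disclose_run_in flag are hypothetical
names for this illustration; whether run-in events are disclosed is, as
noted above, still an open question.

```python
def scheduler_run_in(activation, disclose_run_in):
    """Dispatch an idle process.

    Returns the (sched, activation) states the process passes through,
    in order, starting from idle/<activation>.
    """
    if activation not in ("activated", "normal"):
        raise ValueError(f"unknown activation state: {activation!r}")

    if not disclose_run_in:
        # Undisclosed run-in: idle -> running, activation state unchanged.
        return [("running", activation)]

    if activation == "activated":
        # The posted event simply causes execution to resume.
        return [("running", "activated")]

    # idle/normal: the event activates the process. The step back to
    # normal is taken by the process itself, never by the kernel.
    return [("running", "activated"), ("running", "normal")]
```

In either variant a process that was idle/normal ends up
running/normal; the difference is whether it passes through its
activation handler, and so observes the dispatch, on the way.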


Back to Dominique's specific questions:

> Precisely a domain can now have three states, scheduler-wise:
> 
>    1. running and busy ("activation flag" set);
>    2. running and available ("activation flag" cleared);
>    3. blocked (waiting for the kernel to awaken it).
>
>  When in [activated] state, the domain's control flow (at the assembler level)
> is never altered by the kernel: this is similar, although not
> identical, to blocking signals in UNIX. Messages arriving during
> [activated] state are not delayed (unlike UNIX - the Coyotos kernel
> never delays, delays are EVIL!), but instead silently posted to the
> domain's FCRBs provided they are short (this is only my best guess,
> see below) and discarded otherwise.

Close, but not quite. Messages that arrive to a domain that is already
activated are queued for later delivery. There is a missing detail in my
previous notes, which is that the kernel needs to advise the domain that
there are more messages queued. This gives the domain a chance to deal
with them voluntarily before resuming the preempted thread of execution.
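
One way the "more messages queued" advice could be used is sketched
below. How the kernel actually conveys this advice is not specified in
this note, so the more_pending flag returned by take() is purely an
assumption, as are the names ToyEventQueue and activation_handler.

```python
from collections import deque


class ToyEventQueue:
    """Hypothetical stand-in for the per-process pending event queue."""

    def __init__(self):
        self._pending = deque()

    def post(self, event):
        self._pending.append(event)

    def take(self):
        """Return (event, more_pending); raises IndexError if empty."""
        event = self._pending.popleft()
        return event, len(self._pending) > 0


def activation_handler(queue, handle):
    """Handle the delivered event, then drain whatever else is queued
    before resuming the preempted thread of execution."""
    event, more = queue.take()
    handle(event)
    while more:
        event, more = queue.take()
        handle(event)
```

A handler written this way returns to normal execution only once the
queue is empty, so events that arrived while the process was activated
are handled without a further preemption.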

> States 2 and 3 are similar in every respect, except that the domain
> isn't executing instructions in 3 (transition from 2 to 3 is similar to
> a processor executing the HLT instruction, but with interrupts enabled).

From my description above, I hope that the answer to this is now
clearer. It does not work in quite the way that you describe.

> While in state 2 and 3, the domain is able to receive messages in a way
> very similar to UNIX signals: when a message arrives, the kernel causes
> a jump to its "activation entry point" with the stack pointer set to
> the "activation stack", which are both arranged prior (how: fixed
> address? Kernel call? Part of the shared-memory mailbox detailed below? 
> Doesn't matter that much in the big picture anyway).

Kind of. The activation PC and SP are part of the kernel process state.
They are set by the process. In contrast to signals there is only *one*
activation stack.

> I understand there is a bit to set at IPC send time that means "block me
> right after that", but one should think of that mechanism as an artefact
> providing atomicity, rather than of a UNIXoid blocking system call. I
> therefore assume that there has to be some kind of user-mode HLT
> instruction (maybe blockingly invoking a null cap?)

Probably just a HLT system call.

> FCRBs look a lot like pollable events to me. Each of them represents a
> condition that the domain is waiting to become true, the authority of
> causing this to happen being vested in a capability (of course).

Yes. FCRBs differ from scheduler activation events in that:

  1. They are first-class.
  2. They are allocated accountably.
  3. Because they are first-class, they can carry a little more
     state, and can therefore be extended to handle IPC receive
     events.

> The ability to have FCRBs resolved at any time, even in state 1, means
> that the process is continuously executing a select() system call, so to
> speak, even while it is actively executing instructions.

Yes.

> "Blessing" a
> range of memory into being an FCRB is up to the domain (probably by
> invoking the kernel?).

The FCRB itself is a kernel object. The extended receive area is merely
an addressable region of the process address space.

> FCRBs can even be made re-usable for conditions that are
> expected to occur several times.

Yes, though multiple deliveries may be compressed into one in this
situation.
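
The compression of repeated deliveries can be illustrated with a single
pending flag. This is a sketch under the assumption that a reusable
FCRB records only "fired since last delivery"; the class name
ReusableFCRB and its methods are invented here, and the real design may
keep more state.

```python
class ReusableFCRB:
    """Hypothetical reusable FCRB that coalesces repeated firings."""

    def __init__(self):
        self.pending = False

    def fire(self):
        """Mark the FCRB as fired.

        Returns True if this firing needs a new delivery, False if it
        was merged into a delivery that is already pending.
        """
        newly_pending = not self.pending
        self.pending = True
        return newly_pending

    def deliver(self):
        """Deliver once, however many firings were merged.

        Returns True if there was anything to deliver.
        """
        was_pending = self.pending
        self.pending = False
        return was_pending
```

Under this model a receiver sees at least one delivery after any
firing, but cannot count on one delivery per firing.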

>  FCRBs can be stored into or removed
> from a kernel queue, just like one inserts or removes file descriptors
> in the select() lists (although it is unclear to me why there should be
> *several* kernel queues for a given Coyotos domain, or what the
> "broadcast" feature of awakening several domains at once is useful for).

Yes. You want several queues because you need the ability for different
senders to invoke capabilities that have different protected payloads.

> As indicated above, I'm not sure how all this relates to the IPC
> payload-moving mechanism...

Given my later mail and your response, I will not answer this here.
Please let me know if the later response has not answered your questions
about this.

> In other words, IPC is now second-class to activation, unlike
> EROS where the two were tightly integrated: the slice-it-all microkernel
> approach is winning the day, again.

IPC is now divided into two parts: delivery (asynchronous) and event
posting. It is the event posting that causes the activation. I'm not
sure that either should be called "second class", but you have the
essence of it.

Whether FCRBs are the ultimate redemption of mankind remains to be
seen. :-)


shap


