[coyotos-dev] IPC Redesign

Jonathan S. Shapiro shap at eros-os.com
Tue May 22 09:38:04 EDT 2007


We are going to fall back on a simpler IPC design. The attempt to use
the "lesser UPCB" has failed for performance reasons. The problem is
that additional indirections are required to deal with the lesser UPCB.
The capability to that page needs to be validated, the page needs to be
cross-mapped, and so forth. This creates locking and TLB overheads that
are quite bad.

We are going to revert to a design in which every transfer involves

      exactly 4 capabilities   IN/OUT
      a control word           IN/OUT
      protected payload        OUT
      endpoint ID              IN/OUT
   ** up to N data words       IN/OUT
      An indirect data string, IN/OUT
       possibly of length 0
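
For concreteness, here is a rough C sketch of what one transfer carries
under the list above. Every name in it (the sk_ prefix, the field names,
the typedefs) is hypothetical and is not a Coyotos definition. The widths
of the capability reference and the endpoint ID are assumptions, and the
separate ndata field is there only for readability; the list above does
not say how the count of valid data words is encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; not taken from the Coyotos sources. */
#define SK_NCAP  4   /* exactly 4 capabilities per transfer */
#define SK_NDATA 8   /* N, the value under discussion */

typedef uintptr_t sk_word_t;  /* one machine word (assumed width) */
typedef uint32_t  sk_cap_t;   /* stand-in for a capability reference */

typedef struct sk_ipc_msg {
    sk_cap_t   cap[SK_NCAP];    /* IN/OUT */
    sk_word_t  control;         /* IN/OUT: control word */
    sk_word_t  prot_payload;    /* OUT only: protected payload */
    sk_word_t  endpoint_id;     /* IN/OUT (width is an assumption) */
    sk_word_t  data[SK_NDATA];  /* IN/OUT: up to N data words */
    unsigned   ndata;           /* how many of data[] are in use */
    void      *str;             /* indirect data string */
    size_t     str_len;         /* may be 0 */
} sk_ipc_msg_t;

/* Reset a message to "no data words, empty string". */
static void sk_ipc_msg_init(sk_ipc_msg_t *m)
{
    assert(m != NULL);
    memset(m, 0, sizeof *m);
    m->str = NULL;
}
```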

We will make optional provision for scatter/gather, but we never really
needed that in EROS except in cases where N was too small.

The engineering question is: what should N be?

In EROS, N was 3. This was too small, because we wanted file offsets to
be 64 bits, and three 32-bit words are not enough to carry a 32-bit
opcode, a 64-bit position, and a 32-bit length (128 bits in all, or four
words). As a practical matter this was the ONLY case where we found 3
data words to be insufficient. The real issue was that we wanted to use
the entire string for the buffer being transferred, without having to
marshal/demarshal it.
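
The word count for the file case can be checked mechanically. The helper
names below are made up for illustration, and the sketch assumes each
field is rounded up to a whole number of data words.

```c
#include <assert.h>

/* Number of data words needed to hold a field of `bits` bits. */
static unsigned words_for_bits(unsigned bits, unsigned word_bits)
{
    assert(word_bits != 0);
    return (bits + word_bits - 1) / word_bits;
}

/* Data words needed for the file request described above. */
static unsigned file_request_words(unsigned word_bits)
{
    return words_for_bits(32, word_bits)   /* opcode   */
         + words_for_bits(64, word_bits)   /* position */
         + words_for_bits(32, word_bits);  /* length   */
}
```

With 32-bit words, file_request_words(32) comes to 4, one more than the
3 data words that EROS provided.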

Late in the EROS cycle, I realized that on IA-32 the segmentation
mechanism could be exploited to expand the number of available registers
arbitrarily (alternatively, Kauer's stack-based approach can be used).

The next most limiting architecture is AMD64 in long mode. This
architecture has 16 general-purpose registers, of which 14 are
potentially available at the system call boundary.

If we select N=8, the binding above can be implemented in 13 registers
(max). A message transmitting two payload words, one capability, and no
strings requires a total of 7 words at the syscall interface, of which 6
are carried in registers.

I provisionally believe that N=4 is enough for most real cases. Given
that IA-32 is not an impediment, AMD64 can handle N=8, and unused data
words don't cost anything, I am inclined to go with N=8.

IPCs requiring more than 8 data words will need to marshal into the
string. In rare cases an IPC will need the scatter/gather mechanism,
which is handled separately.
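
As a sketch of the sender-side choice, assuming N=8 and 32-bit
arguments: if the arguments fit in the data words they go there,
otherwise they are marshaled into the string. The names are
hypothetical, and putting ALL of the arguments into the string in the
overflow case (rather than only those beyond the first 8) is my
assumption for the sake of a short example, not a decided design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_NDATA 8   /* N, assumed to be 8 here */

/* Hypothetical sender-side view of the data portion of a message. */
typedef struct sk_send {
    uint32_t       data[SK_NDATA];  /* data words */
    unsigned       ndata;           /* data words in use */
    unsigned char *str;             /* caller-supplied string buffer */
    size_t         str_cap;         /* capacity of str, in bytes */
    size_t         str_len;         /* bytes of str in use, may be 0 */
} sk_send_t;

/* Place nargs arguments in the message.
 * Returns 0 on success, -1 if the string buffer is too small. */
static int sk_put_args(sk_send_t *s, const uint32_t *args, size_t nargs)
{
    assert(s != NULL && (args != NULL || nargs == 0));

    if (nargs <= SK_NDATA) {
        /* Common case: everything fits in the data words. */
        for (size_t i = 0; i < nargs; i++)
            s->data[i] = args[i];
        s->ndata = (unsigned)nargs;
        s->str_len = 0;
        return 0;
    }

    /* Overflow case: marshal all of the arguments into the string. */
    if (nargs > s->str_cap / sizeof args[0])
        return -1;
    memcpy(s->str, args, nargs * sizeof args[0]);
    s->ndata = 0;
    s->str_len = nargs * sizeof args[0];
    return 0;
}
```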

Reactions?


shap


