[coyotos-dev] syscall/sysret vs. small spaces
Espen Skoglund
esk at ira.uka.de
Tue Nov 8 04:31:22 EST 2005
[Jonathan S Shapiro]
> On Mon, 2005-11-07 at 21:19 +0100, Espen Skoglund wrote:
>> To be perfectly honest I have not kept myself up-to-date with the
>> latest internal Pentium developments regarding TLBs and caches.
>> However, considering that IA32e uses a 4-level page-table
>> structure,...
> Yes. The issues for IA32e are challenging.
> I also agree about the P4 trace I-cache, though the real-world cost
> of that is challenging to understand because it depends heavily on
> retained I-cache working set. Given the small size of the P4's L1
> trace cache, I'm not sure I believe that any significant amount of
> application state remains live in the trace cache under normal
> execution conditions.
Perhaps. Our finding was that, with a synthetic workload of 8K nops,
avoiding a trace cache flush on every context switch doubled
performance. The same workload, run under L4Linux with pipes, also
gave a measurable performance boost (and Linux pipes are by no means
lightweight).
> If the kernel pages are marked global, and/or the address space flip
> is done late enough in the path, then it may be that we retain all
> of the performance that is feasible under conditions of real usage.
How do global pages help in the case of the trace cache?
eSk