[coyotos-dev] SMP: transmap experiment, questions
Jonathan S. Shapiro
shap at eros-os.com
Wed Oct 24 08:34:41 EDT 2007
Jeroen:
As with several other places, you are discovering holes where I had
thought things were further along than they actually are. I had moved
the TransMetaMap and TransReleased variables into the CPU structure, but
I had failed to update the slot management code in transmap_map().
I am probably failing to see something, but from a quick look at the
TransMap code the only change that should have been necessary was to the
computation of "slot" in transmap_map(). Is this approximately correct?
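A minimal sketch of what a per-CPU "slot" computation might look like.
The struct layout, the helper name transmap_alloc_slot(), the constant
TRANSMAP_ENTRIES_PER_CPU, and the convention that a set bit in
TransMetaMap means "free" are all assumptions for illustration, not the
actual Coyotos code:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real one lives in the target HAL. */
#define TRANSMAP_ENTRIES_PER_CPU 64u

/* Hypothetical stand-in for the kernel's per-CPU structure. */
typedef struct CPU {
  unsigned id;            /* index of this CPU, starting at 0 */
  uint64_t TransMetaMap;  /* assumption: bit i set => local entry i is free */
} CPU;

/* Claim a free local entry and return its global slot index within the
 * transmap window, or -1 if this CPU has no free entry left. */
static int transmap_alloc_slot(CPU *cpu)
{
  assert(cpu != 0);
  for (unsigned i = 0; i < TRANSMAP_ENTRIES_PER_CPU; i++) {
    uint64_t bit = UINT64_C(1) << i;
    if (cpu->TransMetaMap & bit) {
      cpu->TransMetaMap &= ~bit;   /* mark the entry as in use */
      /* Each CPU owns a disjoint run of entries, offset by its index. */
      return (int)(cpu->id * TRANSMAP_ENTRIES_PER_CPU + i);
    }
  }
  return -1;
}
```

Under these assumptions, CPU 0 hands out slots 0..63 and CPU 3 hands out
slots 192..255, so the ranges never overlap.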
Concerning the number of transmap entries, 64 entries is a lot more than
we need, but it is a perfectly okay choice for right now.
Could you make the following changes and submit a single patch with all
of them:
1. The needed updates in transmap.c
2. Add a new type transmeta_t to hal/transmap.h, and the corresponding
#define in target-hal/transmap.h.
3. Update CPU.h to use it appropriately.
4. Add a constant definition (a #define) in target-hal/transmap.h that
sets TRANSMAP_INITIAL_METAMAP to ~0ull. Use this in cpu_construct(), in
kernel/kern_CPU.c, to initialize CPU.TransMetaMap.
5. Also your changes to size the transmap conditionally on MAX_NCPU.
This will allow us to alter the size of the per-cpu transmap window on a
per-architecture basis without having to re-touch all of this code
again.
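To make items 2 through 4 concrete, here is a rough sketch of how the
pieces might fit together. The macro name TARGET_HAL_TRANSMETA_T, the
choice of uint64_t, the reduced CPU structure, and the initial value of
TransReleased are assumptions; only transmeta_t,
TRANSMAP_INITIAL_METAMAP, and cpu_construct() are named above:

```c
#include <assert.h>
#include <stdint.h>

/* target-hal/transmap.h (sketch): the architecture picks the width. */
#define TARGET_HAL_TRANSMETA_T   uint64_t   /* hypothetical macro name */
#define TRANSMAP_INITIAL_METAMAP (~0ull)

/* hal/transmap.h (sketch): the portable type name. */
typedef TARGET_HAL_TRANSMETA_T transmeta_t;

/* CPU.h (sketch): only the fields relevant here. */
typedef struct CPU {
  transmeta_t TransMetaMap;
  transmeta_t TransReleased;
} CPU;

/* kernel/kern_CPU.c (sketch) */
static void cpu_construct(CPU *cpu)
{
  assert(cpu != 0);
  cpu->TransMetaMap = TRANSMAP_INITIAL_METAMAP;
  cpu->TransReleased = 0;   /* assumption: nothing released at start */
}
```

With this arrangement, an architecture that wants a smaller per-CPU
window would change only the two definitions in target-hal/transmap.h.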
If you would be kind enough to do that change as a single patch, I'll
apply it and we can get back to the rest of this.
On Tue, 2007-10-23 at 16:39 -0400, Jeroen Visser wrote:
> Hello,
>
> On 10/22/07, Jonathan S. Shapiro <shap at eros-os.com> wrote:
> > Today, the only part of the map that is per-CPU is the transmap, and
> > we handle that by simply reserving disjoint regions for different
> > CPUs.
>
> I'm not sure I entirely understand this. Reserving disjoint regions
> for different CPUs? Is it possible to get away from a per-CPU map in
> the face of this?
>
> I'm not sure how this would work?
>
> As an experiment I modified the transmap so that it doesn't rely on any
> per-CPU mapped data. Every CPU is given a different offset into the
> TRANSMAP_WINDOW, spaced 64 pages apart.
This is exactly how the transmap is intended to work.
> Since we currently have only 8MB available for the transmap window the
> kernel is restricted to 8MB / (64 pages * NCPU * PAGE_SIZE) = 32 CPUS.
I think that your computation is not quite correct. 8MB corresponds to
two page tables. In PTE mode, there are 1024 entries per table, giving
2k/64 entries => 32 CPUs. However, in PAE mode there are only 512
entries per table, giving 1k/64 => 16 CPUs.
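The arithmetic can be written out as a small helper. The function name
and parameters are made up for illustration; the numbers plugged in are
the ones from the paragraph above:

```c
#include <assert.h>

/* Number of CPUs the transmap window can serve.
 *   tables            page tables backing the window
 *   entries_per_table page table entries in each table
 *   entries_per_cpu   transmap entries reserved for each CPU */
static unsigned transmap_max_cpus(unsigned tables,
                                  unsigned entries_per_table,
                                  unsigned entries_per_cpu)
{
  assert(entries_per_cpu != 0);
  return (tables * entries_per_table) / entries_per_cpu;
}
```

transmap_max_cpus(2, 1024, 64) gives 32 for the non-PAE case, and
transmap_max_cpus(2, 512, 64) gives 16 for PAE.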
JWA and I feel that once you get above roughly 16 CPUs, you are probably
at the point where a NUMA-style kernel design becomes more appropriate
than a tightly coupled memory design. Because of this, we are not
greatly concerned if the current kernel implementation is limited to
16-32 CPUs.
In practice, we can probably get away with a transmap window as small as
16 entries. 64 is better mainly because it involves fewer flushes. We
can lower the start of the transmap window if we need to.
Limiting MAX_NCPU to 16 is fine.
Watch out when re-sizing the transmap. The size needs to be updated in
ldscript.S too!
> With this change in place I can run every CPU with exactly the same
> map (PDPT, KernPageDir, KernPageTable).
>
> Not sure about the performance implications though. It sure obsoletes
> a lot of mapping complexity.
Performance of this should all be fine. The important point is that no
two CPUs *ever* share a transmap entry, so transmap entries never
require an IPI for shoot-down.
shap
--
Jonathan S. Shapiro, Ph.D.
Managing Director
The EROS Group, LLC
www.coyotos.org, www.eros-os.org