[coyotos-dev] SMP Issue: load correct kernel stack on trap

Jonathan S. Shapiro shap at eros-os.com
Mon Oct 22 11:24:13 EDT 2007


On Mon, 2007-10-22 at 11:12 -0400, Jeroen Visser wrote:

> > 0. Several parts of the patch are cosmetic changes or (in a few cases)
> > temporary development changes. These should be separated or removed.
> 
> I hope these can be accepted piece-meal for commit.

Definitely! I just want to keep the cleanups separate from the content.
It helps reviewers.

> > 1. Your code finds the current CPU from the current process. In the
> > interrupt handler this is necessary, but in C code it is not the best
> > way. We already had a mechanism for this. See target-hal/cpu.h and the
> > revised definition of MY_CPU. We chose to get the current process from
> > the current CPU.
> 
> I assume that when we have a correct per-CPU-stack established we can
> simply get MY_CPU via its address on the bottom of the per-CPU stack?

Yes. The problem is that during the user register save sequence we are
on the wrong stack, so we need to load the current CPU pointer from the
Process structure. My changes to interrupt.S from this morning should
now do that properly.

> > We can change this if there is a reason to do it, but we thought this
> > approach would probably work better.
> 
> I had not particular reason other than already having figured out how
> to get from curProc to CPU to kstack in the interrupt handler.

This needs to be done carefully, because there are cases of kernel
interrupts where CPU->curProc may be NULL. In those cases you will be
running from the correct kernel stack already, and you can look at the
MY_CPU pointer at the bottom (least address) of the current stack.

> > 2. We chose to store the LEAST address of the stack in the CPU structure
> > because that is where the current CPU address needs to go. The top of
> > stack can be computed from cpu->stack+KSTACK_SIZE. You should only need
> > that when you are initializing stack for the APs.  Oh. Now I see why you
> > did it. There may be a better way, but that was certainly reasonable.
> 
> Looks like you changed this in tip.

That change went in more than a week ago. I suppose it is possible that
it didn't push for some reason. If so, I apologize.

> > The addition of KSTACK_MASK and KSTACK_SIZE in asm-offsets.c should not
> > have been necessary -- you should have been able to use them directly
> > from config.h.
> 
> Ok. I got errors when accessing KSTACK_SIZE this way. I thought
> because KSTACK_SIZE is defined as a C multiplication. In the interest
> of expediency I added it to asm-offsets.c which allowed me to get on
> with the rest.

Understood. I was able to use it directly in the revised version of
interrupt.S this morning, but I needed to add an include of hal/config.h
to do it.

> > 3. We decided to move all CPU-local data into the CPU structure. The
> > "percpu" region in ldscript is now empty, and will go away as we clean
> > up the SMP implementation.
> 
> Aha!

Sorry about that one. We were going back and forth because we had not
made any decision about loadable kernel modules. We have now decided
that we will not do loadable modules, so there is no good reason to do
the more complicated approach. Also, the number of per-CPU variables is
much smaller then we initially feared.

We considered using a segment register to fetch curCPU, but that code
would be very processor specific. We haven't measured which way is
faster, and we need to.

> > 4. We intend for all processors to have the same private page table and
> > private page directory, so there is no need to allocate per-AP tables in
> > hwmap.c. Is there a reason you did this that we may not have noticed?
> 
> Because of me not being aware of point 3 above? Having a shared kernel
> map structures will clean up my patch significantly. In particular I
> no longer will have to contemplate updating cpu_ncpu maps when we grow
> the kernel heap.

Yes. Just so you know, our original plan there was to treat the cpu0 map
as the "definitive" map, and copy entries from it on demand (in response
to kernel page faults). This is simpler than what you were doing, but it
is delicate.

Today, the only part of the map that is per-CPU is the transmap, and we
handle that by simply reserving disjoint regions for different CPUs.

> > 5. We definitely care about SMP on non-PAE machines (for efficiency
> > reasons).
> 
> Ok.

Now that the kernel maps are not per-CPU, this is probably easier. The
issue is that there are more layers (therefore more memory references)
when you use PAE, and unless you have a large physical memory or you
require NX support there is no real reason to accept the extra cost.

> > 6. I have not explored your code enough to understand this, but we
> > definitely want a way to force SMP machines to operating with only one
> > CPU. Does your patch preserve a way to do this?
> 
> If you force cpu_ncpu to 1 everything will work just fine without
> starting any APs.

Okay. I think the "nosmp" command line option effectively does this
already.
-- 
Jonathan S. Shapiro, Ph.D.
Managing Director
The EROS Group, LLC
www.coyotos.org, www.eros-os.org



More information about the coyotos-dev mailing list