[coyotos-dev] Some things to consider
Haplo
starfirex at comcast.net
Thu Nov 30 02:44:13 CST 2006
I've been researching swarm intelligence and other applications of
stochastic optimization, and came across a few things.
First is LURCH. LURCH is a debugger/software verification tool which
uses random monte-carlo searching on an application to find linked
clusters of problematic code. Because of the way it operates, LURCH
can very quickly (minutes as opposed to hours or days for
deterministic approaches) find the vast majority of compromising code
in your application. Also, its resource usage does not grow very much
even for very large projects. While unsuited for formal verification,
it can greatly speed the process of correcting your current code and
adding new code, while making it much safer to add code to the trunk
build. Since you no longer have to rely on by-hand analysis of your
code, you no longer simply have to triple check and pray that you got
it right. Finally, when it comes down to formal verifications for
release builds, after the long process of verification is over,
you'll have much less code to dig through and fix since you'd have
been doing it incrementally the entire time. LURCH should allow
speeding the development of coyotos and allow a stable (and fully
verified) release build to be produced more rapidly.
Information about LURCH:
http://en.wikipedia.org/wiki/LURCH
Further info about the principles used:
http://en.wikipedia.org/wiki/Global_optimization
Now, the main goal of these techniques is to quickly locate global
optima (which takes great deals of time using conventional methods
due to the large numbers of variables involved and the time required
to absolutely graph them all). While these methods may not find the
absolute global optimum, they can produce values which are very very
close and do so very quickly. One such example would be distributing
bandwidth on a communications network. It occurred to me that this
sort of problem is very similar to scheduling, only with scheduling
you can know almost all of the variables for a given application (and
its dependents) in need of scheduling. As such, you could fairly
easily create a process to scan an application (like LURCH does) to
find its connections and determine optimal or near optimal scheduling
for its statistically most significant communications (IPC) paths, as
well as a fallback scheduling for other things, and scheduling for
worst-case performance situations (ie maxed system load).
Furthermore, you could also use it to generate global tables to
ensure near-optimum global performance at all times while still
maintaining good to near optimal local performance for individual
applications.
Comparison with software feedback scheduling:
-Software feedback scheduling requires unauthorized channels to
provide the basis for the scheduling. Stochastic pre-optimized
scheduling does not (it reads through the normal IPC/capability
sequences most often used by the applications).
-Software feedback scheduling has a degree of hysteresis in
compensating for new or varying loads. Furthermore, it requires
constant guess-and-check adjustment which can take considerable
processing time itself in a situation with highly variable load (or
else increases its hysteresis which reduces its effectiveness). SPO
would require one sum of time when installing a new application for
building the application's scheduling table and adjusting the global
table (these would be similar to capability tables). After that,
scheduling would be reactive based on the tables. Hysteresis would be
minimal and scheduling would simply mean reading the tables based on
the current request by the application and the state of the system
(no guessing).
-Software feedback also has to deal with new situations as they come.
SPO would be redundant, so it's run a few times read many. There are
essentially no 'new situations' to SPO, it already basically 'knows'
what to do in given situations. It doesn't perform quite as well
under statistically unlikely conditions (code that's only run
rarely), but the failsafe scheduling is still reasonably optimal or
at least very acceptable.
-Software feedback works towards local optima. That is, it maximizes
the connections between two applications. As such, under worst-case
performance (processor is overburdened) it will perform very badly.
SPO pre-optimizes for such situations and can provide decent to very
good system wide performance even under the heaviest loads. That is,
you shouldn't get any beachballing or frozen GUIs just because the
system is working on a large problem, but the large problem will
still get ample resources to complete on time (or in a reasonable
time if there is no specific timeout constraint).
-NICE value type scheduling when compared to software feedback is
grossly inaccurate, somewhat like a manual transmission only you only
change gears once every few thousand miles. Software feedback,
however, is still considerably less accurate than SPO overall,
somewhat like a continuously variable transmission vs a CVT that
already knows where the hills are.
Although its usage would be quite limited for embedded applications
where the system is very small (that is, scheduling is simple), I
should imagine that if it works, it would make coyotos perform
significantly better than monokernels in the majority of situations.
This is something I would really like to see in a formally verified
capabilities based KOS. That would make it easy to popularize
coyotos, which could change the general opinion on microkernels.
More information about the coyotos-dev
mailing list