As I understand it, when processes are cooperative within a priority
it's easier to perform time-sensitive operations (like audio): you
know nothing at the same priority is going to interrupt the execution
of a block of code partway. It's possible (and fairly straightforward)
to build preemption on top of Squeak's threading model by using a
high-priority thread to interrupt lower-priority ones. The reverse is
not true: you can't build a cooperative threading model on top of a
preemptive one.
--
Here's what would be wrong with native threads:
- They would be less efficient than Squeak Processes. A Process is
very lightweight, and context switches in Squeak are very cheap. A
context switch using native threads is very expensive by comparison.
- They would be less scalable than Squeak Processes. Squeak can run
10K Processes in an image with ease. You just can't do that with
native threads.
- They wouldn't run identically on all platforms. A single Squeak
image runs identically on all the platforms, and that includes
directly on the hardware, without an OS. With native threads, you
can get approximately the same behaviour, on some platforms.
- Using native threads makes garbage collection really difficult,
particularly if you want to minimize pauses. Given that Squeak is
commonly used in multimedia applications, where GC pauses interfere
with smooth audio processing and animation, native threads would be
problematic.
--
> Why are same priority processes not preemptable in Squeak?
Well, they are. If you need that, it's a piece of cake to write a
round-robin time-slicing scheduler.
You know, that's *exactly* what happens for native threads, too - the
OS kernel scheduler runs at higher priority and gives time slices to
all processes. No process interrupts another one on the same priority
on its own.
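
A minimal sketch of that idea, written in Python rather than Smalltalk
so it runs standalone. It is a simulation, not Squeak's scheduler: the
names worker and run and the time_slice parameter are made up for
illustration, and each generator step stands in for one indivisible
unit of work. With no time slice the active task keeps the CPU until
it finishes; with one, a simulated higher-priority ticker rotates the
same-priority run queue.

```python
from collections import deque


def worker(name, steps, log):
    """A task that never gives up the CPU voluntarily.

    Each `yield` marks the end of one indivisible step of work; it is a
    point where the scheduler *could* switch, not a voluntary yield.
    """
    for _ in range(steps):
        log.append(name)
        yield


def run(tasks, time_slice=None):
    """Run same-priority tasks to completion.

    time_slice=None: purely cooperative, the active task runs until done.
    time_slice=n:    a hypothetical higher-priority ticker fires every n
                     steps and moves the active task to the back of the
                     run queue (round-robin time slicing).
    """
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        used = 0
        while True:
            try:
                next(task)
            except StopIteration:
                break  # task finished, drop it from the queue
            used += 1
            if time_slice is not None and used >= time_slice:
                ready.append(task)  # ticker preempts: rotate the queue
                break


cooperative, sliced = [], []
run([worker("A", 3, cooperative), worker("B", 3, cooperative)])
run([worker("A", 3, sliced), worker("B", 3, sliced)], time_slice=1)
print("".join(cooperative))  # AAABBB
print("".join(sliced))       # ABABAB
```

The workers are identical in both runs; only the presence of the
higher-priority ticker changes the interleaving.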
> - Using native threads makes garbage collection really difficult,
> particularly if you want to minimize pauses. Given that Squeak is
> commonly used in multimedia applications, where GC pauses interfere
> with smooth audio processing and animation, native threads would be
> problematic.
Garbage collection pauses may be a good reason to introduce VM level
native threads. Making a parallel old space collector would remove
pauses from the global GC. Such pauses are a problem for some
multimedia work. Imagine you're using Squeak to process MIDI from a
live keyboard: a half-second pause will be very noticeable when
playing. A friend has this problem.
Large Seaside images may also take an uncomfortably long time for a
full garbage collect.
I suspect that a parallel GC wouldn't be harder, and may be easier,
than an incremental GC. Once we have a parallel GC then adding a few
native interpreter threads shouldn't be too hard.
The papers that I looked at suggested that parallel new space
collection was normally done by stop-and-copy collectors. So all the
interpreters are stopped, new space is collected, then the
interpreters are restarted.
However, parallelisation would require changing the write barrier
implementation to card marking. This would also reduce the
interpreter's (or compiled code's) write barrier overhead. My write
barrier implementation takes 24 instructions; a card marking write
barrier would take 1-3 instructions.
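
To illustrate why a card marking barrier is so short, here is a toy
sketch in Python. It is not taken from any Squeak VM: the card size,
heap size, and the names store_pointer and dirty_cards are assumptions
chosen for the example. The point is that the barrier itself is just a
shift and an unconditional byte store; the collector later scans only
the dirty cards.

```python
CARD_SHIFT = 9          # assumed card size: 2**9 = 512 bytes
HEAP_SIZE = 1 << 16     # assumed toy heap size: 64 KiB

# One byte per card; non-zero means "some slot in this card was written".
card_table = bytearray(HEAP_SIZE >> CARD_SHIFT)
heap = {}               # toy heap: slot address -> stored reference


def store_pointer(slot_addr, value):
    """Store a reference and unconditionally dirty the card for the slot."""
    heap[slot_addr] = value
    card_table[slot_addr >> CARD_SHIFT] = 1  # the whole barrier


def dirty_cards():
    """Collector side: return the dirty card indices and clear them."""
    dirty = [i for i, flag in enumerate(card_table) if flag]
    for i in dirty:
        card_table[i] = 0
    return dirty


store_pointer(0, "x")
store_pointer(100, "y")      # same card as address 0
store_pointer(1024, "z")     # card index 1024 >> 9 == 2
print(dirty_cards())         # [0, 2]
```

The trade-off is that the collector has to scan each dirty card to find
the interesting pointers, so work moves from every store to collection
time.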
--
> Garbage collection pauses may be a good reason to introduce VM level
> native threads.
I'm very curious how it is proposed to make GC work in parallel with
normal operation. Given the absolute need to keep the object memory in
a sane state at any time the execution of code is proceeding (which is
why using native threads is difficult in the first place), how could we
be moving object data around whilst another thread (or many - remember
the urgent, vital need for native threads) is relying upon the oops
and pointers?
We did quite a bit of work on this in the late eighties and nobody
came up with a plausible, practical solution. Has anyone done so in
recent years?
tim
--
Using Semaphores or Monitors in application-level code is a symptom of
irresponsible design.
Almost no one can get a multi-threaded application that uses more than
one Semaphore to work reliably, regardless of the threading model in
place. And it shouldn't require geniuses to make concurrency work.
There have been a number of high-profile embedded systems failures
because of bad concurrency design (search for "priority inversion" or
"semaphore deadlock" for more background).
For instance, here's an article about what happened to the Mars
Pathfinder mission:
http://research.microsoft.com/~mbj/Mars_Pathfinder/Mars_Pathfinder.html
Instead, we need better models of task concurrency. When possible, I
use SharedQueues or similar synchronized structures to couple messages
into and out of processes. Even this simple scheme (similar to
"mailbox" structures in many RTOS designs) helps considerably.
I've found that I can generally design smaller embedded systems (say
<64K of code and <4K of data) using simple state machines and an
interrupt-driven executive, giving myself robust concurrency without
semaphores or deadlocks.
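
A rough sketch of both ideas together, in Python so it runs standalone:
a synchronized queue used as a mailbox, feeding a table-driven state
machine that handles one event at a time. The lamp states, the event
names, the _STOP sentinel, and the functions executive and demo are all
invented for this example. Python's queue.Queue plays the role that a
SharedQueue would play in Squeak, and the posting loop stands in for
interrupt handlers.

```python
import queue
import threading

# (state, event) -> next state; pairs not listed leave the state unchanged.
TRANSITIONS = {
    ("off", "press"): "on",
    ("on", "press"): "off",
    ("on", "timeout"): "off",
}

_STOP = object()  # sentinel telling the executive to shut down


def executive(mailbox, initial, trace):
    """Own the state exclusively; the mailbox is the only shared object."""
    state = initial
    while True:
        event = mailbox.get()  # blocks until an event arrives
        if event is _STOP:
            return
        # Each event is handled to completion before the next is read.
        state = TRANSITIONS.get((state, event), state)
        trace.append(state)


def demo(events):
    """Post events from another thread of control and return the states seen."""
    mailbox = queue.Queue()
    trace = []
    t = threading.Thread(target=executive, args=(mailbox, "off", trace))
    t.start()
    for event in events:       # stands in for interrupt handlers posting
        mailbox.put(event)
    mailbox.put(_STOP)
    t.join()
    return trace


print(demo(["press", "timeout", "timeout", "press"]))
# ['on', 'off', 'off', 'on']
```

No application-level lock appears anywhere: the only synchronization is
inside the queue, and the state is touched by exactly one thread.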
-- Ned Konz