Native vs Green Threads

As I understand it, when processes are cooperative within a priority 
it's easier to perform time-sensitive operations (like audio): you 
know no other process at that priority is going to interrupt the 
execution of a block of code partway.  It's possible (and fairly 
straightforward) to build preemption on top of Squeak's threading 
model by using a high-priority thread to interrupt lower-priority 
ones.  The reverse is not true: you can't build a cooperative 
threading model on top of a preemptive one.
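
To make the "nothing interrupts a block partway" point concrete, here is a minimal sketch in Python (not Squeak code) of cooperative scheduling among tasks of equal priority. The names `worker` and `run_cooperative` are made up for illustration, and generators stand in for Squeak Processes; the only places a switch can happen are the explicit `yield` points.

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: does one step of work, then yields voluntarily."""
    for i in range(steps):
        log.append(f"{name}{i}")   # this step is never interrupted partway
        yield                      # the only point where a switch can occur

def run_cooperative(tasks):
    """Run equal-priority tasks, switching only at their yield points."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # run until the task's next yield
            ready.append(task)     # still alive: go to the back of the queue
        except StopIteration:
            pass                   # task finished: drop it

log = []
run_cooperative([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a0', 'b0', 'a1', 'b1', 'b2']
```

Everything between two `yield`s behaves like an atomic section with respect to the other tasks, which is the property that makes time-sensitive code easier to reason about.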

-- Here's what would be wrong with native threads:

- They would be less efficient than Squeak Processes. A Process is 
  very lightweight, and context switches in Squeak are very cheap. A 
  context switch using native threads is very expensive by comparison.

- They would be less scalable than Squeak Processes. Squeak can run 
  10K Processes in an image with ease. You just can't do that with 
  native threads.

- They wouldn't run identically on all platforms. A single Squeak 
  image runs identically on all the platforms, and that includes 
  directly on the hardware, without an OS. With native threads, you 
  can get approximately the same behaviour, on some platforms.

- Using native threads makes garbage collection really difficult, 
  particularly if you want to minimize pauses. Given that Squeak is 
  commonly used in multimedia applications, where GC pauses interfere 
  with smooth audio processing and animation, native threads would be 
  problematic.

--

> Why are same priority processes not preemptable in Squeak?

Well, they are. If you need that, it's a piece of cake to write a 
round-robin time-slicing scheduler.

You know, that's *exactly* what happens for native threads, too - the 
OS kernel scheduler runs at higher priority and gives time slices to 
all processes. No process interrupts another one on the same priority 
on its own.
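
As a rough illustration of that idea, the following Python sketch (a model, not Squeak code) simulates a higher-priority timer that rotates the run queue every `quantum` instructions. The function names and the notion of counting "instructions" are assumptions made for the example; each `yield` marks a boundary where the simulated timer interrupt may land, not a voluntary yield by the task.

```python
from collections import deque

def busy(name, instructions, trace):
    """A task that never gives up the CPU voluntarily.
    Each yield is an instruction boundary where a timer interrupt may land."""
    for _ in range(instructions):
        trace.append(name)
        yield

def run(tasks, quantum=None):
    """Run same-priority tasks. With no quantum, the head task runs to
    completion. With a quantum, a simulated higher-priority timer moves the
    running task to the back of the queue every `quantum` instructions."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        used = 0
        for _ in task:                 # execute one instruction at a time
            used += 1
            if quantum is not None and used == quantum:
                ready.append(task)     # timer fired: rotate the run queue
                break

def trace_of(quantum):
    trace = []
    run([busy("a", 3, trace), busy("b", 3, trace)], quantum)
    return "".join(trace)

print(trace_of(None))  # aaabbb
print(trace_of(2))     # aabbab
```

Without the timer the first task monopolises its priority level; with it, the two tasks interleave even though neither ever yields on its own.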

> - Using native threads makes garbage collection really difficult,
>   particularly if you want to minimize pauses. Given that Squeak is
>   commonly used in multimedia applications, where GC pauses interfere
>   with smooth audio processing and animation, native threads would be
>   problematic.

Garbage collection pauses may be a good reason to introduce VM-level 
native threads. Making a parallel old space collector would remove 
pauses from the global GC. These pauses are a problem for some 
multimedia work. Imagine you're using Squeak to process MIDI from a 
live keyboard: a half-second pause will be very noticeable when 
playing. A friend has this problem.

Large Seaside images may also take an uncomfortably long time for a 
full garbage collect.

I suspect that a parallel GC wouldn't be harder, and may be easier, 
than an incremental GC. Once we have a parallel GC then adding a few 
native interpreter threads shouldn't be too hard.

The papers that I looked at suggested that parallel new space 
collection was normally done by stop and copy collectors. So all the 
interpreters are stopped, new space is collected then the interpreters 
are restarted.

However, parallelisation would require changing the write barrier 
implementation to card marking. This would also reduce the 
interpreter's (or compiled code's) write barrier overhead. My write 
barrier implementation takes 24 instructions; a card marking write 
barrier would take 1-3 instructions.
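
For readers unfamiliar with card marking, here is a toy Python model of the general technique (not the implementation described above, and not Squeak VM code). The card size, heap size, and all names are assumptions chosen for illustration. The barrier itself is just a shift and a byte store, which is why it is so cheap; the cost moves to the collector, which must scan the dirty cards.

```python
CARD_SHIFT = 9                      # assumed card size: 2**9 = 512 bytes
HEAP_SIZE = 1 << 16                 # assumed toy heap of 64 KiB
card_table = bytearray(HEAP_SIZE >> CARD_SHIFT)   # one dirty byte per card
heap = {}                           # address -> stored reference (toy model)

def write_field(address, value):
    """Store a reference and mark the card containing the slot as dirty."""
    heap[address] = value
    card_table[address >> CARD_SHIFT] = 1   # the whole barrier: shift + store

def dirty_cards():
    """Indices of cards the collector would have to scan for pointers."""
    return [i for i, dirty in enumerate(card_table) if dirty]

write_field(0x0204, "young object")   # lands in card 1
write_field(0x0208, "young object")   # same card, no extra bookkeeping
write_field(0x1400, "young object")   # lands in card 10
print(dirty_cards())  # [1, 10]
```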

--

> Garbage collection pauses may be a good reason to introduce VM level
> native threads.

I'm very curious how it is proposed to make GC work in parallel to 
normal operation. Given the absolute need to keep the object memory in 
a sane state at any time the execution of code is proceeding (which is 
why using native threads is difficult in the first place), how could we 
be moving object data around whilst another thread (or many; remember 
the urgent, vital need for native threads) is relying upon the oops 
and pointers?

We did quite a bit of work on this in the late eighties and nobody 
came up with a plausible, practical solution. Has anyone done so in 
recent years?

tim

--
Using Semaphores or Monitors in application-level code is a symptom of 
irresponsible design.

Almost no one can get a multi-threaded application that uses more than 
one Semaphore to work reliably, regardless of the threading model in 
place. And it shouldn't require geniuses to make concurrency work. 
There have been a number of high-profile embedded systems failures 
because of bad concurrency design (search for "priority inversion" or 
"semaphore deadlock" for more background).

For instance, here's an article about what happened to the Mars 
Pathfinder mission: 
http://research.microsoft.com/~mbj/Mars_Pathfinder/Mars_Pathfinder.html

Instead, we need better models of task concurrency. When possible, I 
use SharedQueues or similar synchronized structures to couple messages 
into and out of processes. Even this simple scheme (similar to 
"mailbox" structures in many RTOS designs) helps considerably.
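
The mailbox pattern can be sketched as follows. This is Python rather than Squeak, with `queue.Queue` standing in for a SharedQueue; the `doubler` worker and the `None` shutdown sentinel are assumptions made for the example. The worker never shares state with its caller: everything goes in and out through the two queues.

```python
import queue
import threading

def doubler(inbox, outbox):
    """Worker that talks to the outside world only through its queues."""
    while True:
        item = inbox.get()        # blocks until a message arrives
        if item is None:          # sentinel: pass it on and shut down
            outbox.put(None)
            return
        outbox.put(item * 2)

inbox, outbox = queue.Queue(), queue.Queue()
threading.Thread(target=doubler, args=(inbox, outbox), daemon=True).start()

for n in (1, 2, 3):
    inbox.put(n)
inbox.put(None)

# Drain replies until the sentinel comes back.
results = list(iter(outbox.get, None))
print(results)  # [2, 4, 6]
```

No explicit semaphore appears in the application code; the locking is hidden inside the queue.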

I've found that I can generally design smaller embedded systems (say 
<64K of code and <4K of data) using simple state machines and an 
interrupt-driven executive, giving myself robust concurrency without 
semaphores or deadlocks.
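
A minimal sketch of that style, in Python for readability (real firmware of this size would more likely be C or assembler). The debounced-button states, the event names, and the `interrupt` helper are hypothetical. Interrupt handlers only queue events, and the executive processes them one at a time, so each transition runs to completion and nothing needs a lock.

```python
from collections import deque

# Transition table for a hypothetical debounced button:
# (current state, event) -> next state
TRANSITIONS = {
    ("idle", "press"): "debouncing",
    ("debouncing", "tick"): "pressed",
    ("debouncing", "release"): "idle",
    ("pressed", "release"): "idle",
}

events = deque()   # filled by "interrupt handlers", drained by the executive

def interrupt(event):
    """Stand-in for an interrupt handler: it only records the event."""
    events.append(event)

def run_executive(state):
    """Handle queued events one at a time; unexpected events are ignored."""
    history = [state]
    while events:
        event = events.popleft()
        state = TRANSITIONS.get((state, event), state)
        history.append(state)
    return history

for e in ("press", "tick", "tick", "release"):
    interrupt(e)
print(run_executive("idle"))
# ['idle', 'debouncing', 'pressed', 'pressed', 'idle']
```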

-- Ned Konz