Squeak Smalltalk : System : Threads

> I was curious how threads or process forking is implemented in Squeak.  Do
> consistent concurrency patterns exist that can be implemented amongst
> different flavours of Smalltalk?  If so, what are the implications as such?
>

In Squeak, a good way is to communicate with message queues, as
implemented in SharedQueue.  Otherwise, you can use Semaphores as
mutexes.

Ruby has its own threads, much like Squeak's (similar to Java's "green
threads"). Ruby takes more care to avoid blocking on file I/O, however.
Squeak has had async file I/O on some platforms, but I don't think it's
universal.

There are people who would like to see native threads being used (in Ruby as
well as in Squeak) but this would be a pain because of the different thread
APIs (and Ruby, like Squeak, also runs on platforms that don't necessarily
have their own threading model).

>  Ultimately
> you only get signaling since threads are just a subprogram/process.  But it
> is nice to be able to have an API that deals with issues like "join",
> "rendezvous", "wait", mutex, sem etc..

Which can be done regardless of native/non-native threading. Squeak has a
Semaphore class that can be used for a variety of patterns. Ruby has its
global Thread.critical flag.

> However if the pattern is to use "message queueing" in Smalltalk (or maybe
> just squeak?) then perhaps it is the better approach.  I guess you can
> fork, create a new process or whatever all with the same mechanism so it
> sounds very appealing.  What I wanted to know is if this is the general
> approach or is this just a "workaround" for green threads?

I don't know what you mean about "workaround". As Lex said, the basis of
concurrency management in Squeak is the Semaphore (and of course the
scheduler itself).

There really is no good reason (that I can think of) for supporting
Smalltalk execution on multiple OS threads on a single-CPU machine.  It
will only degrade your performance and not offer any concurrency
capabilities that Squeak doesn't already have.  Supporting asynchronous
primitives using OS threads is a different matter.

We should however make Squeak capable of utilizing all the CPUs in a
multi-CPU system.  This would involve one OS thread per CPU, each of which
appears as another Processor in Squeak, and scheduling Squeak Processes
on those Processors.  We would need a mechanism for ensuring that the
multiple processes do not step on each other, which would likely involve
a change in the object format.  Linda is probably a good conceptual
model for this.  You'd also want to maintain a single-CPU and multi-CPU
version of the VM such that users on single-CPU machines are not
penalized by the additional accounting required in the multi-CPU VM.

In Squeak threads, the concept of "parent" and "child" isn't too important.
Once the thread (Process in Squeak) is created, the relationship isn't
recorded anywhere.

The threads both continue to run.
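
A minimal sketch of this, meant to be evaluated in a Squeak workspace. The
variable names are made up for the example. The "Processor yield" is there
only because a forked Process normally starts at the forker's priority, so it
does not get the CPU until the forker gives it up.

```smalltalk
| log child |
log := OrderedCollection new.

"fork answers the new Process right away; nothing links it back to its creator."
child := [log add: 'child ran'] fork.

log add: 'parent continues'.
Processor yield.        "give other Processes at this priority a turn"
log
```

The Process held in child is the only handle the creator has. If that
reference is dropped, the two simply go their separate ways.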

One common design pattern is where a worker thread receives requests using a
SharedQueue. It blocks every time it does a "queue next" and there isn't
anything on the queue. Other threads that want to give it work write to the
queue.
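
The worker pattern might look roughly like this in a workspace. This is a
sketch, not code from any particular image: the two queues, the squaring
"work" and the #stop marker are all assumptions made for the example. Only
SharedQueue's next and nextPut: come from Squeak itself.

```smalltalk
| requests results worker answers |
requests := SharedQueue new.
results := SharedQueue new.

"Worker: blocks inside 'requests next' whenever the queue is empty."
worker := [
    | n |
    [(n := requests next) == #stop]
        whileFalse: [results nextPut: n * n]
] fork.

"Other Processes hand over work by writing to the queue."
1 to: 3 do: [:i | requests nextPut: i].
requests nextPut: #stop.    "#stop is this sketch's own end marker"

"Reading the answers blocks until the worker has produced each one."
answers := (1 to: 3) collect: [:i | results next].
answers
```

All the thread-related code sits at the nextPut: and next calls, so there is
no separate lock to remember.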

For the concurrent server, you'd accept a new connection, then fork a thread
to process it. That thread might communicate with other threads using
SharedQueues or use resources serialized by Semaphores.

Other patterns (for instance, futures or promises, timeouts, thread joins,
etc.) are also possible using Semaphores as the primitive synchronization
mechanism.
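
As one illustration, a thread join (which also works as a very small
"future") can be sketched with a single Semaphore. The names done, result
and worker are hypothetical, and the summing loop just stands in for real
work.

```smalltalk
| done result worker |
done := Semaphore new.      "a new Semaphore starts with no excess signals"

worker := [
    result := (1 to: 100) inject: 0 into: [:sum :each | sum + each].
    done signal             "announce that result is ready"
] fork.

"... the forking Process is free to do other work here ..."

done wait.                  "join: block until the worker has signalled"
result
```

For the timeout variant, Squeak's Semaphore also understands
waitTimeoutMSecs:, which gives up after the given number of milliseconds.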

Ah.  Well, those aren't floating around in Squeak, though they can be
implemented on top of semaphores.  For rendezvous (is that like a
barrier?) of k threads, signal the semaphore k times and then have
everyone do a "wait" at the semaphore.  For mutex, signal it once to
start with, and then go into a critical block by wait-ing and then
remember to signal on the way out (see the #critical: method).  Are join
and rendezvous different?  Blah, it's been a while since I studied this
stuff.
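
The mutex case, sketched for a workspace. The shared counter and the four
workers are invented for the example. Semaphore forMutualExclusion answers a
Semaphore that has already been signalled once, and critical: does the wait
before running its block and the signal on the way out.

```smalltalk
| mutex counter done |
mutex := Semaphore forMutualExclusion.
counter := 0.
done := Semaphore new.

4 timesRepeat: [
    [100 timesRepeat: [
         "Only one Process at a time gets inside the critical block."
         mutex critical: [counter := counter + 1]].
     done signal] fork].

4 timesRepeat: [done wait].     "wait until every worker has finished"
counter
```

Using critical: rather than separate wait and signal calls means the signal
cannot be forgotten, since it is sent even when the block exits early.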

It's really a bug-prone way to write concurrent programs.  If you signal
one time too few, then your barrier lets threads through a little too
quickly.  More commonly, if you forget to put a mutex around certain
chunks of code, then multiple threads can suddenly clobber each others'
work.  The situation is a lot like with manual memory management: for
simple systems, you can do it by hand and have a simple result.  For
complex systems, you can drive yourself crazy, and you are almost sure
to make mistakes that are very hard to track down.  Message passing, on
the other hand, isn't as bad: all the thread-relevant code is right near
the message-send and message-receive calls.

Another thing to consider is that in Squeak, more threads isn't going to
gain you performance.  Thus, the reason to use threads in Squeak would
be for design: sometimes you are writing a program that is inherently
parallel.

Now, the "inherent" parallelism of a program seems to vary widely
depending on the observer, meaning in fact that it's not inherent after
all.  :)  I personally see these opportunities pretty rarely.  Perhaps,
though, this is from using Squeak so much -- the development environment
works much better for single-threaded programs.