> What is stopping Squeak from doing SMP? (or Async MP for that matter?)
I don't think it's quite as simple as it might seem.
>
> The Squeak VM classes seem to be pretty finely threaded, and
> increasingly modularized.
Actually, very little of the class tree is thread-safe. Consider one thread
adding items to a collection whilst another removes them - you'd need
interlocks/monitors/semaphores/whatever. We don't have them installed everywhere
and likely never will. Unless of course someone really feels the urge to
completely rewrite a huge amount of code to support such fine-grained threading
safely. In practice you can get away with a huge amount, as always. Just don't
sell the code to a nuclear power plant operator, please!
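
To make the add/remove problem concrete, here is a minimal sketch in Python
rather than Smalltalk. The names (LockedBag, remove_any, run_demo) are made up
for illustration and are not anything in the Squeak image; the point is only
that every operation touching the shared collection has to go through the same
lock.

```python
import threading


class LockedBag:
    """Hypothetical wrapper: every access to the inner list takes one lock."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def add(self, item):
        with self._lock:
            self._items.append(item)

    def remove_any(self):
        """Remove and return some item, or None if the bag is empty."""
        with self._lock:
            return self._items.pop() if self._items else None

    def size(self):
        with self._lock:
            return len(self._items)


def run_demo(n=1000):
    """One thread adds n items while another removes them."""
    bag = LockedBag()
    removed = []  # only the consumer thread appends to this

    def producer():
        for i in range(n):
            bag.add(i)

    def consumer():
        # Keep polling until every item the producer adds has been taken.
        while len(removed) < n:
            item = bag.remove_any()
            if item is not None:
                removed.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(removed), bag.size()
```

Doing this for one class is easy enough; the trouble is doing it consistently
for every collection, stream and cache in the image.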
> Why can't one make a copy of the OS/platform specific guts of the Squeak
> bytecode interpreter or JIT compiler object for each physical
> processor (maybe make it live in the processor cache since it isn't
> THAT big) and have the scheduler serve threads (or whole processes
> if they're independent enough) to available processors using a
> priority scheme of one's choice (which, this being Squeak, should be
> able to be changed out or altered on the fly if available hardware
> or loads change during execution).
First problem is that in general processor caches are not under our control; the
CPU has hardware that caches lines of some size, and as process switches occur
different cache lines get filled from many places. Some CPUs may flush lines as
part of a context switch while others don't. Basically, the idea that "hey, our
little interpreter can fit in the cache" is not realistic. Except, interestingly
enough, in my favourite, the ARM. The latest ARM architecture allows for a
sort-of cache (tightly coupled memory, TCM) that IS under application control
and WOULD allow for the VM to be loaded along with crucial data and kept there.
Up to 4MB of it, which is certainly enough for a lot of useful stuff. Of course,
there are then issues of which application gets control of this TCM etc. And
it's a bit tricky to actually go out and buy an ARM v6 CPU right now.

Next problem is sharing the
object memory efficiently and reliably. Address spaces? Garbage collection?
Referential integrity? Is there a single object space or many? Do all CPUs think
of themselves as sharing the same actual memory or do they have separate memory?
Even if you could have multiple execution units sharing exactly the same memory
space (hmm, another ARM, the MPCore, springs to mind) I think it would be a
goodly bit of work.

To some extent you could easily benefit from 'normal'
multithreading in the VM (the OSX & Windows VMs certainly do some) to handle
user input, socket signals, stuff like that. Perhaps you could also dedicate a
CPU to tracking memory usage and modifying GC policy, watching code usage and
asynchronously heavily optimising some chunks of translated code, even perhaps
doing things like cleaning up memory left behind by compaction (so object
allocation could avoid having to scan the area).

HP used to sell a distributed Smalltalk (I think they passed it back to Cincom
but I'm not sure) but that was more a case of multiple Smalltalks talking to
each other via something like CORBA.
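
As a rough illustration of the 'normal' multithreading mentioned above, here is
a small Python sketch of a helper thread that watches for input and hands
events to a single interpreter loop through a thread-safe queue. This is an
assumption about the general shape of such a design, not a description of how
the OSX or Windows VMs actually do it, and every name in it (run_demo,
io_helper, DONE) is hypothetical.

```python
import queue
import threading


def run_demo(raw_events):
    """Helper thread feeds events to one 'interpreter' loop via a queue."""
    events = queue.Queue()  # the only structure shared between the threads
    DONE = object()         # sentinel marking the end of input

    def io_helper():
        # Stand-in for a native thread watching the keyboard or sockets.
        for ev in raw_events:
            events.put(ev)
        events.put(DONE)

    helper = threading.Thread(target=io_helper)
    helper.start()

    handled = []
    # Only this loop touches `handled`, the stand-in for the object memory,
    # so nothing else needs to be locked.
    while True:
        ev = events.get()  # blocks until the helper delivers something
        if ev is DONE:
            break
        handled.append(ev.upper())

    helper.join()
    return handled
```

The attraction of this arrangement is that the object memory stays
single-threaded: the helper never touches it, so none of the class tree has to
become thread-safe.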