Low Process Priority Image Freeze

- Therefore, the UI process does not wait on the delay in 
WorldState>>interCyclePause:

- Because the UI main loop only does a yield (see Project 
class>>spawnNewProcess), the UI process stays runnable all the time, as 
there is no other process with p = 40.

- Therefore, no process with p < 40 has a chance to be activated (only 
higher ones, which we find in the trace). This also explains why we see 100% 
CPU usage, but still the UI responds immediately.

This sounds like a reasonable explanation.
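
To make the pause logic concrete, here is a small Python sketch (not Squeak 
code). It assumes that interCyclePause: sleeps only for whatever is left of 
MinCycleLapse after the cycle's own work; the function name and its 
parameters are made up for illustration. The 3 x 7ms cycle is the IGC load 
mentioned elsewhere in this thread, and 40ms is the value tried in option a).

```python
def inter_cycle_pause(last_cycle_start_ms, now_ms, min_cycle_lapse_ms=20):
    """Return how long the UI process would sleep, in ms (0 = no sleep).

    Assumed rule: sleep for the remainder of the minimum cycle lapse,
    and skip the sleep entirely if the cycle already took that long.
    """
    wait = last_cycle_start_ms + min_cycle_lapse_ms - now_ms
    return wait if 0 < wait <= min_cycle_lapse_ms else 0


# One UI cycle containing three incremental GCs of about 7 ms each.
busy_cycle_ms = 3 * 7

print(inter_cycle_pause(0, busy_cycle_ms))      # 0: no Delay, UI stays runnable
print(inter_cycle_pause(0, busy_cycle_ms, 40))  # 19: with a 40 ms lapse it sleeps
print(inter_cycle_pause(0, 5))                  # 15: a cheap cycle sleeps as usual
```

Under this assumption a cycle of 21ms never reaches the Delay with a 20ms 
lapse, so the only thing left in the loop is the yield.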

Now, why does moving the mouse make it run again? I have no idea... my guess 
is that the behavior triggered by a mouse move event somehow forces a full 
GC. In the trace we see that once the 107th full GC is done, there are far 
fewer incremental GCs later on. Hence, it is much more likely that the UI 
process pauses again.

Tenuring might fix it, too. And it may just be that your wiggling the mouse 
creates the bit of extra garbage needed to make the VM tenure.

How could we fix this?
-----------------------------------------

a) Simply increase the 20ms pause defined by MinCycleLapse (at least for 
production systems) or tweak the "pause logic". As a test I defined 
MinCycleLapse to be 40ms. I could not reproduce the problem anymore.

b) In production, suspend the UI process and only turn it on again when you 
need it (we do this via HTTP). This should also improve performance a bit. At 
best this is a workaround.

c) Tune the GC policies, as they are far from optimal for today's systems (as 
John has suggested a couple of times). It seems, though, that this cannot be 
guaranteed to fix the problem, but it should make it less likely to happen(?).

d) Don't use processes that run below user scheduling priority. To be honest, 
I'm not sure why you'd be running anything below UI priority on a server.

e) Make a preference "lowerPerformance" (or call it "headlessMode" if you wish 
:^) and have the effect be that in interCyclePause: you *always* wait for a 
certain amount of time (20-50ms). This will ensure that your UI process can't 
possibly eat up all of your CPU.
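
As a rough way to compare options a) and e), the following Python sketch (not 
Squeak code) estimates how much wall-clock time is left for processes below UI 
priority. It assumes a simplified model in which lower-priority processes run 
only while the UI process sleeps in its Delay; the function and parameter 
names are hypothetical, and the 21ms cycle time is just three IGCs of about 
7ms each.

```python
def low_priority_share(cycle_times_ms, min_cycle_lapse_ms=20, forced_pause_ms=0):
    """Fraction of time available to processes below UI priority.

    Assumed model: after each UI cycle the UI process sleeps for the
    remainder of min_cycle_lapse_ms, but never less than forced_pause_ms.
    Lower-priority processes get the CPU only during that sleep.
    """
    busy = sum(cycle_times_ms)
    idle = sum(max(min_cycle_lapse_ms - t, forced_pause_ms, 0)
               for t in cycle_times_ms)
    return idle / (busy + idle)


cycles = [21] * 100  # every UI cycle takes 21 ms of work

print(f"{low_priority_share(cycles):.3f}")                         # 0.000
print(f"{low_priority_share(cycles, min_cycle_lapse_ms=40):.3f}")  # 0.475
print(f"{low_priority_share(cycles, forced_pause_ms=20):.3f}")     # 0.488
```

In this model raising the lapse to 40ms helps only as long as cycles stay 
below 40ms, whereas a forced pause leaves time for the lower-priority 
processes no matter how long a cycle takes.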

I'd be interested in getting feedback on:
- whether the explanation sounds plausible

It does. There is, however, the question of what could possibly generate 
enough load on the garbage collector to run IGCs that take 7ms on average, 
with three of them in a single UI cycle.

- whether the fix (e.g., a)) works for other people who have this problem.
- what may be a good fix

I'd probably go with option e) above since it ensures that there is always 
time left for the lower priority processes (and you don't have to change other 
code). Everything else seems risky since you can't say for sure if there isn't 
anything that keeps the UI in a busy-loop.