Garbage Collection Performance Young Space

> I discovered a curious performance issue in Squeak (3.7 and 3.6, Mac
> and Windows):
>
> Squeak idling cpu use (2% on 3Ghz XP and 9% on 350 Mhz G3)
> then do it:
> t := Array new: 1000000.
>
> Array allocation is quick but cpu usage jumps to 15% (XP) 35%(on G3)
> and stays there.
> Tried it in VW 7.2 and it didn't do this.
> Using new: 1000 didn't trigger the cpu use either.
>
> Smalltalk garbageCollect.  "Fixes it even though the object is still
> instantiated"
>
> What causes this?
>
>

This shows nicely how the garbage collector works. After starting up,
VM Statistics (you find this under "help") shows:

uptime   0h0m39s

memory   26,956,320 bytes
 old   22,719,428 bytes (84.3%)
 young  338,732 bytes (1.3%)
 used  23,058,160 bytes (85.5%)
 free  3,898,160 bytes (14.5%)

GCs    520 (75ms between GCs)
 full   3 totalling 810ms (2.0% uptime), avg 270.0ms
 incr  517 totalling 2,241ms (6.0% uptime), avg 4.0ms
 tenures  2 (avg 258 GCs/tenure)

OK. We see that we have an "old" space and a "young" space. Objects are
created in the young space. So after allocating the array, we see:

uptime   0h0m54s

memory   26,957,952 bytes
 old   18,561,168 bytes (68.9%)
 young  4,379,268 bytes (16.2%)
 used  22,940,436 bytes (85.1%)
 free  4,017,516 bytes (14.9%)
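
The "old" and "young" figures can also be read directly in a workspace.
This is a minimal sketch, assuming the VM parameter indices listed in
the comment of SystemDictionary>>vmParameterAt: in Squeak images of this
era (1 = end of old space, 2 = end of young space); check that comment
in your own image before relying on them.

```smalltalk
| oldEnd youngEnd |
"Assumed indices: 1 = end of old space, 2 = end of young space."
oldEnd := Smalltalk vmParameterAt: 1.
youngEnd := Smalltalk vmParameterAt: 2.
"Young space is whatever lies between the two boundaries."
Transcript show: 'old space bytes:   ', oldEnd printString; cr.
Transcript show: 'young space bytes: ', (youngEnd - oldEnd) printString; cr.
```

Run it once before and once after the "Array new: 1000000" do-it to
watch the young space grow.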


So your array was created in the young space, like all new objects. Now
the young space is the only part of memory that gets garbage collected
often (we call this "incremental GC"). So after the Array was
allocated, every incremental GC has to clean a huge young space (over
4 MB), which takes time. Normally objects die young, so most garbage
can be collected in the young space, and the young space does not grow
that fast. But some objects live longer (like the Array). They need to
be moved to old space. This happens either when we do a full GC (which
moves everything to old space and does a GC there), or by "tenuring"
objects from young to old without a full GC. I'm not quite sure what
the tenuring policy is in Squeak, but I think that it just moves all
objects in the young space to old space when the number of surviving
objects exceeds a certain threshold after an incremental GC. This would
not take the amount of space used into account, so just allocating one
huge array does not trigger tenuring. I think that VisualWorks either
has a third space for large objects, or large objects are created
directly in old space somehow.
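
The cost of an incremental GC before and after the allocation can be
measured roughly from a workspace. This is only a sketch: incrTime is a
hypothetical helper block, the count of 20 forced collections is
arbitrary, and reading VM parameter 6 as the survivor-count tenuring
threshold is an assumption taken from the comment of
SystemDictionary>>vmParameterAt:, which would fit the guess above.

```smalltalk
| incrTime t before after |
"Hypothetical helper: milliseconds spent in 20 forced incremental GCs."
incrTime := [Time millisecondsToRun:
	[20 timesRepeat: [Smalltalk garbageCollectMost]]].
before := incrTime value.
t := Array new: 1000000.	"the array lands in young space"
after := incrTime value.
Transcript show: 'incremental GCs before: ', before printString, ' ms'; cr.
Transcript show: 'incremental GCs after:  ', after printString, ' ms'; cr.
"Assumed index: 6 = survivor count tenuring threshold."
Transcript show: 'tenuring threshold: ',
	(Smalltalk vmParameterAt: 6) printString; cr.
```

If the tenuring policy counts survivors rather than bytes, a single
million-element array adds only one survivor, so the second timing
should stay high until something else forces a tenure.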

Marcus

Once the array is in new space, we iterate over all 1,000,000 elements
on every new space GC event to mark/trace them. Iterating over a
million items takes time, and since it happens on every new space GC
event, you see CPU usage go up. Smalltalk garbageCollect forces a
tenure of the survivors in new space to old space by compacting the
objects and moving the new space pointer to the end of allocated
memory. This drops the large array from the scope of the new space GC
event: it is tenured and becomes an old object. Note that if you then
allocate a new object and store it into the Array, the million-element
array becomes a root object, since it now points to an object in new
space. That makes us iterate over the million elements again, and CPU
time will go way up. Solving this issue is an exercise for the reader.
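
The root-object effect described above can be tried out with a few
lines in a workspace. This is a sketch only: incrTime is a hypothetical
helper block, the count of 20 forced collections is arbitrary, and the
actual timings depend on the VM and the machine.

```smalltalk
| incrTime t |
"Hypothetical helper: milliseconds spent in 20 forced incremental GCs."
incrTime := [Time millisecondsToRun:
	[20 timesRepeat: [Smalltalk garbageCollectMost]]].
t := Array new: 1000000.
Smalltalk garbageCollect.	"full GC: t is tenured into old space"
Transcript show: 'tenured array:        ', incrTime value printString, ' ms'; cr.
t at: 1 put: Object new.	"t now refers to a young object and becomes a root"
Transcript show: 'tenured array as root: ', incrTime value printString, ' ms'; cr.
```

Comparing the two Transcript lines shows whether a single store of a
young object into the old array brings back the per-GC scan of all its
elements.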