Squeak SmalltalkJoker Squeak Smalltalk : Image VM OS Application : prevnext Exupery JIT Minimize Instructions

 > In Java
 > 
 >   long x = 0;
 >     long timeStart = 0;
 >  
 >     timeStart = System.currentTimeMillis();
 >  
 >     for (int i = 0; i < 1000; i++) {
 >         for (int j = 0; j < 100000000; j++) {
 >             x = x + 1;
 >         }
 >     }
 >  
 >     timeStart = System.currentTimeMillis() - timeStart;
 >     System.out.println(new Long(timeStart).toString());
 >     System.out.println(new Long(x).toString());
 >  
 > ==> 487 secondes.

How long is a Java long? What does it do when it overflows?

I calculated that the answer is 100,000,000,000 which is much
bigger than a SmallInteger so Squeak will be using arbitrary
precision math (BigIntegers).

Try a loop like:
    | x |
    x := 0.
    1
	to: 100000000
	do: [:each| x := x + 1].

Another problem is timesRepeat: is a real method that uses a real
block.  That's currently expensive. Think a full message send rather
than just bytecode execution.

Put the code in a method and view the bytecodes to see what's
going on.

Yes, it is possible to optimise away the block overhead but
currently we don't.

It takes my computer about 8 seconds for my (using the
interpreter). After compiling with Exupery my loop 1.5 seconds to
execute. (I've removed the 1000 timesRepeat: to speed things up and
avoid LargeIntegers).

So why is interpreted Squeak 16 times slower than Java?
   1) It's interpreted
   2) It tag checks the arguments then detags them
   3) It checks if the result overflows then it tags the result
   as an integer.

For the interpreter only 1) really matters, the CPU seems to branch
mispredict about once per bytecode. That takes around 10-30
cycles. The rest of the interpretation time is hidden by that.

2) and 3) matter if you're compiling Squeak. They could be
removed with global type analysis. 

Given enough programming time Squeak could compete.

Bryce
---

I benchmarked just the inner loop to avoid the big integer problems.
I get 8 seconds interpreted and 1.5 seconds compiling with Exupery.

Knowing how sloppy Exupery's generated code is Java isn't that hard
a target to compete with.

Squeak needs to tag check, detag, add, overflow check, then retag
each instruction. Given the current object format that's at least
8 instructions. Java can probably manage with 2 assuming it checks
for overflow. Exupery generates 10. Exupery is also storing
temporaries in the MethodContext rather than registers.

2 instructions could be achieved with sufficient effort. Probably less
than John's tens of millions.

Bryce
--
> Besides, one of the more interesting things to do is to not use 
"long/int" but rather the Object integer type.


Yep. On my machine the speed then is almost equal. Mind you, Squeak 
Interpreter vs. Java JIT:

    | x |
    x := 0.
    (Time millisecondsToRun: [
        1 to: 2 do: [: i|
            1 to: 100000000 do: [:j | x := x + 1]]]) -> x

80244->200000000

=================

class Numtest
{
    public static void main(String args[]) {
    Long x;
    Long timeStart;
       Integer one = new Integer(1);
    
    x = new Long(0);
    timeStart = new Long(System.currentTimeMillis());
    
    for (Integer i = new Integer(0); i.intValue() < 2; i=new Integer(i.intValue()+one.intValue())) {
        for (Integer j = new Integer(0); j.intValue() < 100000000; j=new Integer(j.intValue()+one.intValue())) {
        x = new Long(x.longValue() + one.longValue());
        }
    }
    
    timeStart = new Long(System.currentTimeMillis() - timeStart.longValue());
    System.out.println(timeStart.toString());
    System.out.println(x.toString());
    }
}

% java -version
java version "1.4.2_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_05-141.3)
Java HotSpot(TM) Client VM (build 1.4.2-38, mixed mode)
% javac Numtest.java && java Numtest
66851
200000000

==============

Squeak: 80 seconds
Java: 66 seconds

Meaning they are almost equal, with Java being faster by 20%, but by 
not several orders of magnitude. Not bad for a pure Interpreter, I'd 
say. Would be interesting to try with Auto-Boxing/Unboxing, but I 
don't have a newer Java version installed.

- Bert - 

Here are the timings on my machine using JDK 1.5 and the autoboxing facilities

Squeak 3.7                                         -> 23899
JDK 1.5, java primitives                      -> 2844
JDK 1.5, autoboxing and wrappers     -> 60487

So idiomatic Java is an order of magnitude faster than Squeak in this 
test(if it is worth calling it a test), but squeak is twice or so as 
fast if you use primitive wrappers and autoboxing.

Of course this is a highly contrived example, purpose designed to 
make java look bad by using contrived non-idiomatic code.

Rob

.....
class NumtestIdiomatic
{
   public static void main(String args[])
   {
       long x = 0;

       long timeStart = System.currentTimeMillis();

       for (long i = 0; i < 2; i++)
           for (long j = 0; j < 100000000; j++)
               x++;

       System.out.println(System.currentTimeMillis() - timeStart);
       System.out.println(x);
   }
}
.....
class NumtestWrappersAndAutoboxing
{
   public static void main(String args[])
   {
       Long x = 0l;

       Long timeStart = System.currentTimeMillis();

       for (Long i = 0l; i < 2; i++)
           for (Long j = 0l; j < 100000000; j++)
               x++;

       System.out.println(System.currentTimeMillis() - timeStart);
       System.out.println(x);
   }
} 

--
>Of course this is a highly contrived example, purpose designed to make 
>java look bad by using contrived non-idiomatic code.

No, it's trying to compare vaguely alike code. The original algorithm
is clearly a nonsensical piece of code with no plausible value for the
real world. There aren't many places where performance of a trivial
loop doing trivial arithmetic on guaranteed same-type value is actually
important. (except perhaps in signal processing type work)
But if one has to use such pointless code at least it should be
producing realistic semantics and since Smalltalk treats numbers as
proper objects and not some dememented 'primitive type' with restricted
capabilities then the java code must be written to do the same.

For a more meaningful comparison one must use bigger programs with plausible
behaviour. Not that I've actually looked, but I can't say I've ever
actually seen a java program that does anything useful. Please, please
don't anybody send me a list though - I really don't care.

tim