The Stalin Compiler

Stalin is a questionably named Scheme compiler written by Jeffrey Mark Siskind that can supposedly create binaries as fast or faster than Fortran or C for numerical problems. To test this, I tried creating a simple program to numerically integrate \sqrt{x} from 0 to 10000. To make things interesting, I used a manual Newton’s method implementation of sqrt from SICP.  The integration is done by a simple tail-recursive method.

The scheme code is very pretty:

(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      (sqrt-iter (improve guess x)

(define (improve guess x)
  (average guess (/ x guess)))

(define (average x y)
  (/ (+ x y) 2))

(define (good-enough? guess x)
  (< (abs (- (* guess guess) x)) 0.001))

(define (mysqrt x)
  (sqrt-iter 1.0 x))

(define (int x acc step)
  (if (>= x 10000.0)
      (int (+ x step)
           (+ acc (* step (mysqrt x)))

(write (int 0.0 0.0 .001))

I then converted this to C. It is pretty much a transliteration, except it uses a for loop instead of recursion to accumulate the values:

#include "stdio.h"

double improve(double guess, double x);
double average(double x, double y);

double sqrt_iter(double guess, double x){
  if( good_enough(guess, x))
    return guess;
    return sqrt_iter( improve(guess,x), x);

double improve(double guess, double x){
  return average(guess, x/guess);

double average(double x, double y){
  return (x+y)/2;

int good_enough(double guess, double x){
  if (abs(guess*guess-x)<.001)
    return 1;
  return 0;

double mysqrt(double x){
  return sqrt_iter(1.0, x);

  double rez = 0;
  double x;
  double step = .001;
  for(x=0; x<= 10000; x+=step)
    rez += mysqrt(x)*step;
  printf("%f\n", rez);

I compiled the two methods via:

stalin -On -d -copt -O3
gcc -O3 int.c

The results are:
Stalin: 1.90s
gcc: 3.61s

If you declare every method inline in C, you get:

gcc-inline: 3.28s

For reference, I also compiled the code with chicken scheme, using the -optimize-level 3 compiler flag.

Chicken Scheme: 27.9s.

Some issues:

  • The Scheme code is more appealing on the whole, though it suffers from a lack of infix notation: (< (abs (- (* guess guess) x)) 0.001) is definitely harder to read than abs(guess*guess-x)<.001.  I wonder if more familiarity with prefix code would reduce this.
  • All three methods give slightly different results, which is worrisome.  In particular, Stalin is correct to 6 digits, whilst gcc and chicken are correct to 7.
  • Stalin apparently does not include macros.  One would think it would be easy to use the macro system from a different scheme compiler, and then send the source to stalin for final compilation.
  • Stalin is extremely slow to compile.  In principle this isn’t a big deal: you can debug using a different scheme compiler.  Still, Stalin seems to be somewhat less robust to edge cases, than at least chicken scheme.
  • It is amazing that Scheme code with no type declarations can beat C by almost a factor of 2.
  • Though in principle Stalin produces intermediate c code, it is utterly alien and low-level.  I have not been able to determine exactly what options Stalin is using when it calls gcc on the source code.  That could account for some of the difference.
  • A detailed writeup on Stalin is here.

10 thoughts on “The Stalin Compiler

  1. Nice, but part of the reason the C version is slow is that you are using abs instead of fabs. abs takes an int as an argument (this should be a compiler warning!) – passing a double other than zero always returns a large number. The C version is basically iterating till the error is below double’s resolution.

    printf(“%d,%d\n”,abs(0.0001)<0.001,fabs(0.0001)<0.001); outputs 0,1

  2. Excellent point! Making the switch to fabs brings the gcc-inline speed down almost exactly to what Stalin is getting.

  3. If you really want to see stalin kick C across the room, have the benchmark call a numeric integration routine, where the function to be integrated is a parameter to the integration routine.

    If you want to see stalin kick C and then beat it while it’s down, make it a double integral by having the function to be integrated involve a call to the integration routine, i.e., nesting.

  4. Familiarity with prefix code is exactly what you need to easily read it. In my experience, after 1000 lines of prefix code, it will look as natural as infix (and you won’t have the precedence problems infix notation has).

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s