Some optimizations to consider.

Better letrec compilation.

In a tight loop, the overhead of the unboxing and closure unpacking
that letrec and lambda introduce can be significant. But you can avoid
that overhead if you see a letrec-bound function with no free variables
(besides itself). Specifically, you can turn it directly into an L4
function. To make this work, you introduce a label version of the
function name (replacing uses in the body) and put the function in the
program as a new top-level L4 function. Your code must then also treat
the label as if it were a primitive, i.e., a call site with a label in
the function position does not change, and a label anywhere else in the
program has to get wrapped in a lambda. For example:

  (letrec ([fib (lambda (n)
                  (if (< n 2)
                      1
                      (+ (fib (- n 1))
                         (fib (- n 2)))))])
    (fib 30))

  =>

  ((:fib 30)
   (:fib (n)
     (if (< n 2)
         1
         (+ (:fib (- n 1))
            (:fib (- n 2))))))

A little trickier, but possibly also profitable: identify functions
where every call site has all of the closure variables in scope (in
general, only the site of the lambda expression itself is guaranteed to
have those variables in scope). You can then transform those functions
so that they get their closure variables via regular arguments, which
makes them suitable for the transformation above. One example would be
a function like this:

  (let ([a (new-array 100 0)])
    (letrec ([init-array
              (lambda (i)
                (if (< i (alen a))
                    (begin (aset a i i)
                           (init-array (+ i 1)))
                    0))])
      (init-array 0)))

If you transform this function so that 'a' becomes a new argument, you
can pass it along at all call sites, and then the function becomes
eligible for the transformation above.

Inlining.

Some functions do relatively simple things, to the point that the code
that deals with calling and returning can dominate in your compiled
code.
Fib is one such program; if you look at the L1 output of a compiled fib
program (starting from the L4 program above, say), you'll find that, as
a percentage, lots of the code is dealing with function calls and
returns and relatively little is doing addition. To get a better ratio,
you can do function inlining. Specifically, if you see:

  (f e ...)

in your program and you know that "f" is only ever bound to one
specific function, you can replace the call site with a let that binds
the arguments, followed by the body of the function. More precisely, if
'f' is bound to (lambda (x ...) e_body), then you can replace the call
above with

  (let ([x e] ...) e_body)

For fib, doing inlining twice (once for each recursive call) and then
cleaning up a little bit yields this:

  (letrec ([fib (lambda (n)
                  (if (< n 2)
                      1
                      (+ (let ([n (- n 1)])
                           (if (< n 2)
                               1
                               (+ (fib (- n 1))
                                  (fib (- n 2)))))
                         (let ([n (- n 2)])
                           (if (< n 2)
                               1
                               (+ (fib (- n 1))
                                  (fib (- n 2))))))))])
    (fib 30))

The three dangers here are:

  1) you have to be careful that the inlining process itself
     terminates,
  2) your register allocator will get a real workout if you do too much
     inlining (the register allocation is the only part of the compiler
     that isn't linear and inlining can slow it down substantially), and
  3) if you don't specifically avoid it, you can end up with bad nesting
     in `if` expressions that can lead to exponential blowup in the
     L4->L3 compiler.

Also, if you do choose to implement this process, a good way to think
about it is in terms of equations between observably equivalent
expressions. Some of them are always a good idea as transformations,
e.g.:

  (+ (+ (+ x 1) 1) 1) = (+ x 3)
  ((lambda (x) e1) e2) = (let ([x e2]) e1)

and some capture observably equivalent behavior but are only sometimes
a good idea:

  (let ([x e1]) e2) = e2{x:=e1}

i.e., replace all free occurrences of x in e2 with e1. This one is only
sometimes a good idea because it can cause your program to explode in
size.
So do it only if it doesn't do that (and watch out for bad 'if'
creation too).

Note that this last one is valid ONLY if you know that e1 is
"sufficiently simple". In particular, if e1 is (print 5), then it is
NOT valid, e.g.:

  (let ([x (print 5)]) (begin (print 6) x))
  =/=
  (begin (print 6) (print 5))

because the print statements come out in the opposite order. In
contrast, in this expression:

  (let ([x (+ 1 y)]) e)

it is always safe to substitute (no matter what "e" is) because the
expression (+ 1 y) never performs any IO, never signals an error, and
always terminates. In general, it is not possible to tell whether an
arbitrary expression has those properties, so just make your compiler
approximate: it knows that certain expressions are always safe (and
applies the rule for those) and it isn't sure about all the others (so
it does not apply the rule for those).
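
One way to picture that approximation is the minimal sketch below. It is
written in Python only for illustration and is not the course's
implementation. Everything in it is an assumption made for the sketch:
expressions are nested Python lists with symbols as strings and numbers
as ints, the names PURE_PRIMS, is_simple, count_uses, substitute and
simplify_let are hypothetical, the whitelist of primitives is a guess
that you should adjust to your language's semantics, and all bound
variables are assumed to have been renamed to be unique beforehand (so
substitution cannot capture or be shadowed).

```python
# Hypothetical representation: an expression is an int, a str (symbol),
# or a list whose first element names the form or the function.

# Assumed whitelist of primitives taken to be free of IO, errors and
# non-termination; adjust it to what your language guarantees.
PURE_PRIMS = {"+", "-", "*", "<", "<=", "="}


def is_simple(e):
    """Conservative check: True only when e is known to be safe."""
    if isinstance(e, (int, str)):
        return True  # a number or a variable reference
    if isinstance(e, list) and e and isinstance(e[0], str) and e[0] in PURE_PRIMS:
        return all(is_simple(arg) for arg in e[1:])
    return False  # not sure, so do not apply the rule


def count_uses(e, x):
    """Number of references to x in e (assumes unique bound names)."""
    if e == x:
        return 1
    if isinstance(e, list):
        return sum(count_uses(sub, x) for sub in e)
    return 0


def substitute(e, x, v):
    """e{x:=v}; a plain tree walk, valid only under the unique-names assumption."""
    if e == x:
        return v
    if isinstance(e, list):
        return [substitute(sub, x, v) for sub in e]
    return e


def simplify_let(e):
    """Rewrite (let ([x e1]) e2) to e2{x:=e1} when safe and not size-increasing."""
    if isinstance(e, list) and len(e) == 3 and e[0] == "let" and len(e[1]) == 1:
        [[x, e1]] = e[1]
        body = e[2]
        # Atoms may be copied freely; compound expressions only when
        # x is used at most once, so the program cannot grow.
        small = not isinstance(e1, list) or count_uses(body, x) <= 1
        if is_simple(e1) and small:
            return substitute(body, x, e1)
    return e  # leave everything else untouched
```

With this sketch, the (+ 1 y) example is rewritten, while the (print 5)
example comes back unchanged because print is not on the whitelist. The
size guard is one possible policy among many; it also refuses to copy a
compound expression into a body that uses x more than once.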