This is similar to an idea that I've been kicking around.
It is basically an arbitrary-precision beta-reducer for untyped lambda calculus. I think it would theoretically need a machine with only four instructions (presumably some sort of FPGA): write, load, flip, and meld.
I bring this up because, instead of compiling to C, you could simply use a general-purpose beta-reducer, which would allow arbitrary-precision computation instead of being limited to 32 or 64 bits or whatever.
It is written in Agda, meaning it could serve as a compiler for any calculus that can be translated into untyped lambda calculus.
Unfortunately, my adviser told me that it is a side-project due to lack of theoretical grounding.
You can check it out here: https://github.com/jbodeen/-ava
This idea has been knocking around the collective consciousness of computer science for a while now. For example, there is the SECD machine [1], a virtual machine for functional programming languages to target. There are also examples of people compiling functional programming languages to various combinator bases, such as SKI [2] and BCKW [3]. Through the judicious use of some extra combinators, you can also get more compact "object code" [Turner79].
One advantage of compiling to something like an SKI or BCKW basis (or a mix) is that you can represent abstraction (functions/closures) without bound variables, or scoping, which can be a pain to implement (esp. in hardware).
Take, for example, the SKI basis. Suppose you had two functions, with types as follows:
f :: a -> b
g :: b -> a -> c
And you wished to combine them in the natural way to get `h`:
h :: b -> c
h x = g (f x) x
Then, to achieve the same with the S, K and I combinators, we do the following (assuming we have SKI representations of `f` and `g` as `F` and `G`):
H := S(S(KG)F)I
The `x` has disappeared! But if we apply `H` to some CL expression `X`, we get:
HX = S(S(KG)F)IX
   = S(KG)FX(IX)  ;; apply S
   = KGX(FX)(IX)  ;; apply S
   = G(FX)(IX)    ;; apply K
   = G(FX)X       ;; apply I
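The steps above can be replayed mechanically. Here is a minimal sketch of an SKI reducer in Haskell; the type and constructor names are my own invention, with `Free` standing in for opaque expressions like `F`, `G` and `X` in the example.

```haskell
-- A tiny SKI term language. `Free` wraps expressions we treat as atoms.
data CL = S | K | I | Free String | App CL CL
  deriving (Eq, Show)

-- Reduce to normal form, contracting redexes at the head first:
--   I x       ==> x
--   K x y     ==> x
--   S f g x   ==> f x (g x)
reduce :: CL -> CL
reduce (App I x)                 = reduce x
reduce (App (App K x) _)         = reduce x
reduce (App (App (App S f) g) x) = reduce (App (App f x) (App g x))
reduce (App f x)
  | f' /= f   = reduce (App f' x)
  | otherwise = App f (reduce x)
  where f' = reduce f
reduce e = e
```

With `h = App (App S (App (App S (App K g)) f)) I` for atoms `f`, `g`, reducing `App h x` yields `App (App g (App f x)) x`, i.e. `G(FX)X`, matching the derivation above.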
We can use S to Split (duplicate) parameters, and K to Kill (ignore) parameters, in a very straightforward way. Imagine the AST of the original code:
        App
       /   \
     App    x
    /   \
   g    App
       /   \
      f     x
Every `App` is replaced with an `S`, every `x` with an `I`, and everything else (`y`) is replaced with `Ky`. In my example, I have also gone to the trouble of "optimising" `S(KF)I` away, as it is equivalent to just `F`.
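That translation rule fits in a few lines of Haskell. This is only a sketch with made-up names (`Term`, `abstract`, `opt`); it implements exactly the scheme just described, plus the `S(KF)I ==> F` optimisation.

```haskell
-- A lambda-body AST: variables and application only.
data Term = V String | Ap Term Term
  deriving (Eq, Show)

-- Bracket abstraction: remove the variable x from a term.
--   [x](M N) = S ([x]M) ([x]N),  [x]x = I,  [x]y = K y
abstract :: String -> Term -> Term
abstract x (Ap m n) = Ap (Ap (V "S") (abstract x m)) (abstract x n)
abstract x (V y)
  | x == y    = V "I"
  | otherwise = Ap (V "K") (V y)

-- The "optimisation" mentioned above: rewrite S (K f) I to just f.
opt :: Term -> Term
opt (Ap (Ap (V "S") (Ap (V "K") f)) (V "I")) = opt f
opt (Ap m n) = Ap (opt m) (opt n)
opt t = t
```

On the body `g (f x) x` from the AST above, `abstract "x"` produces `S (S (K g) (S (K f) I)) I`, and `opt` collapses the inner `S (K f) I` to `f`, recovering the `H = S(S(KG)F)I` derived earlier.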
Interesting stuff, and plenty of theoretical grounding to boot. Especially as this sort of thing has started enjoying a renaissance, for example in the "point-free" style of languages like Haskell.
Not sure if this is what you mean, but while `foo.bar` tends to be "property access of the property `bar` of the record/object `foo`", in Haskell `foo . bar` is "the function `bar` followed by the function `foo`".
In the latter case it's written with spaces to look a bit more like the mathematical notation.
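A throwaway illustration of the difference (the function names here are made up): `(foo . bar) x` means `foo (bar x)`.

```haskell
-- Two small functions to compose.
bar :: Int -> Int
bar n = n + 1

foo :: Int -> Int
foo n = n * 2

-- Composition runs right-to-left: bar first, then foo.
baz :: Int -> Int
baz = foo . bar
```

So `baz 3` is `foo (bar 3)`, i.e. `foo 4`, i.e. `8`.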
I've spoken with him a bit over perhaps the last year while he graduated from high school and enrolled at CMU. I've always seen him to be very interested in programming languages, semantics, compilation. It's completely unsurprising to me that now that he's at CMU he has the opportunity to dig in (even) more deeply and produce this kind of stuff.
I'm not sure what you find terrifying about it. He's put in probably 3-4 years of work in this topical area already and the world is such that today there are plenty of high-quality resources to help you learn this material—free, online—if you have the drive. His age is perhaps a little surprising, but ultimately immaterial.
I actually host the main repo on my Bitbucket (https://bitbucket.org/jozefg/pcf). I've just found that most people would prefer to see the GitHub version (for starring, more pleasant forking, better code viewing), so I usually link there.
I generally duplicate my repos on bitbucket/github so I can feel smug when one goes down for a while.