It seems to be doing string ops at compile time and emitting a suitable function...

chriswarbo · on Oct 17, 2022

Idris uses the same language for run-time and compile-type; and for value-level and type-level. There's no need for a separate template language, or even a separate language for writing types. There's nothing string-specific; it's just using ordinary functions from the standard library :)

zasdffaa · on Oct 17, 2022

Intriguing! When working on its own programs does it interpret or compile the functions?

chriswarbo · on Oct 18, 2022

(Note that I've not used Idris 2.x yet, so some of this might be out of date)

The type-checker uses an interpreter. In fact there are a few steps involved:

- The "elaborator" translates Idris code into a simpler, more verbose intermediate language (an extension of lambda calculus). Idris programs can actually do metaprogramming/codegen by controlling this elaborator: http://docs.idris-lang.org/en/latest/elaboratorReflection/el...

- The type checker can beta-reduce (AKA interpret) expressions which occur in types. The rules of this interpretation are a little different to normal Idris evaluation: in particular, Idris is call-by-value, whilst the type-checker isn't (I think it's lazy?). For example, say we had a program like this:

  -- This lets us write types like 'Equal Int 1 1' and 'Equal String "hello" "bye"'
  data Equal: (t: Type) -> t -> t -> Type where
    -- This is the only constructor for such 'Equal' types. Notice that it can
    -- only return values of type 'Equal t x x', i.e. values only equal themselves
    refl: (t: Type) -> (x: t) -> Equal t x x

  -- A simple implementation of natural number addition
  data Nat: Type where
    zero: Nat
    succ: Nat -> Nat

  plus: Nat -> Nat -> Nat
  plus zero     y = y
  plus (succ x) y = succ (plus x y)

  -- A proof that 0 + n = n for all n 
  zeroLeftIdentity: (n: Nat) -> Equal Nat (plus zero n) n
  zeroLeftIdentity n = refl Nat n

Let's type-check this 'zeroLeftIdentity' function:

Its return type is 'Equal Nat (plus zero n) n', and its return value is 'refl Nat n'. The definition of 'refl' tells us that value has type 'Equal Nat n n', so this function is only well-typed iff 'Equal Nat (plus zero n) n' is the same as 'Equal Nat n n'.

These two expressions are not "intensionally equal" (i.e. they aren't the same sequence of symbols). However, Idris is a bit more lenient, and will accept 'beta-equivalence' (i.e. running them gives intensionally-equal results). In this case, we could run the function call 'plus zero n' to see if it returns 'n'.

However, there's a problem: Idris (like most programming languages) is call-by-value, so it evaluates all of the arguments of a function call before evaluating the function body. For example, consider how the following program is evaluated:

  times zero     y = zero
  times (succ x) y = plus y (times x y)

  double x = plus x x
  square x = times x x
  doubleThenSquare x = square (double x)


  doubleThenSquare (succ zero)
  square (double (succ zero))
  square (plus (succ zero) (succ zero))
  square (succ (plus zero (succ zero)))
  square (succ (succ zero))
  times (succ (succ zero)) (succ (succ zero))
  plus (succ (succ zero)) (times (succ zero) (succ (succ zero)))
  plus (succ (succ zero)) (plus (succ (succ zero)) (times zero (succ (succ zero))))
  plus (succ (succ zero)) (plus (succ (succ zero)) zero)
  plus (succ (succ zero)) (succ (plus (succ zero) zero))
  plus (succ (succ zero)) (succ (succ (plus zero zero)))
  plus (succ (succ zero)) (succ (succ zero))
  succ (plus (succ zero) (succ (succ zero)))
  succ (succ (plus zero (succ (succ zero))))
  succ (succ (succ (succ zero)))

This evaluation strategy doesn't work for our 'plus zero n' example, since we don't know the value of 'n'. Our type must be correct for all n, which includes values like 'zeroLeftIdentity (doubleThenSquare (succ zero))'. We just saw that call-by-value evaluates arguments first, which would have the type 'Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))', which would reduce like this via call-by-value:

  Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))
  Equal Nat (plus zero (square (double (succ zero)))) (square (double (succ zero)))
  ...
  Equal Nat (plus zero (succ (succ (succ (succ zero))))) (succ (succ (succ (succ zero)))) 
  Equal Nat (succ (succ (succ (succ zero)))) (succ (succ (succ (succ zero))))

Notice that the call to 'plus zero _' was the last thing we evaluated, since call-by-value requires that its arguments ('zero' and 'doubleThenSquare (succ zero)') be fully-evaluated first. Going back to the general case, where we don't know what 'n' is, we cannot evaluate the type 'Equal Nat (plus zero n) n' any further; it's "stuck".

However, we can use a different evaluation strategy, for example call-by-need (AKA "lazy evaluation", as used by e.g. Haskell and Nix). In that case we evaluate function bodies before evaluating their arguments (treating arguments more like local variables). If we use call-by-need, the above reduces like this:

  Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))
  Equal Nat (let y = doubleThenSquare (succ zero) in y) (doubleThenSquare (succ zero))
  Equal Nat (doubleThenSquare (succ zero)) (doubleThenSquare (succ zero))
  Equal Nat (let x = succ zero in square (double x)) (let x = succ zero in square (double x))
  Equal Nat (let x = succ zero; x2 = double x in times x2 x2) (let x = succ zero; x2 = double x in times x2 x2)
  ...
  Equal Nat (succ (succ (succ (succ zero)))) (succ (succ (succ (succ zero))))

Notice in this case that we don't actually need to fully-evaluate these expressions: they matched the required pattern by the third line! Since the actual value of 'n' is irrelevant for those first few lines (it's just getting passed around unevaluated), that works for all n, which is what we need, i.e. 'Equal Nat (plus zero n) n' reduces (lazily) to 'Equal Nat n n', and hence the type of 'refl Nat n' is beta-equivalent to 'Equal (plus zero n) n', and hence zeroLeftIdentity is well-typed.

Hence type-checking tends to use call-by-need (or the less-efficient call-by-name) evaluation, since it prevents expressions involving unknown variables from "getting stuck" unnecessarily; even when the actual language we're checking uses some other strategy, like call-by-value.

PS: Regarding the actual code outputted by the compiler: the Idris compiler has multiple targets, including C, Javascript, etc. Recently it's added an interpreter based on Chez scheme, which I think is actually faster than compiling to C.

zasdffaa · on Oct 18, 2022

That's an amazing answer, thanks! This is going to take a bit of time to digest but much appreciated.