
> [Dependent types] can be used to guarantee things like "applying 'concat' to a list of length X and a list of length Y returns a list of length X+Y".

As a curious bystander, what else? When are dependent types useful in practice? How do they fit into software development practices such as testing and DRY?

As an example of what I'm looking for, I see regular static typing as a way to move a class of runtime errors to compile time errors, increasing your confidence in aspects that otherwise would need repetitive unit tests.



One interesting aspect, at least for us old systems guys where runtime is important, is that you can not only move errors into compile time, you can move validation checks into compile time.

Simple example: returning the head of a list is partial; normally, you have to have a check if the list is empty and return an error. With dependent types, you can guarantee that the list is not empty, moving that test to some higher level where it's actually meaningful.

If you're manipulating arrays, you can guarantee at compile time that the indices have the right values, moving the tests outside of a function that gets called often. (In fact, when I was playing with ATS, this is what I did: I picked one boring validation test in the code at a time and moved it into the type. It vastly simplified the code.)
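As a small taste of this in a mainstream setting, TypeScript's fixed-length tuple types reject an out-of-range constant index at compile time; a minimal sketch (nowhere near what ATS can do, and the names are mine):

```typescript
// A tuple type carries its length, so a constant out-of-range index
// is rejected by the compiler rather than checked at runtime.
const rgb: [number, number, number] = [255, 128, 0];

const green = rgb[1]; // fine, typed as number
// const oops = rgb[3]; // compile error: tuple of length 3 has no element at index 3
```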


> Simple example: returning the head of a list is partial; normally, you have to have a check if the list is empty and return an error. With dependent types, you can guarantee that the list is not empty, moving that test to some higher level where it's actually meaningful.

You don't need dependent types for that. In particular, non-empty lists are isomorphic to a tuple containing a value for the head and a list for the tail:

  type NonEmptyList<T> = Pair<T, List<T>>
Note that this is just the type of cons ( https://en.wikipedia.org/wiki/Cons ). We can use this to define List recursively (representing an empty list (AKA nil) as the single value of type Unit):

  type List<T> = Either<Unit, NonEmptyList<T>>
If we inline the definition of NonEmptyList, we get the usual definition of List<T> = Either<Unit, Pair<T, List<T>>>
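For the curious, a sketch of this encoding in TypeScript (the names are mine; null plays the role of Unit/nil):

```typescript
// Non-empty lists as a head plus a (possibly empty) tail.
type NonEmptyList<T> = { head: T; tail: List<T> };
type List<T> = null | NonEmptyList<T>;

// 'head' is total: the type rules out empty lists, so there is
// no runtime check and no error case.
function head<T>(xs: NonEmptyList<T>): T {
  return xs.head;
}

const xs: NonEmptyList<number> = { head: 1, tail: { head: 2, tail: null } };
```

Callers have to establish non-emptiness at the point where it's actually meaningful, exactly as the comment above describes.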


Internally, the Idris2 compiler uses de Bruijn numbering for its terms, which is tricky to get right once you start trying to manipulate terms. To help with this, it keeps a list of the names corresponding to the de Bruijn indices at the type level. The type checker catches mistakes as you write code to manipulate them (inserting names, removing names, or renaming things).

I ran through some exercises teaching this technique[1]. My take-away as a beginner was that it was a pain to make the compiler happy, but the code the compiler didn't like was actually wrong.

I remember reading that someone had written a functional red-black tree (with a proof of correctness leveraging dependent types) and said the delete algorithm naturally followed from the types. That was not my experience (delete wasn't immediately obvious to me), but I need to find some time to give it another go.

The Idris2 docs have an example of a lambda expression interpreter[2] whose terms have a vector of the types in the environment as a type parameter (erased at runtime), to ensure everything is type safe in the resulting interpreter.

Typescript has added some bits of dependent typing. For example, asserting that a function returns true if the argument has type X, that a function throws if the argument is falsy, and I think type-level string manipulation. It's fairly pragmatic, not a research language, so I presume that specific functionality is useful to a lot of people. I've used a little of it to help the type checker infer non-null after my assert function was called.
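As a concrete sketch of that non-null case (the helper name is mine):

```typescript
// An assertion signature: if the call returns normally, the checker
// narrows the argument's type for the rest of the scope.
function assertDefined<T>(x: T): asserts x is NonNullable<T> {
  if (x === null || x === undefined) throw new Error("unexpectedly missing");
}

function shout(s: string | null): string {
  assertDefined(s);       // from here on, s is narrowed to string
  return s.toUpperCase(); // no non-null check or '!' needed
}
```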

All of that said, I think people are still exploring how dependent types can be used in practice.

[1]: https://github.com/edwinb/SPLV20/blob/master/Code/Lecture2/E... [2]: https://idris2.readthedocs.io/en/latest/tutorial/interp.html

There is a copy of #1 at github.com/dunhamsteve/SPLV20 that is updated to work with the latest Idris2.


TypeScript does not have dependent types. It has type-level constants and surprisingly powerful type manipulation facilities, including string manipulation on string type constants. But there's no way to operate on types at the "dynamic" expression level - they're deliberately erased, to the point that transforming TypeScript to JavaScript is almost trivial.

For example, take the very first example of a dependent type in the Idris tutorial - a function `isSingleton` that consumes a `Bool` and produces `Nat` if you give it `True` and `List Nat` if you give it `False`; it then uses `isSingleton` as part of the type of `mkSingle`, a function that consumes a `Bool` called `x` and produces an `isSingleton x` - that is, a `Nat` if `x` is `True` and a `List Nat` if `x` is `False`. There's no way to write this in TypeScript. You could have `isSingleton` return dynamic constructors instead, but you can't use that `isSingleton` as part of the type of `mkSingle` to generate a return type. Your `mkSingle` would be stuck with the signature `(x: boolean): number | number[]`; which branch of the return type you end up with depends on `x`, but there's no way to make TypeScript properly understand the connection.
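To make that concrete, here's roughly as far as one can push it with a conditional type; note the unchecked cast in the body, which is exactly the missing connection (a sketch, not idiomatic advice):

```typescript
// The dependency can be *stated* as a conditional type...
type IsSingleton<B extends boolean> = B extends true ? number : number[];

// ...but TypeScript cannot relate the runtime branch on 'x' to the
// conditional return type, so the body needs an escape hatch.
function mkSingle<B extends boolean>(x: B): IsSingleton<B> {
  return (x ? 0 : [0]) as unknown as IsSingleton<B>;
}

const n = mkSingle(true);   // inferred as number
const ns = mkSingle(false); // inferred as number[]
```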


> Internally, the Idris2 compiler uses de Bruijn numbering for its terms, which is tricky to get right once you start trying to manipulate terms

I'm lost. Why does it use them; what does it get from doing so? Also if it's internal why would you as a language user (vs implementer) ever encounter them?


Dependent types allow for more nuanced statements to be made by types. You can, for instance, have lists that are sorted in ascending order as a type, or implement messaging protocols at the type level, where you can only construct a reply in the message exchange if you've received the correct preceding message.

It's very cool, but it's very expensive to compile, which is why it's still research stuff, I believe.


It's not only the “compile expensiveness”: you also quite literally have to prove your types in the program, which is seldom easy.

E.g. if you were to write a merge sort, and have it in the type that it will output a sorted array, you will have a very hard time actually proving it correct, akin to algorithm analysis.

Though to be fair, most dependently typed languages allow for types without proofs, and having it in the type that an array is sorted, for example, can be valuable even without formal verification.


One way to think about it is that you can extend your testing to prove the correctness of your program up to some notion of what “correct” means. You can reason about your program in your program, and that allows you to capture another class of errors than regular static typing does. In principle, a strong enough dependent type system allows you to encode mathematics, and you can then prove things about your program in the same language as you are writing it. As for practical examples, you should look at Idris and some of the talks given about it. They do very fancy things while still being somewhat performant. Besides errors and testing, having dependent types also allows your IDE to reason about your program and give much stronger hints and code completion.

But I agree that it is often a hard sell for normal developers. Given the choice, I would happily write large code bases in a dependent language, but I am also very biased having worked a lot in them.


One really nice "everyday" example from Idris that I like is the ability to typecheck a printf function. You provide the format string, and then the typechecker will know what to expect from the rest of the arguments!

Usually that is implemented in C compilers, so that printf("%d", 2.3f) becomes an error, but with dependent types you can do it at the user level, in library code.


And here is a gist that demonstrates how simple it is to implement such a type-safe printf function:

https://github.com/mukeshtiwari/Idris/blob/master/Printf.idr


It seems to be doing string ops at compile time and emitting a suitable function, e.g.

   format ('%' :: 'd' :: cs ) = FInt ( format cs )
Does that make it a bit like template metaprogramming, if template metaprogramming could disassemble strings?


Idris uses the same language for run time and compile time, and for value level and type level. There's no need for a separate template language, or even a separate language for writing types. There's nothing string-specific; it's just using ordinary functions from the standard library :)


Intriguing! When working on its own programs does it interpret or compile the functions?


(Note that I've not used Idris 2.x yet, so some of this might be out of date)

The type-checker uses an interpreter. In fact there are a few steps involved:

- The "elaborator" translates Idris code into a simpler, more verbose intermediate language (an extension of lambda calculus). Idris programs can actually do metaprogramming/codegen by controlling this elaborator: http://docs.idris-lang.org/en/latest/elaboratorReflection/el...

- The type checker can beta-reduce (AKA interpret) expressions which occur in types. The rules of this interpretation are a little different to normal Idris evaluation: in particular, Idris is call-by-value, whilst the type-checker isn't (I think it's lazy?). For example, say we had a program like this:

  -- This lets us write types like 'Equal Int 1 1' and 'Equal String "hello" "bye"'
  data Equal: (t: Type) -> t -> t -> Type where
    -- This is the only constructor for such 'Equal' types. Notice that it can
    -- only return values of type 'Equal t x x', i.e. values only equal themselves
    refl: (t: Type) -> (x: t) -> Equal t x x

  -- A simple implementation of natural number addition
  data Nat: Type where
    zero: Nat
    succ: Nat -> Nat

  plus: Nat -> Nat -> Nat
  plus zero     y = y
  plus (succ x) y = succ (plus x y)

  -- A proof that 0 + n = n for all n 
  zeroLeftIdentity: (n: Nat) -> Equal Nat (plus zero n) n
  zeroLeftIdentity n = refl Nat n
Let's type-check this 'zeroLeftIdentity' function:

Its return type is 'Equal Nat (plus zero n) n', and its return value is 'refl Nat n'. The definition of 'refl' tells us that value has type 'Equal Nat n n', so this function is only well-typed iff 'Equal Nat (plus zero n) n' is the same as 'Equal Nat n n'.

These two expressions are not "intensionally equal" (i.e. they aren't the same sequence of symbols). However, Idris is a bit more lenient, and will accept 'beta-equivalence' (i.e. running them gives intensionally-equal results). In this case, we could run the function call 'plus zero n' to see if it returns 'n'.

However, there's a problem: Idris (like most programming languages) is call-by-value, so it evaluates all of the arguments of a function call before evaluating the function body. For example, consider how the following program is evaluated:

  times zero     y = zero
  times (succ x) y = plus y (times x y)

  double x = plus x x
  square x = times x x
  doubleThenSquare x = square (double x)


  doubleThenSquare (succ zero)
  square (double (succ zero))
  square (plus (succ zero) (succ zero))
  square (succ (plus zero (succ zero)))
  square (succ (succ zero))
  times (succ (succ zero)) (succ (succ zero))
  plus (succ (succ zero)) (times (succ zero) (succ (succ zero)))
  plus (succ (succ zero)) (plus (succ (succ zero)) (times zero (succ (succ zero))))
  plus (succ (succ zero)) (plus (succ (succ zero)) zero)
  plus (succ (succ zero)) (succ (plus (succ zero) zero))
  plus (succ (succ zero)) (succ (succ (plus zero zero)))
  plus (succ (succ zero)) (succ (succ zero))
  succ (plus (succ zero) (succ (succ zero)))
  succ (succ (plus zero (succ (succ zero))))
  succ (succ (succ (succ zero)))
This evaluation strategy doesn't work for our 'plus zero n' example, since we don't know the value of 'n'. Our type must be correct for all n, which includes calls like 'zeroLeftIdentity (doubleThenSquare (succ zero))'. That call's type would be 'Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))', and since we just saw that call-by-value evaluates arguments first, it would reduce like this:

  Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))
  Equal Nat (plus zero (square (double (succ zero)))) (square (double (succ zero)))
  ...
  Equal Nat (plus zero (succ (succ (succ (succ zero))))) (succ (succ (succ (succ zero)))) 
  Equal Nat (succ (succ (succ (succ zero)))) (succ (succ (succ (succ zero))))
Notice that the call to 'plus zero _' was the last thing we evaluated, since call-by-value requires that its arguments ('zero' and 'doubleThenSquare (succ zero)') be fully-evaluated first. Going back to the general case, where we don't know what 'n' is, we cannot evaluate the type 'Equal Nat (plus zero n) n' any further; it's "stuck".

However, we can use a different evaluation strategy, for example call-by-need (AKA "lazy evaluation", as used by e.g. Haskell and Nix). In that case we evaluate function bodies before evaluating their arguments (treating arguments more like local variables). If we use call-by-need, the above reduces like this:

  Equal Nat (plus zero (doubleThenSquare (succ zero))) (doubleThenSquare (succ zero))
  Equal Nat (let y = doubleThenSquare (succ zero) in y) (doubleThenSquare (succ zero))
  Equal Nat (doubleThenSquare (succ zero)) (doubleThenSquare (succ zero))
  Equal Nat (let x = succ zero in square (double x)) (let x = succ zero in square (double x))
  Equal Nat (let x = succ zero; x2 = double x in times x2 x2) (let x = succ zero; x2 = double x in times x2 x2)
  ...
  Equal Nat (succ (succ (succ (succ zero)))) (succ (succ (succ (succ zero))))
Notice in this case that we don't actually need to fully-evaluate these expressions: they matched the required pattern by the third line! Since the actual value of 'n' is irrelevant for those first few lines (it's just getting passed around unevaluated), that works for all n, which is what we need, i.e. 'Equal Nat (plus zero n) n' reduces (lazily) to 'Equal Nat n n', and hence the type of 'refl Nat n' is beta-equivalent to 'Equal Nat (plus zero n) n', and hence zeroLeftIdentity is well-typed.

Hence type-checking tends to use call-by-need (or the less-efficient call-by-name) evaluation, since it prevents expressions involving unknown variables from "getting stuck" unnecessarily; even when the actual language we're checking uses some other strategy, like call-by-value.

PS: Regarding the actual code output by the compiler: the Idris compiler has multiple targets, including C, Javascript, etc. Recently it's added a backend based on Chez Scheme, which I think is actually faster than compiling to C.


That's an amazing answer, thanks! This is going to take a bit of time to digest but much appreciated.


I was gonna say that's possible in Rust and also isn't a big problem, but the more I think about it, the more this comes up.

First, languages where a compiler plug-in or linter does this don't really count, because you can't extend it and they rely on specific tools. Even Rust, which has relatively good meta-programming, is very cumbersome to use. Writing proc macros is known as an arcane skill that, even though it is technically also in Rust (which is nice), involves heavy manipulation of ASTs and has numerous pitfalls. It would be better to leave the ASTs to compiler folks if possible.

I get the sense that these features can be really useful for deserialization, which is a very common use case in regular software development. Even where it's safe, like in Rust, deserialization is prone to logic errors and often requires lots of boilerplate (unless you use an existing library, but then you must rely on unintelligible proc macro wizardry).


It's not possible in Rust. Yes, you can write a macro for it or use the existing one, but now you get all the disadvantages of macros.

For instance, you cannot (to my knowledge) manipulate a macro with another macro. You can't take a macro and feed it the format macro to get another macro. With types you can do that (obviously).


This is also possible in TypeScript, which is a little more mainstream than Idris.
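A rough sketch with template literal types (TypeScript 4.1+); the helper names are mine, and the arguments are passed as a tuple to sidestep some rest-parameter awkwardness:

```typescript
// Compute the expected argument types from the format string's literal type.
type FormatArgs<S extends string> =
  S extends `%d${infer Rest}` ? [number, ...FormatArgs<Rest>] :
  S extends `%s${infer Rest}` ? [string, ...FormatArgs<Rest>] :
  S extends `${infer _}${infer Rest}` ? FormatArgs<Rest> :
  [];

function printf<S extends string>(fmt: S, args: FormatArgs<S>): string {
  let i = 0;
  const xs = args as unknown as unknown[];
  return fmt.replace(/%[ds]/g, () => String(xs[i++]));
}

printf("%s has %d items", ["cart", 3]); // OK
// printf("%d items", ["three"]);       // compile error: string is not number
```

Unlike the Idris version, this only works at the type level on literal format strings; it's checking, not dependent computation.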


TypeScript's type system is not consistent though.


> As an example of what I'm looking for, I see regular static typing as a way to move a class of runtime errors to compile time errors, increasing your confidence in aspects that otherwise would need repetitive unit tests.

Dependent typing just makes it easier to do that, by making the type system more expressive. For example, a map where the values are of different types but the type for each key is fixed is something people do all the time in dynamic languages, but it's very cumbersome to express to a static typechecker without dependent types, so even in a statically typed language people usually make do with a Map<String, Object> and have to use unit tests to confirm the behaviour.
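TypeScript's keyof does give a restricted version of this when the keys are known at compile time; a sketch with illustrative names:

```typescript
// A map whose value type is fixed per key: the interface pins down the
// type of each key, and a keyof-generic accessor preserves it.
interface Settings {
  port: number;
  host: string;
  debug: boolean;
}

const settings: Settings = { port: 8080, host: "localhost", debug: false };

function get<K extends keyof Settings>(key: K): Settings[K] {
  return settings[key];
}

const port = get("port"); // inferred as number: no cast, no unit test
```

The limitation is that the key set is fixed statically; for keys computed at runtime you're back to the Map<String, Object> situation.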


IIUC, you can use it to ensure you never index outside the limits of your static array at runtime, this check being done at compile time.


One example I like is a binary search function which only allows you to pass a sorted list


Well, you can do that in traditional languages as well, just wrap your list in a SortedArray type.

Chances are your “sorted” function won't actually formally prove the algorithm correct in a dependently typed language either.
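A sketch of that wrapper in TypeScript, using the common “branding” idiom (names are mine): the brand is unforgeable outside the module, so the only way to get a SortedArray is through the constructor that actually sorts.

```typescript
declare const sorted: unique symbol;
// The brand exists only at the type level; at runtime this is a plain array.
type SortedArray = number[] & { readonly [sorted]: true };

function sortAscending(xs: number[]): SortedArray {
  return [...xs].sort((a, b) => a - b) as SortedArray;
}

// The signature demands sortedness, but nothing here *proves* the sort
// correct, which is exactly the point above.
function binarySearch(xs: SortedArray, target: number): number {
  let lo = 0, hi = xs.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (xs[mid] === target) return mid;
    if (xs[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}
```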


I have a different POV about this. It's an intellectual need to encode logic at the type level; it's not showing lots of practical value in the mainstream yet, just like most interesting and generic techniques today.


It's a practical need. It allows for the compiler to enhance your productivity by stopping mistakes and errors as soon as possible.

For instance, you can't even write a generic "window slide" function supporting a dynamic window length without this type-level logic. Except, of course, if you don't mind your program blowing up on a mistake, or writing excessive tests for every piece of your code.


Well, can it spot race conditions? Or will it suddenly solve the halting problem?

Safety is not a binary value. Dependent types can increase safety, though they do come with a significant cost, which may very well be spent on additional testing, indeed.


> Safety is not a binary value

Exactly. Encoding logic in types doesn't have to fix all problems all the time. It's sufficient if the advantages outweigh the costs.




