Types Are Anti-Modular (gbracha.blogspot.com)
57 points by swannodette on June 5, 2011 | hide | past | favorite | 45 comments


From the Haskell reddit discussion:

> It's a simple truth, indeed a tautology, that if you want the compiler to check the consistency of interactions between modules, then the compiler must know the information that it is checking. That's all that's being said here. It really shouldn't surprise anyone. If you dispose of typing, then someone still needs to know that relevant information in order to write correct code; but it is the programmer, rather than the compiler.

or, with more wit,

> Types are anti-modularity in the same sense stop signs are anti-transportation.

http://www.reddit.com/r/haskell/comments/hs39c/gilad_bracha_...


Indeed. Types are useful because they span module boundaries. I've been working in Scala and C++ lately after working for ten+ years exclusively in dynamic languages and I can't tell you how happy I am to let the compiler catch my dumb mistakes. IMO, dynamic languages optimize for the rare need for freeform metaprogramming at the expense of the common need for easily maintained code.

Dynamic language enthusiasts will tell you that testing is better than static typing but in my experience static typing eliminates about half my test code and gives me much more immediate and precise feedback on my most common errors.


The blog post isn't about correctness - it's about modularity. Lispers and Smalltalkers are used to programming with very small compilation units - individual methods and individual functions. Being able to fix running programs w/o going through the whole compile cycle is a demonstrable productivity gain.

Types and modularity are at odds. The antagonism is readily apparent in languages that support both dynamic and static typing, such as Qi: its RCEPL (Read-Check-Eval-Print-Loop) has serious restrictions in comparison to its standard REPL.


> Being able to fix running programs w/o going through the whole compile cycle is a demonstrable productivity gain.

Indeed! Removing many sources of runtime failures at compile time is also a demonstrable productivity gain. So good thing that static typing is no barrier to dynamic extension or modification -- however, you must defer type checking of splice points between components until later stages. (This was my PhD thesis).


I am far, far more productive in Scala than I am in Clojure, and I've put about 5x more effort into learning Clojure. Small bugs in a dynamic language often mean hours of writing tests and stepping through code instead of fixing a simple compiler error and Scala's type system gives me the confidence to refactor often and aggressively to keep my code clean.


The act of verifying types defeats modularity.

I think static typing is grand, but Gilad has a valid point.


This is a serious "citation needed" assertion. What language doesn't verify types? Python does. Lisp does. ("TypeError: cannot concatenate 'str' and 'int' objects" or "wrong-type-argument: stringp 42")

I think the problem that people run into with types is when storing values for future use. Imagine that you have a data structure that looks like:

   data Foo = Foo String
And then later on in your application, you have a function like:

   frobnicate :: IsString a => a -> a
Then, you write a program that looks like:

   program :: String -> String
   program = frobnicate . preprocess . Foo
This is all well and good as long as you only want your program to operate on Strings. But if you want to change the type of your string data (perhaps to a more efficient "ByteString", or something like that), then you are kind of out of luck. You pigeonholed yourself into using String when you defined Foo as "Foo String", and now you have a lot of work to do to feed that ByteString to the function that actually cares about the type, "frobnicate", which can already handle ByteStrings.
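One way out of the pigeonhole is to parameterize Foo over its payload type. The sketch below is hypothetical: unFoo is a made-up accessor, and the IsString constraint has been swapped for Semigroup so that frobnicate can do something observable (IsString alone only provides fromString).

    -- Hypothetical sketch: generalize Foo over its payload instead of
    -- hard-coding String. (Semigroup stands in for IsString here so
    -- that frobnicate can do visible work.)
    newtype Foo a = Foo a

    unFoo :: Foo a -> a
    unFoo (Foo x) = x

    frobnicate :: Semigroup a => a -> a
    frobnicate s = s <> s

    -- program no longer pins the payload to String; ByteString or Text
    -- would work just as well, since both have Semigroup instances.
    program :: Semigroup a => a -> a
    program = frobnicate . unFoo . Foo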

The problem here is not type checking, but rather that you built your program in terms of an abstraction that was not generic enough, namely the "Foo" type. If you were just using Python, then your program would have been:

   class Foo:
       def __init__(self, string):
           self.string = string

   def frobnicate(whatever):
       return whatever
In this case, there would be no trouble using whatever type you wanted as the "string", because you decide what type is OK at "frobnicate" time, instead of at compile time.

The downside is that with your too-specific Haskell program, you know it's too specific before you run it (because the compiler whines at you), but with the Python program, you may only find out when your pager starts making a loud noise at 3 AM, because Python has no idea what you're trying to do.

(Personally, having worked on a huge Python app that does no type checking anywhere, I have to say I hate that technique. I prefer what I do in Perl, which is to type check on object creation, but use type classes instead of concrete types. You can still "late bind", but you also get errors when you're making a mistake rather than deferring a decision. You can do the same thing in Haskell, too, but with even better guarantees that you won't be waking up in the middle of the night to unfuck your code.)


I already said I'm not talking about correctness only modularity. Early/eager verification breaks modularity.


I would say, "bad program design breaks modularity", not early typechecking. If your types are defined so that they are as broad as possible (i.e., "what you really want"), then you will never even notice there is a type system, until you try something impossible.


I suppose turning type errors into type warnings would satisfy both camps. That way you can experiment with incremental changes and then fix up the rest of the program when you are ready. Erlang + dialyzer works quite nicely like this.


Um... NO. This argument applies equally to functions and constants and is equally wrong.

Types are not anti-modular: the culprit is module systems that don't allow type parameterization. Systems such as Java solve this problem for functions and constants by allowing code to be parameterized over interfaces. Bam, you can compile code using said functions and constants separately from their implementation.

The solution for types is exactly the same: move types into interfaces. Guess what, Objective Caml and Coq do this via module functors and it's glorious. If my airline scheduler module needs to compile separately from my graph module, well I can write:

    module Scheduler (Graph : sig
      type graph
      val traverse : graph -> (string -> unit) -> unit
    end) = struct ...blabla... end

and I can compile that code without ever writing a snippet of the graph module -- not even defining the ADT!

TLDR: this problem has been solved 30 years ago by people who actually use typed languages.


Thanks for pointing this out. I was about to believe the post was correct, but it seems it isn't. However, just to be sure I understand you correctly:

Is the module functor compiled only once, or once for every parameter type? I'm asking because if it's compiled only once, the generated code might be significantly less optimized. This might not play a role for complex parameter types, but could miss lots of good optimization opportunities for simple types such as int or char. How is this solved in those languages?

Also, is it just OCaml and Coq, or is this possible with other strongly typed languages (such as Haskell) as well?


> Is the module functor compiled only once, or once for every parameter type?

Only once. It does in fact miss optimization opportunities in OCaml (where ints and float arrays are given special treatment). In theory this can be overcome with whole-program optimization. The MLton compiler for SML (which also has functors) is such a compiler, though I do not know how it handles this particular case (I would presume it does).

I know of no other languages that allow type modularity beyond the ML family (including various derived formal logic systems). Racket (PLT Scheme) provides "units" which are a similar idea to functors but use contracts instead of types (since Racket is currently untyped). The analogue of a type in Racket is a type membership function from any value to boolean; this function can be used elsewhere in the unit. See for example: http://docs.racket-lang.org/guide/Signatures_and_Units.html


Haskell's module system isn't very advanced. In fact, it is about as simple as it could get. This is pretty well known, and there are proposals floating around to add more advanced features. But there generally isn't much need, and, as you mention, adding things like module type parameters prevents many cross-module optimizations (which GHC does quite aggressively).

In practice, it doesn't seem to cause much of an issue, but this may just be down to how Haskell tends to be used.


Indeed, typeclasses in languages such as Haskell and Mercury can be (ab)used for many of the same uses as functors. I've used both and there are merits to each, but I still can't put my finger on when one is better than the other.
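For a concrete sense of the parallel, here is a hypothetical Haskell sketch of the Scheduler/Graph functor from the OCaml example above, done with a typeclass (GraphLike, ListGraph, and schedule are made-up names):

    -- The "signature": scheduler code compiles against this class
    -- alone, knowing nothing about any concrete graph representation.
    class GraphLike g where
      nodes :: g -> [String]

    -- The "functor body": parameterized over any GraphLike instance.
    schedule :: GraphLike g => g -> [String]
    schedule = reverse . nodes  -- pretend reverse order is a schedule

    -- A concrete graph, supplied separately and later.
    newtype ListGraph = ListGraph [String]

    instance GraphLike ListGraph where
      nodes (ListGraph ns) = ns

One visible difference: unlike an ML functor application, instance selection here is implicit, which is part of why the two mechanisms feel different in practice.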


Oddly enough, I think this functorized style is what Bracha advocates. If I understood him correctly in an interview he gave on FLOSS Weekly, his Newspeak language forces you to program in (the dynamically typed equivalent of) the style you describe: each module takes as parameters all modules which it depends on. (He would point out, I think, that ML doesn't force you to adopt this style of writing modules...)

Maybe he discovered module functors on his own and is unaware of their use in statically typed languages such as ML? I don't know, but it certainly seems that this solution is just as good in a statically typed setting as in the dynamically typed setting.


  TLDR: this problem has been solved 30 years ago by people
  who actually use typed languages.
You have no idea who Gilad Bracha is, do you? In the face of people like him, a bit more modesty is appropriate. You may want to consider that maybe he isn't so obviously wrong, because he knows his subject matter very well.


Copy-pasting a comment on the original article here, since I have the same comment/question:

Brian said...

Dynamic typing doesn't restore the modularity- it simply delays checking for violations. Say module X depends upon module Y having function foo- this is a statement which is independent of static vs. dynamic checking. This means that the modularity is lost- you can't have a module Y which doesn't have a function foo. If module Y doesn't have a function foo, the program is going to fail; the only questions are when and how.

What I don't get is why detecting the error early is a disadvantage. It's well known that the earlier a bug is detected, the cheaper it is to fix. And yet, all the arguments in favor of dynamic typing are about delaying the detection of bugs- implicitly increasing their cost.

6/05/2011 6:49 AM


The article's just claiming that the modules of dynamic languages can be compiled independently; not that it will be correct.

Just my thoughts: There's a bigger question here about types, though. I think the argument is that the ceremony of types gets in the way more than it helps, and it requires up-front design. Many people like dynamic languages for their flexibility; and when you get bugs, you have to fix them anyway. There's an assumption in typed languages that we can design our types well enough upfront to help solve the problem.

It's interesting how types create a dependency on the interface (i.e. if you use a type, you depend on it). One of the ideas of Abstract Data Types was to reduce dependency on the internal implementation details; but you're still dependent on the external interface. If you change the interface, it creates problems (e.g. unit testing claims to give you confidence to refactor, but if you change interfaces, you also have to change the tests...). EDIT added "claims to"

This seems intrinsic to modularity, and I can't really see any solution to it, except for ways to make it easier to cope with interface change. Some insights may come from web app APIs, where the dependency is more explicit (you have to send and receive serialized messages).


Maybe you got me closer to the answer, and you certainly seem to have more experience on this subject than I do. I am still missing some part of the picture, so let me ask:

Types: You think about the problem up-front to the best you can. Most likely you later change the interface. The compiler now breaks the code and forces you to fix it.

Dynamic: You are flexible, which means you can start your work without having to think up-front (Q1: How is this good?). Later when you change the interface (which may exist only in your mind, but it does as far as the modules are to interact with each other in some way at all), the code still breaks but the compiler won't tell you.

In either case, as you said, "you have to fix them anyway". So, Q2: How does the compiler being unable to complain about broken code (leaving it to unit tests, perhaps, to find it) help?

One comment here cited an example where one can get away without introducing errors while still changing the type ("f(g(x))"). But that is readily achieved with statically typed languages like C++ as well.


Q1, you're assuming you can design upfront; that is, that you understand the problem, you have enough information to solve the problem, the world won't change and so on.

Q2, the ceremony of types itself has a cost. You have to think about it, type it - and you may need to change it. When types ripple through layers of calls, the changes do too.

Note: I don't know the answer. I notice that dynamic languages seem to be becoming more popular (are they? or are they just more publicized?). In a rapidly changing world, it's more important to adapt now than to be perfect. There's a general trend that, because computers get faster but humans don't, more and more of the work will be transferred to the computer. Dynamic languages do this in the sense that they are less efficient than static languages - on the assumption that speed is the main benefit of static languages. Certainly, the coder has less work to do. Another benefit is that it makes coding accessible to less skilled people (and to more skilled people who have less time to devote to a particular task).

There are trade-offs. The first thing is to note what the trade-offs are. The second thing is to note what groups of people use programming languages and for what tasks. The third thing is to ask how those people value those trade-offs, for those specific tasks. e.g. Perhaps sometimes, a crappy, hard-to-maintain, only partially correct solution now is better than a high-quality, clear, correct solution too late?


Web APIs often have explicit versioning. Perhaps that approach could be adopted for programming, where you have the option to specify the version of a function you are using. This would enable gradual migration and adoption of new features.


> If you change the interface, it creates problems (e.g. unit testing gives you confidence to refactor, but if you change interfaces, you also have to change the tests...).

Could you elaborate on this point? If you change the behavior of a dynamic module A, i.e. its implicit interface, you'd have to change its unit tests to reflect these changes anyway...


Yes, that's correct. I was addressing the selling point of unit tests, that they give you a safety net to ensure your changes don't introduce new bugs. The claim fails when the interface changes.

In my experience, interface changes happen quite often. When you prevent them from changing, you end up with back-compatibility hell, that popular platforms like x86, Windows and Java have to maintain - and that's just for external, public interfaces.


Traditionally, academic research has been funded by NASA and the Department of Defense. In these contexts (space transport, war, etc.), the cost of a runtime failure is extremely high, counted in human lives or billions of dollars. Also, throughout history, most software was written once, shipped, then used; a hotfix was practically impossible. This is where the adage that "the earlier a bug is detected, the cheaper it is to fix" came from.

In those contexts, this is absolutely true. If you're in space, and your software decides to vent your oxygen, you're screwed. And so it makes sense to invent all kinds of static compiler checks that try to eliminate all possible bugs. Type-checking, for example, can guarantee the elimination of an entire class of problems.

However, what if your context were different? What if your context was a web startup?

Suddenly, the cost of a runtime failure is not so high. With a simple drop-in plugin like Hoptoad, I can be notified of an error, diagnose, fix, and deploy oftentimes before the user can even email me describing what problem he/she had.

In fact, it's much worse than that. As you said, "dynamic typing doesn't restore the modularity- it simply delays checking for violations". This delayed checking for violations has extreme value in the startup context.

Take, for instance, yesterday's post: Show HN: Hacker News Instant (3 hour project) http://news.ycombinator.com/item?id=2621144 If you were one of the first to try it, it wasn't long before you realized that typing a space in your search threw the app into an infinite loop, making it unusable. But I think a simple comment in the discussion thread summed it up perfectly: "I don't think the bug makes it any less noteworthy".

The fact that the author could create a prototype -- a minimum viable product -- in just 3 hours, meant that he could post it on HN and get feedback on it. He could get an idea of whether the app was worth pursuing... or better off scrapping to pursue something else.

In the startup context, you can think of writing software as sketching. You just write enough code to convey to people what your product is and how it can help them. The code may be completely broken, calling functions that don't even exist. It doesn't matter. If no one ever tries to use some specific edge case of one specific feature (or hell, your entire product), that code will never be needed. And so, the fact that it doesn't make sense, doesn't matter.

By checking for as many kinds of "violations" as possible early in the development process, you're forcing the developers to spend time making it all right from the start -- before release. In the space shuttle context, a runtime failure may have critical consequences. But in the startup context, failure to release on time may have critical consequences. Releasing before your competitor may make or break your entire business. Or maybe it's your personal project and life gets in the way and you just end up never releasing at all. Personally, a lot of the joy I get out of making software, is seeing people I know benefit from using it. But if I don't see that, I'm much more likely to give up on it entirely.

People say that when a bug is found, you have to fix it, regardless of whether you're doing static or dynamic checking. They then conclude that it's a no-brainer that you'd rather have this happen sooner than later. But no one ever talks about the cost of fixing bugs. Fixing everything up front, the way static checking forces you to do, unnecessarily increases your costs if that feature eventually gets scrapped. And in startups, this happens all the time.

In the startup context, there is often an excess of good ideas. The problem is, you don't know which one will be the jackpot and you only have enough time and money to pursue a small handful of them. One approach is to do a rough sketch of as many ideas as you can, see which ones start to get traction, and then flesh out the finer details on the ones that do. The ones that don't get traction are scrapped.

Now, if you were using a tool that said every part of your program must make sense and be free of violations before you can run it, you would have spent all that time fixing bugs in features (and possibly entire projects) that no one ever used. The opportunity cost of this is you were not implementing 10 other great ideas. On the other hand, if you were using tools that allowed bugs to be present along-side working code, you would be free to choose when to polish something when it was a priority to you.

Also, in government-funded projects for the space program, you essentially have all the time and funding you could want from the start. Your goal is to use that funding to eliminate all possible runtime errors, possibly pushing back the release to do as much of this as possible. For the most part, it's okay. You'll just get more funding.

However, in the startup world, it's the exact opposite. If you're a poor developer trying to make it big, you have no money now. But if you can prove your app is valuable and can make lots of money, then and only then will investors consider giving you money for it. Before funding, you're lucky if you have one full-time developer. Only after funding, when the project has to have already proven itself, do you have the funds to pay the developers you need to stomp out all the bugs.

It's not that static is better than dynamic or dynamic is better than static. It all depends on what you're doing. You have to choose what makes sense for what you're doing in the context you're doing it in.

The idealist in me wants to eliminate all bugs in code I write before release. And even more so in the code other people write. I totally get that. But the pragmatist in me (which only developed after having to write real production code for a real startup that pays my actual bills) knows that sometimes it makes sense to choose to not fix bugs. Static checks tell me "no, fix them now". Dynamic checks empower me to choose what I need.


Static typing doesn't have to force you to fix them now, just statically-enforced typing. An ideal (and achievable) language would be capable of checking types statically but still allow you to run programs that contain type errors.

I know virtually every static type system in use today doesn't do this, but it still bothers me when people just give up and advocate dynamic typing for this reason.


I find this absurd. I don't even call them bugs when I get compiler errors. They are typos. They are trivial to fix, and I'd much rather fix them at 3pm before I've checked in the code than at 3am when it's live or 5 minutes before a presentation. If you really want to sketch a function, just make a stub that throws. It will only be 3 lines long.


Indeed, Haskell has a standard library function for this:

    bar z = foo z + baz z
    foo = undefined
    baz = undefined


A trivial example is always trivial to fix. And you can raise an unimplemented exception in just about every language.

But what I'm talking about is usually when the code works correctly at first. Then you come along 6 months later to this code that you didn't write and know nothing about to add some new feature. Along the way, you change a few things. And that breaks something subtle. A parameter becomes nil that used to be something. Or a key in a map is no longer created.

Static type systems would say, "if a parameter or map lookup may be nil, you must wrap its uses with a case to handle it". Which makes a lot of sense in the scope of this one function.

But if I don't care about the case when the parameter is nil -- when it is, it usually means something much bigger is wrong or I simply don't care about having that feature working anymore -- why should I spend the time to track it down to handle it? It's not something trivial like a method is undefined. It's more like, a method is not defined on the specific instance of a class, whose methods get defined at runtime with metaprogramming, so it takes some tracking down. Sure when it's just you, and your entire code base is 300 lines, it doesn't matter. But when it's a large project and you have shit to get done in order for your company to get paid that day, whether or not your function handles the nil case that no one ever uses simply isn't so important.
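The case-wrapping that static systems insist on, as described above, looks like this in a small Haskell sketch (priceOf and describe are hypothetical names):

    -- Hypothetical lookup that may fail: the Maybe in the type is the
    -- static analogue of "this parameter can be nil".
    priceOf :: String -> Maybe Int
    priceOf "widget" = Just 10
    priceOf _        = Nothing

    -- The compiler refuses to let us use the result without handling
    -- the Nothing case somewhere.
    describe :: String -> String
    describe item = case priceOf item of
      Just p  -> item ++ " costs " ++ show p
      Nothing -> item ++ " is unavailable"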


This article confuses the reader with a weird implicit definition of modularity. The miscommunication is rooted here:

> Separate compilation allows you to compile parts of a program separately from other parts. I would say it was a necessary, but not sufficient, requirement for modularity.

The author goes on to assert that a requirement for modularity is that modules that depend on each other should be able to be compiled independently, without even a specification to glue them together. This just doesn't make sense. This might be a debatable point, but it has nothing to do with modularity.


Lack of types gives the illusion of more modularity.

As was mentioned -- there are still dependencies. The only difference is that, without types, checking whether the parts fit together is deferred until later -- but it will still be done. (All parts of software have to be connected, and all have to be consistent.)

And therefore you could argue that types are pro modular: because they allow you to write separate pieces without needing the whole to test if it is OK -- which you cannot do without types.


Existential types? The author alludes to this, and then proceeds in ignorance of what he'd just said for the rest of the post.

In Java, all you have for these is interfaces, so it's no wonder people think that types are problematic.

Types aren't the problem. Crappy types are the problem. Not making the types lightweight enough that people can dish them out at their pleasure, that's the problem. When the math don't work well enough, make better maths.


Except you can't unpack existential types, right? http://stackoverflow.com/questions/2300275/how-to-unpack-a-h...


Unpack in this sense means "unbox", and you can specialize before you hide the type.
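A minimal Haskell sketch of what that looks like (Showable and render are illustrative names):

    {-# LANGUAGE ExistentialQuantification #-}

    -- Packing hides the concrete type of x; all a consumer ever learns
    -- is that some Show instance exists for it.
    data Showable = forall a. Show a => MkShowable a

    -- "Unpacking" only grants access to the Show methods, never to the
    -- hidden type itself.
    render :: Showable -> String
    render (MkShowable x) = show x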


I disagreed strongly with this post and was about to post a rebuttal, but then I realized there actually is a situation where untyped languages enjoy a concrete, practical advantage because of the lack of compilation dependencies. That situation is when you are writing a module (something that would be a compilation unit in a typed language) that calls missing or nonexistent library interfaces, and you want to test parts of the module that do not depend on the missing APIs. In a typed language, you can't do it. You would have to satisfy the dependencies first so your code would compile, or you would have to copy snippets of your code out of the compilation unit into a REPL or another file so they could be compiled separately. In an untyped language, you can experiment with parts of the module and even start writing tests without having to make sure all of the module's dependencies are satisfied.

Aside from that one advantage, though, it's the same with types or without them. For a module as a whole, the compilation dependencies that exist in a typed language are the same as the runtime dependencies that exist in an untyped language. Whether you're using a typed or untyped language, you can write whatever code you want, but you can't do anything with it until you satisfy its dependencies. With that one exception, of course: in an untyped language, you can run code that doesn't depend on the missing APIs but happens to be in the same module as code that does depend on them.

Overall, I don't think that amounts to much, especially compared to a typed language like Scala where you can copy snippets into a REPL. Or even better, Ensime is an Emacs mode that is supposed to let you work with Scala the way Slime lets you work with Common Lisp. (I haven't tried it myself yet, but it exists, and people are using it for real work.)


From p. 3 of the paper "Why Dependent Types Matter":

"We have given a complete program, but Epigram can also typecheck and evaluate incomplete programs with unfinished sections sitting in sheds, [···], where the typechecker is forbidden to tread. Programs can be developed interactively, with the machine showing the available context and the required type, wherever the cursor may be. Moreover, it is Epigram which generates the left-hand sides of programs from type information, each time a problem is simplified with ⇐ on the right."


Yeah I really hated this post. Starts off with a flame-baiting title. Then resorts to very qualified statements to support it. After reading it I'm wondering if it matters at all and I'm feeling dumb all throughout the discussion. :-(

Update: Aarghh, it's just aggravating to read an academic stirring up the pot over some point of theoretical purity. And I love theory too. ;-)


It's the worst kind of theory: a theoretical point presented as a practical insight without any explanation of what the practical ramifications actually are.


This is how Gilad rolls.


I do not see a single reason why a strongly typed language's compiler couldn't have an unsafe flag to compile with undefined symbols, and to make your program fail, raise an error, or perform some other action when such symbols are evaluated or called.

And I understand why this behavior would be unwanted in production systems.


Note that there is an interesting counter-argument in the following HN comment:

http://news.ycombinator.com/item?id=2622491

This argument is also applicable to your concrete example, and explains how you can achieve modularity in a strongly typed language even in that case. The type system just has to be flexible enough to allow for module functors.

In other words, the original article's argument holds only for strongly typed languages whose type system is too primitive.


> (...) in an untyped language, you can run code that doesn't depend on the missing APIs but happens to be in the same module as code that does depend on them.

So this advantage is about writing code that isn't intended to be compiled or run yet. I can do that in a statically typed language too: I just put the code in a comment block.

Or am I missing something here?


Nope, you're not missing anything. That's another valid workaround, which undermines the practical advantage of untyped languages yet a little bit more.


If g returns an x and f takes an x, I can do f(g()) without knowing anything else about type x. Even type inference will choose an x that makes g compatible with f, or complain if there isn't any. So the system is still modular at the source level, which is important because it keeps the project comprehensible. What you don't have is binary-level modularity, but that's just an efficiency hack that dates back to when computers were almost too slow to run compilers. Now it should only matter to people who obfuscate by refusing to ship source (or source-equivalent intermediate files, which at least one Ada and one C++ implementation used to handle separate compilation of polymorphic code) and who live with the lack of whole-program optimization.
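The point about inference choosing the intermediate type can be seen in a toy Haskell sketch (f, g, and pipeline are made-up names):

    -- No type signatures on purpose: inference threads the type that g
    -- produces straight into f, without us ever naming it.
    g n = (n, n * 2)
    f (a, b) = a + b

    pipeline x = f (g x)
    -- GHC infers pipeline :: Num a => a -> a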


Why not structural typing?



