1ML – unifying ML into one language (mpi-sws.org)
175 points by g1236627 on April 22, 2015 | 60 comments


I know this is not popular. I think ML has a clean syntax and simple but powerful semantics. I enjoyed coding in it. It just looks better than OCaml. Haskell is, I think, even cleaner, but there is a tendency for developers to explore the possibilities available in Haskell and then increase the complexity.

It would be great to see the ML community unify and provide an offering that would give it the kind of support that we are seeing in OCaml and Haskell.


Are you referring to Standard ML, rather than the ML family of languages? 1ML is a new language in the ML family, rather than an extension of SML or OCaml.

Consolidating the module and expression languages in 1ML will lead to even cleaner semantics. From the abstract of the Andreas Rossberg paper:

> In this “1ML”, functions, functors, and even type constructors are one and the same construct; likewise, no distinction is made between structures, records, or tuples. Or viewed the other way round, everything is just (“a mode of use of”) modules.

Haven't had a chance to look at the demo yet, but hopefully functions/records/tuples can be sugared over syntactically into something resembling traditional ML. Otherwise we might end up with something like Java where there isn't much abstraction from the underlying OO mechanism, which makes code tedious and prone to boilerplate -- eg all functions (methods) must live in a class, even just to run main.


Every time I think of unifying various language features like this, I end up imagining that I'd just reinvent LISP and look like a fool...


That's often true within dynamically typed languages, but statically typed languages definitely don't feel Lispy. Moreover, messing with static type systems involves much more maths than hacking.


It is perfectly possible to write/use a lisp that is statically typed and still feels like a lisp. AFAIK the closest thing to this currently is shen http://www.shenlanguage.org/.

None of the things that make a lisp a lisp prevent static typing. Some common constructs could be hard to figure out the types for, but they figured out the types of transducers so I doubt it is impossible.


I think that attempt was called Scheme.


> explore the possibilities available in Haskell and then increase the complexity.

I'd caution against referring to all such explorations as complexity. Complexity is a highly overloaded term in our field. Sometimes it refers to the number of steps a given algorithm takes to compute (as a function of the input). Sometimes it refers to the depth and breadth of a program's syntax tree as well as its tendency to branch out and create cycles.

Sometimes it's mistakenly used to refer to concepts which are in reality simple but merely unfamiliar or non-intuitive. This last usage is a big problem for languages outside the mainstream which are trying to find better ways of writing software.


What the...? You know full well what kind of complexity the GP refers to, and it's not cyclomatic or algorithmic complexity, and definitely not unfamiliarity.

Is this a new kind of HN trolling? Instead of directly disagreeing with an argument on its merits, you call a commenter out on possibly ambiguous terminology and use that to dismiss the entire comment?

Your entire lecture is true, but it does not apply to the sentence you quoted at all. That sentence is about some Haskellers' tendency to sacrifice readability in favor of better static checks. He could've written "difficult to understand" instead of "complex" and his comment wouldn't have changed meaning, and your entire comment would've been moot.


> I'd caution against referring to all such explorations as complexity. Complexity is a highly overloaded term in our field.

The difference between complex & hard, easy & simple has been put very elegantly by Rich Hickey in Simple Made Easy [1]. Not everyone agrees with his definitions, which is why he revives the word "complected" to mean the objective interleaving of concepts, and separates "hard" from the way people use "complex" to mean something one is merely unfamiliar with. I like his definitions, so I use them. :)

> Sometimes it refers to the number of steps a given algorithm takes to compute

This can still create ambiguity since it could be either time or memory complexity, but it's usually still easy to infer, especially if there's a big O.

> depth and breadth of a program's syntax tree

Lisp overloads the parens for different concepts, which is complex. This could also be hard if one's not familiar with the syntax.

> tendency to branch out and create cycles

Sounds like time complexity!

> Sometimes it's mistakenly used to refer to concepts which are in reality simple but merely unfamiliar or non-intuitive.

This is the ambiguity: is he saying Haskell is complex because it has a lot of interleaving among its concepts that other languages do not? Or is it just unfamiliar? I would think it's simpler because it forces one to think about how time interleaves with the program, which could make things harder! I'm guessing this is what the grandparent means, since ML is impure. Either way, the claim is empty without examples.

[1] http://www.infoq.com/presentations/Simple-Made-Easy


Yeah, I've seen that presentation. Rich's ideas were what I had in mind when I wrote my reply.

In general, use of highly overloaded words is ambiguous in these discussions.


Accidental complexity is one of the biggest enemies of productivity in the programming world.


I taught myself OCaml first (err... 15 years ago? when there was some initial buzz around it) and liked it.

But then I discovered SML/NJ and I liked it more.

OCaml has an active community, and the language is nice. I think I actually prefer ML as a language.


I should add I also gave Haskell a spin and did not like it. While I can see its elegance, I found ML more readable and practical and easier to write. My eyes glaze over in Haskell, it's like it's missing punctuation or something to let me know what it's doing.


Totally agree with you on this one; I just looked into some Haskell code this week and there was too much black magic going on. I like ML (and more specifically OCaml) because it is more readable to me and easier to understand. Now this might change if I get more exposure to both worlds.


I too prefer ML (I tried SML/NJ) over OCaml, for it is very, very tiny. It almost pushed my sexp/Lisp fanaticism away for a while.


I couldn't agree more. The power / complexity ratio for SML is very high.


Not sure if this mirrors his approach exactly, but OCaml does now offer 'first-class modules':

https://realworldocaml.org/v1/en/html/first-class-modules.ht...


First class modules let you package up a module in the module level language into a value, so you can pass it around in the expression language. But 1ML's goal is to eliminate the separation between the module language and expression language. So, 1ML can do first-class modules, but first-class modules do not mean a syntactically un-stratified language.
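
For anyone who hasn't seen the feature, here is a minimal OCaml sketch (the signature S and module M are invented for illustration) of packing a module into a value and unpacking it again:

    module type S = sig val x : int end
    module M : S = struct let x = 1 end

    (* pack: turn the module M into an ordinary value of type (module S) *)
    let packed : (module S) = (module M : S)

    (* unpack: bring the value back into the module language to use it *)
    let y = let module M' = (val packed : S) in M'.x

The explicit (module ...) / (val ...) round trip is exactly the back-and-forth between the two layers that a unified language like 1ML would make unnecessary.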


Alice ML, an SML variant that Andreas Rossberg worked on in the mid-2000s, also has first-class modules.


Somewhat related: Bob Harper's Future of Standard ML [1]

[1] http://www.cs.cmu.edu/~rwh/talks/mlw13.pdf


Are the problems this is intended to solve problems that also affect F#?


F# doesn't even have modules; although it's derived from an ML, it has cut out rather important parts of it to make it fit into the .net ecosystem.


You're not wrong, but I may phrase it differently...

F# lacks modules, instead favoring objects. Although it's derived from OCaml, it has omitted complex features which don't fit into the .NET ecosystem.


But is there, say, something that (S)ML's modules allow which is not possible to do with F# + .NET objects?

I.e. ML's modules can be used to create interfaces and implementations of ADTs, but so can OO interfaces and classes.


I tried to build it (Ubuntu 14.04), but it fails (OCaml 4.01.0) after calling make:

    File "types.mli", line 11, characters 14-15:
    Error: Syntax error
    make: *** [types.cmi] Error 2


It builds with 4.02.


Reminds me of 1Malaysia, Malaysia's political rallying call to unite multiracial Malaysia :)


[flagged]


No. He does not mean unifying separate languages or dialects into one standard.

Rather, the point is that the existing ML language has two parts that are pretty unrelated: the normal value-level language where you write your actual programs, and the module-level language you use for organizing your code and types.

Think of it like the distinction between C proper and the C preprocessor. They're part and parcel of the same language in practical terms, but really they're two completely unrelated languages grafted into one. The ML situation isn't nearly as bad, of course, but the idea is the same: wouldn't it be nice to have one language for both?

There are some ad-hoc efforts to bridge the gap between the two (extensions for first-class modules), but they are not enough. His solution is a language design that starts from first-class modules and, roughly, ensures the module language is just a special application of the base language. It radically simplifies the design and exposes the minimal underlying abstraction, which is a typed variant of the λ-calculus called Fω.
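
To make the stratification concrete, here is a rough OCaml sketch (names invented for illustration): the functor below is a "function" in the module language, the definition after it is a function in the value language, and the two live in separate worlds even though they express similar ideas:

    (* module language: a functor, i.e. a function from modules to modules *)
    module MakePair (X : sig type t end) = struct
      type pair = X.t * X.t
    end

    (* value language: an ordinary function from values to values *)
    let make_pair x = (x, x)

In 1ML, per the abstract quoted above, both would be instances of one and the same construct.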


This actually doesn't have to do with that. They "propose a redesign of ML in which modules are truly first-class values, and core and module layer are unified into one language."

They're not trying to, say, unify SML and OCaml, they're trying to solve a problem inherent to ML itself.


Can you elaborate on why these are problems for ML?


One of the classic motivating examples (which is addressed in the 1ML paper) goes like this:

Both trees and hash tables can be used to implement map-like data structures. You could, in ML, have an abstract signature for Map types that gets implemented sometimes by a concrete TreeMap implementation and sometimes by a concrete HashMap implementation. HashMap would be a better choice if the expected number of entries is larger, and the TreeMap would be better if the expected number of entries is smaller, so what we'd like to do is write something like

    module Map = if size > threshold then HashMap else TreeMap
so that we choose what concrete implementation we want at runtime—but we can't really do that, because the language we use to talk about modules in ML is distinct from the language we use to talk about values. That is to say, the size > threshold part can't coexist in the same expression with the HashMap and TreeMap part. Some MLs have added the ability to wrap modules in values, so you can write this in OCaml:

    module Map = (val (if size > threshold
                       then (module HashMap : MAP)
                       else (module TreeMap : MAP)) : MAP)
but it's a bit awkward because of the explicit moving-back-and-forth between value-level and module-level, and the interaction between the two languages has some rough edges (which the paper explains more thoroughly, if you're interested.)

The motivation for 1ML is that we'd like to use the same language to talk about both modules and values. That way, we could write the first, simpler definition without having to worry about the fact that we're manipulating distinct 'things'. Of course, there are other tradeoffs involved in the 1ML solution, but it's an interesting, compelling experiment.


how does that end up working for typechecking? does the Map module end up having all the properties of the intersection of these elements?

The Map example seems pretty easily solvable via Haskell through typeclasses or the more standard OOP-y languages through interfaces, but am I missing something? Does SML not have the tools for this right now?


There are certainly ways of expressing that computation in SML—or in Haskell or other languages—but not in the language of modules. Haskell typeclasses would have a similar problem. Consider the following pseudocode:

    class Map m where {- some implementation -}
    
    instance Map HashMap where {- ... -}
    instance Map TreeMap where {- ... -}

    mkMap :: Int -> {- ??? -}
    mkMap size | size > threshold = newHashMap
               | otherwise        = newTreeMap
What is the type of mkMap? It won't actually type-check in Haskell. It can't be (Map m) => m, because it needs to have a concrete type. We could do some kind of type-level work, reflect the size variable into the type level, and have a type family that decides whether the resulting type is a HashMap or a TreeMap—but that has the same problem as the OCaml example, where we're using a different language (the language of type-level computation) to talk about something that'd be simple with value-level programming.

One place where you could write this easily is in a dynamically typed language:

    def mkMap(size):
        if size > threshold:
            return HashMap()
        else:
            return TreeMap()
That is exactly why this kind of research is interesting: we'd like to capture the flexibility of idioms that dynamic languages can sometimes afford us, with the extra safety and security guarantees of a static type system!


You don't need a dynamic language for that. In Java:

    <K,V> Map<K,V> mkMap(int size) {
        return size > threshold ? new HashMap<K,V>() : new TreeMap<K,V>();
    }


Thanks for the explanation, this is illuminating

I'm still kind of unsure why (Map m) => Int -> m isn't a valid type signature for this, though.


That type means something slightly different: (Map m) => Int -> m doesn't mean that it returns some Map type, it means it returns any Map type. We could write a function of that signature, but that would imply that the caller/context would choose what type of Map it returns (and so it would not depend on the integer argument.)

Think about what it means if a type variable has no typeclass constraint: the type t isn't just a specific unknown type, it's literally any type. Similarly, C t => t isn't a specific unknown instance of a typeclass, it's any instance of that typeclass. If we tell Haskell that we're returning an instance C t => t and then try to give some concrete type, Haskell won't allow it, because it wants that type to encompass any possible type that implements C, not just the one we happened to use.

One way of making the above code work in Haskell is to use an existential type, which expresses that we're talking about some (not any) map type, but we'd have to wrap it inside a different constructor:

    {-# LANGUAGE ExistentialQuantification #-}

    -- this expresses that a SomeMap constructor contains some kind
    -- of Map, but we can't tell which one:
    data SomeMap = forall m. Map m => SomeMap m

    -- We can now express this by hiding the choice of Map inside the
    -- SomeMap type:
    mkMap :: Int -> SomeMap
    mkMap size | size > threshold = SomeMap newHashMap
               | otherwise        = SomeMap newTreeMap
which is, again, a solution with some advantages and some disadvantages.


If I remember correctly, isn't this one of the problems Idris and Agda are good at solving?


We could in fact write mkMap in Idris with very little change, and give it a proper type:

    -- This might not be the best/cleanest way of writing it, as my
    -- Idris is a little rusty, but this will work:
    mkMap : (size: Int) -> (if size > threshold then HashMap else TreeMap)
    mkMap size with (size > threshold)
      | True  = newHashMap
      | False = newTreeMap
So yes, although again, other tradeoffs are made to allow this kind of radical expressiveness.


Thanks for pulling this out and explaining it; although I'm not familiar with ML, I get what the problem is here. Interesting!


Erasing that distinction enables some goodies such as first-class type constructors, which make it possible to manipulate types of higher kind. This, in turn, presumably enables things like Generalized Algebraic Data Types.


the monad example in 1ML is:

  type MONAD (m : type ⇒ type) =
  {
    return a : a → m a;
    bind a b : m a → (a → m b) → m b
  };

  map a b (m : type ⇒ type) (M : MONAD m) (f : a → b) (mx : m a) =
    M.bind a b mx (fun (x : a) ⇒ M.return b (f x)) (* : m b *)
Not too bad... though I wonder: if things are defined structurally instead of nominally, which monad instance are you going to get for a given expression? Haskell's rule of at most one instance per class and type means the compiler can figure out which monad instance to apply.

This looks more flexible, but also looks like it will require more annotations and explicit parameters.


If you combine this with [modular implicits](http://www.meetup.com/NYC-OCaml/events/222026251/), which are hopefully going to be added to OCaml soon, then you'll get something which is nearly as syntactically neat as type classes but more powerful.


More power in terms of flexibility, but the mechanism of implicits makes it less clear which instance is in scope for a given expression.

With true typeclasses, there is no ambiguity about which instance will be selected (there can only be one).

This distinction is often lost when comparing true type classes with their emulation via implicits.

Sometimes "more power" is not what you want i.e. this is more of a tradeoff.


The proposed OCaml implementation of modular implicits[1] considers ambiguity a compile-time error, so in that case you'd have to manually indicate which module you're passing in. (This is in contrast to Scala, in which there's an elaborate mechanism for resolving ambiguity in implicits which makes it hard to know which is being selected.)

So the OCaml implementation shouldn't make it any more difficult to discover which implicit is in use in a given context, because if an implicit is used, then it must necessarily be unique and there will be no ambiguity.

[1]: http://www.lpw25.net/ml2014.pdf
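
For the curious, here is roughly what the proposal's examples look like (proposed syntax from the paper, not part of stock OCaml, and reproduced from memory, so treat the details as approximate):

    module type Show = sig
      type t
      val show : t -> string
    end

    (* declaring a module "implicit" makes it a candidate for resolution *)
    implicit module Show_int = struct
      type t = int
      let show = string_of_int
    end

    (* {S : Show} is an implicit module argument, filled in by the compiler;
       if more than one implicit module matches, it's a compile-time error
       and you have to pass one explicitly *)
    let show {S : Show} (x : S.t) = S.show x

    let () = print_endline (show 5)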


The section "6.5 Modular type classes" was very interesting, although they did mention some restrictions as well.


You have to explicitly pass the monad "vtable". If 1ML also grabs up modular implicits then you'll have a sensible way of passing implicit arguments, not unlike Haskell's, but modularity and global canonicity are at odds, so you have a more complex reasoning task.


OCaml already has generalised algebraic datatypes as of 4.00:

http://caml.inria.fr/pub/docs/manual-ocaml-400/manual021.htm...
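
For a quick feel of the feature, here is a minimal OCaml GADT sketch (a tiny typed expression language, names invented for illustration):

    type _ expr =
      | Int  : int -> int expr
      | Bool : bool -> bool expr
      | Add  : int expr * int expr -> int expr

    (* the result type of eval follows the index of the GADT *)
    let rec eval : type a. a expr -> a = function
      | Int n      -> n
      | Bool b     -> b
      | Add (x, y) -> eval x + eval y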


The idea is not to unify the different dialects of ML, but to unify the "core" and "module" sublanguages of ML.


Sick of seeing that thing, "middlebrow dismissal" that it so often is.


And it's as retarded as always. Of course no one should try, because it might not work. Also, as others pointed out, this is the wrong context.


Scala, whose modularisation was strongly influenced by ML, already offers most of what 1ML is trying to do.


Could you justify this assertion? Here[1] is an analysis by Andreas Rossberg that indicates that this isn't the case.

[1] http://stackoverflow.com/a/23019436


I was thinking about the statement

   1ML is a user-friendly surface syntax for System Fω 
in the paper's abstract. I think Fω lives inside Scala. But, as the Stack Overflow answer you cite shows, I should have been more careful in my statement.


Scala is complex, and SML is pretty simple. (I haven't studied 1ML beyond the elevator pitch, which claims it's "minimal and uniform".) A very complex language with a large installed base can't offer simplicity.


Even more than being complex, it is also a JVM language. That has both benefits and drawbacks. Interaction with the Java language and runtime brings complexities to it.

Granted, I haven't used Scala since, err, 1.7 days? I think. I liked it, and did a lot of work in it in fact, but more as an alternative to Java.

If I didn't have to worry about compatibility with Java or the proven nature of the JVM, or satisfy sysadmins and management with the orthodox Java-ness of my runtime, I'm not sure I'd pick Scala.


I'd never have left the ML world had it not been for the lack of an ecosystem that's comparable to what the JVM has to offer. And my Scala programs mostly just use ML-like features, but I can't live without all those rich libraries that are lacking in ML.


Yeti is pretty nice if you have to work on the JVM: http://mth.github.io/yeti/


That project looks dead; the last commit was November 2013?


The last tagged release was November 2013. The last commit was this week.


Last time I looked into OCaml it seemed to have most of everything I'd need for most projects.

But I think it'd be hard to find an employer willing to pay me to work in it.



