
Rust has a philosophy of easy machine and human parsing. ECMAScript-style `(…) => …` requires that you get to the => before you know whether the (…) is a parenthesised expression or function arguments. With `|…| …`, you know you’re dealing with a closure immediately because there’s no unary pipe operator.
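A minimal sketch of the difference: the leading `|` lets both the parser and the reader classify the construct from its very first token.

```rust
fn main() {
    // The leading `|` marks a closure immediately; no lookahead needed.
    let add = |a: i32, b: i32| a + b;
    assert_eq!(add(2, 3), 5);

    // In ECMAScript, `(a, b)` could open either a parenthesised
    // expression or an arrow function's parameter list; the reader
    // only finds out when (and if) a `=>` appears.
}
```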


What's the reason for including ease of machine parsing? Aside from the fact that it might make the compiler slightly easier to write, I guess.

I'm sure that code that's easy for machines and humans to parse is harder for humans to parse than code where humans are the first-class citizens.


Many languages require arbitrary lookahead in order to parse code. (Some are even worse, e.g. C++ requires a full compiler to parse it because of the question of whether `(a<b, c>(d))` is calling a generic function a, or doing two comparisons; and Perl is unparseable, you’ve got to run it before you can figure out how certain things should be parsed.)
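Rust dodged that particular C++ ambiguity by design: generic arguments in expression position need the "turbofish" `::<>`, so a bare `<` after an expression can only mean less-than. A small illustration:

```rust
fn identity<T>(x: T) -> T {
    x
}

fn main() {
    // Expression position requires the turbofish, so there is no
    // ambiguity between generic arguments and comparison operators:
    let y = identity::<i32>(4);
    assert_eq!(y, 4);

    // `identity<i32>(4)` would be rejected at parse time rather than
    // being read as two chained comparisons, as C++ must consider.
}
```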

Rust’s philosophy has been to avoid that, and make it simple to parse, because that helps tooling and people alike.

Most people don’t see why this is a big deal. Here’s why: humans parse code when they’re reading it in a very similar way to how computers do. This holds true of natural-language parsing, too, and is much better documented in that field. If it’s hard or takes longer for a computer to parse, it’s very likely to be hard for a human to parse as well.

If you have fat arrow syntax, you need to skip ahead on the line to find it to confirm that what you’re looking at is a closure. Pipes say from the very start “this is a closure”. This could be said to be why Rust uses the `let` keyword too, which is theoretically unnecessary (though you might need something to replace it in a small fraction of cases). Not only does it make parsing way easier for machines (in a way that makes extending the language grammar later much easier too, but that’s part and parcel of the parsing philosophy), it makes the intent of the line immediately clear to humans.
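Both points show up in a few lines (a toy sketch): each statement announces its intent with its first token.

```rust
fn main() {
    // `let` says "a new binding starts here" before anything else is read.
    let total: i32 = (1..=4).sum();
    assert_eq!(total, 10);

    // `|n|` says "this is a closure" before its body is seen.
    let doubled: Vec<i32> = (1..=3).map(|n| n * 2).collect();
    assert_eq!(doubled, vec![2, 4, 6]);
}
```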


I agree with avoiding lookahead etc.; however, that could be solved more consistently by looking holistically at the language design from the beginning. Considering Rust was a fresh start (similar to C#), I expected more consistency.

But maybe I should take a step back, considering my limited Rust knowledge :)


There is at least one compelling argument for making new languages easy for machines to parse: it makes it easier to write good supporting tools.

We don’t just use a compiler or interpreter any more. We also use editors and refactoring tools and debuggers and profilers and style checkers and source formatters and static safety analysers and diff tools and… If the developers of these tools don’t have to worry so much about the mechanics of parsing the source, that leaves more resources to spend on making each tool useful.

To some extent you could achieve the same benefit by making a library available to parse the source and return some sort of annotated AST or similar data structure. However, unless you can provide an easy way to call that library from every language someone might want to use to write a tool, avoiding unnecessarily complicated parsing still seems advantageous.


As you alluded to in your third paragraph, that is exactly the problem that "language servers" [1] aim to solve. The API is JSON-RPC based, which is callable from any tool. If every modern language shipped with a canonical language server, machine parseability would become a non-issue, since the parser would be the same as the compiler's.
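On the wire, LSP is ordinary JSON-RPC framed with a Content-Length header. A rough sketch of building one such message (simplified; a real client would use a JSON library and start with the full initialize handshake):

```rust
// Frame a JSON-RPC body the way the Language Server Protocol expects:
// a Content-Length header, a blank line, then the JSON payload.
fn frame_lsp_message(body: &str) -> String {
    format!("Content-Length: {}\r\n\r\n{}", body.len(), body)
}

fn main() {
    let body = r#"{"jsonrpc":"2.0","id":1,"method":"textDocument/definition","params":{}}"#;
    let msg = frame_lsp_message(body);
    assert!(msg.starts_with("Content-Length:"));
    assert!(msg.ends_with(body));
}
```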

For instance, VS Code already uses language servers to support autocomplete/refactoring for languages like Python and they're pretty zippy (I used to have concerns about speed, but in practice it's a non-issue). Those same language servers are also supported in Vim, emacs, etc. The editor itself doesn't need to know anything about the underlying language. And it looks like a Rust language server exists:

https://rls.booyaa.wtf/

In fact, to take the idea further, the goal of the Roslyn [2] project is to expose APIs to the compiler itself in order to provide services to external dev tooling. Imagine a third-party generic debugger being able to tap into compiler or runtime internals and provide services around them without explicitly knowing anything about the underlying language.

[1] https://en.wikipedia.org/wiki/Language_Server_Protocol

[2] https://en.wikipedia.org/wiki/Roslyn_(compiler)#Architecture


It probably won’t surprise you to learn that I had LSP in mind when writing my previous comment. :-)

I think there is a wider issue here than the (still very useful) scope of LSP, though. If you are building a tool that depends heavily on the semantics of the source language, beyond common operations like “go to definition”, you might need more than a standardised language server can provide, and then you’re back to the position I described before.


I had the sense it was on the tip of your tongue since the way you described it matched to a T, but I decided to flesh it out for folks who hadn't heard of LSPs. :)

You're right: at this moment, the LSP is circumscribed in that it doesn't expose the AST (even though it could), which means one is limited to the capabilities it does expose. The claim is that the goal is parity across languages, and exposing ASTs is contrary to this [1]. I'm hoping this philosophy is challenged. In the meantime, one might have to rely on languages themselves exposing their ASTs, e.g. Python exposes its AST via the `ast` module.

So yes, right now one wouldn't be able to write, say, an IntelliJ IDEA-type IDE based on a language server alone -- one would have to be able to create the AST oneself.

[1] https://github.com/Microsoft/language-server-protocol/issues...


> And it looks like a Rust language server exists:

There are two, actually. That one, and https://github.com/rust-analyzer/rust-analyzer which is quickly supplanting it.

The intention is that the compiler will eventually be a language server. Someday.


> What's the reason for including ease of machine parsing? Aside from the fact that it might make the compiler slightly easier to write, I guess.

Most of the designers have spent a long time writing C++, and C++ is famously extremely hard to parse for machines, and this has caused a lot of problems for compiler/language authors. It's normal for people to overcorrect when they have real trouble with something.

> I'm sure that code that's easy for machines and humans to parse is harder for humans to parse than code where humans are the first-class citizens.

I'm not sure at all about that. Making syntax that is easy for machines to parse mostly means that your syntax needs to be wholly context-free and low-lookahead. Both these features help people too, even if they result in more different kinds of characters in your source code.
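As a toy illustration of "low-lookahead" (hypothetical and vastly simplified, not a real parser): with Rust-style syntax, the first token of an expression is often enough to classify it.

```rust
// Hypothetical one-token classifier: Rust's choice of `|` means a
// single character of lookahead distinguishes a closure from a
// parenthesised expression.
fn classify(first: char) -> &'static str {
    match first {
        '|' => "closure",
        '(' => "parenthesised expression or tuple",
        _ => "something else",
    }
}

fn main() {
    assert_eq!(classify('|'), "closure");
    assert_eq!(classify('('), "parenthesised expression or tuple");
}
```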


What about `fn(args) => exp`? It could be simplified to `fn arg => exp(arg)` for simple cases.


That works for JavaScript, because functions are just functions.

But in Rust, closures are different from functions, though they all implement the Fn, FnMut or FnOnce traits. Functions don’t close over any state (so you can cast them to a 'static function pointer), while closures can. Using the fn keyword for closures would thus muddy the conceptual waters—though you could argue that the names of the Fn* traits have already done that—and potentially hinder future language design.
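The function/closure distinction is observable in code (a sketch): a non-capturing closure coerces to a plain `fn` pointer, while a capturing one does not.

```rust
// Takes a plain function pointer, which carries no captured state.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn main() {
    // Non-capturing closure: coerces to `fn(i32) -> i32`.
    let double = |x| x * 2;
    assert_eq!(apply(double, 4), 8);

    // Capturing closure: implements the Fn traits but cannot coerce
    // to a function pointer, because it closes over `offset`.
    let offset = 10;
    let add_offset = move |x: i32| x + offset;
    assert_eq!(add_offset(4), 14);
    // apply(add_offset, 4); // error: closure captures its environment
}
```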

I feel a stronger argument is that it’s also markedly longer and more visually noisy; `x.map(|x| …)` is four characters shorter than `x.map(fn x => …)` and five shorter than `x.map(fn(x) => …)`, and to some extent words are more distracting than symbols.

The pipe syntax isn’t perfect, but given the consistent philosophies of the Rust language, I am not aware of any better option. It’s definitely a fairly subjective matter, though.



