A more practical reason, speaking as a language designer, is that C and C++-style type-before-name syntax is a nightmare to lex and parse, as you can't tell whether
A * B;
is a multiplication or a variable declaration, or whether
A<B, C> D;
is two comparisons joined by a comma operator or a templated variable declaration, without first knowing the names of all declared types.
This means in practice that you have to declare types before they are used in a file (which means forward declarations if they are defined later), that you can't separate lexing and parsing because the parser has to provide constant feedback to the lexer, and that misspelling a type name can lead to a syntax error! C++, not content with merely inheriting C's problems, throws in the “most vexing parse” as a bonus: `A B(C());` declares a function named B, not a variable constructed from a temporary C.
It's a historical problem in C. Originally, C had only built-in types. Structs were declared with
struct foo { int x; int y; };
which is easy to parse. Then came typedef. With user-defined types,
foo*bar;
isn't parseable with a LALR(1) parser until you've seen the definition of "foo".
Even the ordinary case
foo bar;
needs more than one token of lookahead to parse.
This is a headache for compilers, and a huge headache for anything that wants to work on single source files without seeing the included files.
It's a big win if files are parseable without their dependencies. The Pascal/Modula/Ada family all are. I think Go is. Not sure about Rust, but it probably is. C and C++, no.
(The hacks for template syntax in C++ are painful to think about.)
Yeah, the easiest way to do it is to return a different token type from the lexer for type names, which means the lexer has to look identifiers up in the symbol table as it sees them.
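As a rough sketch of that idea (hypothetical token and table types here, not any particular compiler's internals), the classifier boils down to a symbol-table lookup:

use std::collections::HashSet;

enum Token {
    TypeName(String), // identifier the symbol table says names a type
    Ident(String),    // any other identifier
}

// The classic "lexer hack": the lexer consults the symbol table that the
// parser maintains, so the same spelling lexes differently depending on
// which typedefs have been seen so far.
fn classify(name: &str, declared_types: &HashSet<String>) -> Token {
    if declared_types.contains(name) {
        Token::TypeName(name.to_string())
    } else {
        Token::Ident(name.to_string())
    }
}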
Amusingly, the Delphi dialect of Pascal has an ambiguity in its grammar from how it made function pointers work:
function g: Integer;
// ...
f(g);
Depending on the definition of f, this could be passing the function g by reference (as a function pointer), or passing the result of calling g.
Delphi looks at the argument type to resolve the ambiguity, but it stays awkward in overload scenarios, when there's a choice between Integer and function-pointer arguments.
At some point around about 2009, I added the ability to specify explicitly that you want the call, to eliminate the ambiguity.
Rust has made a point of being parser-friendly, to support tooling. The entire reason the turbofish ::<> exists is to avoid ambiguous grammar. (This operator seems to vex people, I don't get the hate.)
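For anyone unfamiliar: without the `::`, generic arguments in expression position would collide with the comparison operators, which is exactly the C++ `A<B, C> D` ambiguity again (example mine):

fn main() {
    // `(0..5).collect<Vec<i32>>()` could parse as chained comparisons,
    // so expression position requires the turbofish form instead:
    let v = (0..5).collect::<Vec<i32>>();
    assert_eq!(v, vec![0, 1, 2, 3, 4]);
}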
Being parser-friendly is not at all the same thing as being user-friendly. The difficulty of writing parsers is way overblown.
One thing to avoid is depending on the existence of the symbol table to parse correctly. (C++ has this problem.) Needing a symbol table makes life hard for tools like source-code formatters and syntax highlighters.
Another is to avoid raw string literals that reach back and try to unwind the results of previous "phases of translation".
Agree with a caveat: if the syntax has many ambiguities that are resolved by a clever parser, it'll be confusing for users, especially newbies, because they'll tend to get errors unrelated to what they were trying to achieve.
Both humans and computers parse code. Humans can handle ambiguity more easily than computers, but that doesn't somehow make the ambiguity user-friendly.
Do note that Rust the language is regular and easy to parse, but rustc the compiler actually performs bounded lookahead and contextual parsing on parse errors to detect common typos and invalid syntax that would otherwise be vexing to users. The nightly-only type ascription `:` syntax is a good example, because it crops up in several different contexts as a one-substitution typo and used to be terrible (may it burn in a fire). The turbofish is another case where we now suggest the correct syntax, but there are many, many more.
Surely that's an example of how parser-friendly syntax empowers tooling? I imagine that it would be much harder to guess intention and fixes from lookahead and context if those were already necessary to make sense of correct code?
I love the turbofish so much, not because it looks nice in code or anything, but because the name is a bit of harmless fun that makes the language feel inviting.
Ah, C not originally having typedefs explains why the syntax wasn't thrown out as too annoying to begin with. Syntax creep I guess.
An easy solution to the problem would be to do what e.g. Haskell does, where the capitalization of a name determines whether it is a type. For better or worse, C didn't do that.
On the other hand, doesn't forward declaration make sense? The code is declaring that it depends on something defined “somewhere else”. In a sense it is documenting the classes/structs better.
Years ago, I stumbled upon a Visual C++ 6.0 bug (if memory serves me right), just by playing around trying to understand C/C++ decls. It would crash on a stray:
*c;
at the beginning of a translation unit.
Didn't K&R have a tiny, tiny C declaration parser (printing out 'human readable' equivalents) as an example in their book? I think the caveat was that it needed to assume it was dealing with a declaration...
I was just thinking about pointer syntax today, and if you ask me, there are a lot of problems that could be avoided if language designers took a page out of Guido van Rossum's book and extended his idea of forced spacing.
Take the first example you've given, and let's just talk about variable declaration:
int* a;
int * a;
int *a;
That's all the same. That's wrong. Obviously, you can write a lexer and parser that doesn't give a crap, but the human mind does. If you think it doesn't, that's only because you've internalized the various cases.
It should be this:
int* a;
The type we're talking about is a pointer to a variable of type int: in other words, "a" is an int-pointer. If you were to create a macro, you'd do something like this:
#define int_ptr int*
Do you see what I'm getting at? The star in this case is a suffix, equivalent (in our minds) to "_ptr". Conceptually, it doesn't belong anywhere else than attached to the type. It's a compound type, conceptually.
Now, take the star being used in a different context:
int b = 10;
int* a = &b;
printf("%d\n", *a);
There, though we see the same character, it's a completely different thing. It's a dereference operator. Conceptually, it belongs attached to the pointer variable it is dereferencing; and since the star is already used in one context as a suffix, here it should be used as a prefix.
This is no good:
printf("%d\n", * a);
It doesn't matter that "this compiles." That's not what this is about.
Many C programmers (and programmers in other languages, even Python) are used to writing things like this:
int c = x*y;
That's wrong. Sure, the lexer and parser don't care. But that makes the language worse, for the human operator. "But it saves space!" Spare me.
The thing with this one example, using the star, is that what we have is the equivalent of a homonym. We have one sign that is actually three different words. Mandating spacing removes the ambiguity you're complaining about.
C is what it is, but if we imagine someone were going to write it today, they should incorporate the above and mandate spacing. For the sake of the humans. "ident type" is not the only solution.
There's a perspective that doesn't seem to be noted yet, which is that
int *f
is declaring that
*f
will be an int.
This perspective addresses why the star belongs with the name, why it's star instead of ampersand, why you need to repeat the star for multiple variables, why the brackets go after the name for arrays, and it will sort of get you where you need to go with function pointers (although there's an automatic promotion which means things will work at the call site that won't work for the type).
This isn't to say alternative constructions mightn't be a better choice in a new language, but it's much more parsimonious when considering C than memorizing a bunch of special cases.
I don't think that actually breaks things. A typedef isn't purely syntactic substitution like a macro.
If we have
typedef float *floatp;
then the following compiles:
float x, y;
floatp xp = &x, yp = &y;
while the following does not:
float x, y;
float* xp = &x, yp = &y;
You have defined a new type (which happens to be equivalent to an old type - note that we don't get nominal type-checking from typedefs); the above logic still holds.
People have been arguing about where the asterisk should be for a very long time. The main counter-argument to the construction you used is that multiple variable declaration immediately looks ugly given the current rules of C variable declaration:
int* a,* b; // ??
And that's usually why most C style guidelines use
int *a, *b;
Now, if we treated "int<asterisk>" as the full type name in the syntax then you get a much nicer result:
int* a, b; // much nicer!
Though that does have its own tradeoffs. But that would require actually changing the C syntax, at which point you might as well make it the postfix syntax.
I must disagree... the following sends the wrong message to the reader:
int* a, b;
Though I also agree that dereferencing is best without a space. Perhaps the correct suggestion is to not declare pointers and instances on the same line, but that's a convenience that many seem to enjoy.
C is just riddled with mistakes and this one finally gelled with me. Python taught me that whitespace is great for syntax and, in this case, I'm quite convinced that it should be used more firmly in most languages.
Python's syntax works against a lot of things, like first-class lambdas, case statements (yes, they can be emulated with if/elif, but it's not the same, even without fallthrough), pattern-matching in general, or assigning the result of a method-chaining pipeline to a variable.
I use python at $dayjob, and I run into the limitations of whitespace as syntax all the time.
I think a better (subjective of course) lesson might be to enforce a style.
I have been thinking about making a toy compiler (I wanted to write a borrow checker) that treats bad code as an error, solely aimed at numerical code - I have recently had "Scientific Programming in Python for physics etc." inflicted on me.
Slight tangent, but I think if Haskell enforced some kind of whitespace a la Python it would be much more approachable in real codebases. (Haskell is usually quite readable if you are just translating mathematics into code, but to me at least it feels dreadful as a productive language, because of the way a lot of functions seem to be dumped into the text editor in a lot of the code I have read.)
Not really relevant to the broader discussion, but Haskell's whitespace sensitivity is defined in terms of automatic insertion of braces and semicolons, and you can write it that way instead if you want. A few people do write Haskell that way (SPJ maybe?) but the community is mostly united around using whitespace. That said, it's useful to know about the braces and semicolons if you're generating Haskell code, because that's often easier. Also occasionally at the GHCi prompt, where it will let you keep things on one line.
If I'm describing a hypothetical language, then I'm actually describing a hypothetical language where you wouldn't declare more than one variable per line.
It's been a long, long time since C required a programmer to declare his variables at the top, and best practice argues that you declare a variable as close to its use as possible. So, there really isn't the same case for multiple variables on one line, whereas it may have been a little more forgivable, once upon a time.
Moreover, when I was first learning C, I learned from an O'Reilly book by Steve Oualline. I remember him saying, when it came to operator precedence, that coding style that relied on the rules was a really bad idea. I think he said something like, "Multiplication and division come before addition and subtraction, and use parentheses for everything else."
My bottom line is C has a little too much "convenience" to it. (Granted, to my taste.) It goes back to my van Rossum comment. I'm against special cases and loosey-goosey stuff.
The problem in C is more than just the `*`. Consider arrays, where C uses `int xs[];` instead of `int[] xs;`. There is no way to use #define to declare an int_array type like you did with int_ptr. (Function types also exhibit a similar problem.)
foo*bar; // multiplies foo and bar
foo* bar; // declares bar as pointer-to-foo
foo *bar; // declares *bar as foo (so bar is still a foo*)
foo * bar; // multiplies foo and bar
These are all unambiguous and there are only two semantics between them. You also have:
foo* bar,baz; // baz is a pointer
foo *bar,baz; // baz is a foo
foo *bar,*baz; // baz is pointer again
foo* bar,*baz; // baz is now a pointer *to* a pointer
This is all obvious - or at worst unambiguous - to a person reading the actual code (without trying to correct for the idiosyncrasies of a parser), so the language should either match that or spit out a warning about unsupported spacing.
A pointer to an int "should" be &int, not int*. That we use *, the dereference operator, to indicate pointers is wrong. * in a type means it's an address, but * in a value means it's not an address. That's nuts! Make it consistent and use & in both places. If you need to have a distinction between refs and pointers, it should be that pointers are nullable refs: &int? or some such.
I think you may be right, &int would be better than int*.
I did not want to stray too far from what C does now, for the sake of argument. I just wanted to say that the ambiguity in C is often because it's so loosey-goosey with whitespace.
More generally this falls into a discussion of context-sensitive vs context-free grammars, of which C++ falls into the former and Java into the latter.
Pretty much every programming language has a nicely parseable context-free "rough syntax" (my term I just invented) that can be written down formally for the language documentation and a parsing tool. And then every language also has a notion of "well-formed programs", which introduces a whole bunch of additional constraints on what programs should actually be accepted by compiler frontend.
Well-formedness includes type checking. But even without full type checking that can be done later, it also includes things like being aware, in C, of whether a given identifier is declared as a typedef in the current scope. So while C has a nice context-free "rough syntax" formally specified in the standard, its actual input language is context sensitive.
As for Java, the first example that comes to mind is that constructors must have the same name as the class they belong to. This "choose whatever identifier you like, but at some later point repeat that exact same identifier" is a very typical example of something that is not context-free.
You might disagree whether this constraint is part of what you consider Java's "grammar". So the answer to your question depends on what language level you are thinking of. But whichever level you apply to Java, you should apply the same to C++. C++ also has a context-free "rough syntax" in its standard.
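A Rust flavor of the same distinction, as a made-up example: the line below is perfectly fine "rough syntax" and gets past the parser; it is only rejected later, by the type checker.

fn main() {
    // Parses fine under the context-free grammar; rejected afterwards by
    // type checking (error[E0308]: mismatched types), not by the parser.
    let x: i32 = "hello";
}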
It's Turing complete because it could simulate a Turing machine, not because of metaprogramming. The language brainfuck is Turing complete, for example.
Checking if a Brainfuck program is well formed (i.e. can be run) is a linear time operation. In C++ this can take forever. They have different complexities.
The original comment was about Turing completeness, and it was defined incorrectly. I was giving an example of a dead-simple language that was Turing complete, because the claim was that metaprogramming made C++ Turing complete.
The claim wasn’t that C++ is Turing complete, that’s trivially true. The claim was that C++’s grammar is Turing complete. I don’t know if that’s exactly the right way to phrase it, but C++’s template expansion stuff is Turing complete.
Yes, but it is the difference from other programming languages.
In C you cannot encode a Turing machine that is executed by the compiler at compile time. In Brainfuck you cannot encode a Turing machine that is executed by the compiler at compile time. In C++ you can encode a Turing machine that is executed by the compiler at compile time.
No. You would need the ability to write unbounded loops or unbounded recursion. You don't have that with the C preprocessor.
Yes, you can do a lot with the C preprocessor. You can also do a lot in languages that only have bounded loops and are therefore not Turing complete. You can either express nonterminating computations (Turing completeness), or you can't (still powerful, but dramatically less poweful). This question is binary. There is no fuzziness, there is no approximation, there is no "quite close".
Context-free means something different from what you're saying here. This is a good discussion of this topic; I never knew C++ was so irregular and informal.
Arguably a programming language is ultimately a user interface, and the more intuitive the interface, the better.
In 2020, not sure we should care that much about how hard compilers have to work to achieve this. Computers and software are here to support us--we're not here to support them.
A parser tends to get confused in the same places where a less experienced human would get confused. Making a language that's easy to parse dovetails with making one that's easy to read.
That's a reasonable point. As long as it's the "good for humans" that's driving things, this makes total sense. It's "good for computers" but "bad for humans" that needs to go away these days.
C++ is stuck in this trap where because it's slow to compile, the compiler maintainers increase the amount of optimizations the compiler does to make it faster. Which of course makes the compiler even slower. Which motivates them to increase the amount of optimizations. Which makes the compiler yet slower.
I feel part of the problem with Rust is it doesn't have a quick-and-dirty mode that's fast and a production-ready mode that is slow but does all the checks. I do this with my C programs, using various formal analysis tools, which are slow, to vet the code before releasing it.
Checks in Rust are fast. In fact, you can run 'cargo check' to type-check without generating code, and that can finish in less than half a second.
Most editors did this on save before LSP came along.
For logic, this is fast enough because the type system catches enough mistakes and I don't need to run tests all the time.
For UI though, a faster iteration cycle would be nice...
The thing that will massively speed up rustc is the current re-architecting of it. We’re at the point of “few percent here, few percent there” with the current design. These add up over time, of course, but batch compilers are inherently slower than the newer style ones (after an initial compile).
To add to Steve’s point, Rust adds more checks in debug (non-release) mode, which can make the IR larger. So it’s not always the case that debug mode is faster to compile (though it generally is).
I'd consider compile time as part of the UI, so in that sense I think we agree. (If the compile is long because the implementation is poor, that needs to be fixed.)
Not sure what you mean on the dichotomy. If someone says that a language needs to have X because that will make things simpler for the computer, I say that they are wrong. The goal, the only reasonable goal, is to make things better for humans.
> If the compile is long because the implementation is poor, that needs to be fixed
The compile time could be long because the implementation is poor. But it's also possible that the specific requirements do not allow for a significantly faster compile time.
That's why the requirements matter. They determine the space of possible implementations. If the requirements eliminate all "fast" implementations, then the resulting user experience will be poor because of slow compile times.
The best example of this problem is probably SPARK, a variant of Ada which permits formal verification.
Its verifiers are awfully finicky, and tuning the parameters (including selecting the most appropriate verifier) can mean the difference between successful completion in a few seconds, and outright non-termination/timeout-with-failure.
It's true that the answer is to have better verifiers, but that's not just a matter of tweaking the verifier code, it's a serious research challenge. One of the most serious problems with formal methods is the ability to scale.
`int * A` and `A * int` both seem ambiguous unless a) you require "int-pointer" naming instead of "pointer-int", and/or b) you have a separator (like the title uses) so this becomes `A * B` vs `A: * B` which is indeed unambiguous but in a very different way.
i.e. without extra rules you can't tell either way, so it comes down to the extra rules. `A * B` is unambiguous for type-before-name if you require "pointer-int" since it would need to be `* A B` for A to be a type.
(quite possibly there are counter-examples when you get into weirder corners, but my point is that it's not as simple as presented)
in an unusual way. If it were a multiplication, the result would be thrown away, which would be pointless. Hence, it is a declaration.
But what if A overloads the * operator and the side effects are desired rather than the result? D has a philosophy that arithmetic overloads should be for arithmetic-like operations, not I/O, template metaprogramming, or other nonsense. Hence if you try it like that, too bad, so sad, it'll be treated as a declaration.
I wouldn’t mind if such a statement in isolation were a compiler error, and if languages in general were much stricter. So far, all languages I have seen that are very forgiving in terms of syntax (PHP comes to mind) seem to breed sloppy programmers, and software written in them often has stupid little bugs caused by typos.
This particular issue has not caused any "silly typo disease" problems that I'm aware of in D. In fact, pretty much nobody notices it, it just works the way people expect it to.
(There are some other things in D that are designed to discourage trying to overload arithmetic operators for non-arithmetic purposes. For example, < <= > >= cannot be overloaded individually, only as a group.)
Strong disagreement here. "type ident" flows with the data during assignment, doesn't confuse the infix operators, and doesn't misuse ":" from a human-language standpoint.
For example:
val x: String = "hello"
The type interrupts the flow of data from "hello" to x, so one thing that pops into mind is that this is typecasting the value to a string before storing it. Nope.
Another possibility I instinctively see this as is doing a comparison and assigning the result (either true or false in this case) to x. Nope.
And human-language wise, colon is "description: explanation" (or more generally: general to specific), which actually fits this syntax better:
val String: x = "hello"
...and at that point, just remove the extraneous stuff:
String x = "hello"
> The type interrupts the flow of data from "hello" to x, so one thing that pops into mind is that this is typecasting the value to a string before storing it. Nope.
> Another possibility I instinctively see this as is doing a comparison and assigning the result (either true or false in this case) to x. Nope.
You can write it as
val x = "hello" : String
if you prefer. In fact that's a great advantage of this syntax: any expression can be optionally ascribed with a type. If you write the type first then it becomes too intrusive (and too much like a typecast, which absolutely should be intrusive).
> And human-language wise, colon is "description: explanation"
True enough, but what other syntax would fit in postfix position? In human language we'd probably use commas ("Bob, chef"), but that seems a bit too ambiguous in a programming language.
I agree, though I could see the merit for a standalone declaration:
val x: String
x = "hello"
The type at this point is almost like a comment.
For declaration and assignment though, I agree that reading "ident: Type" is harder for me.
Perhaps an interesting idea would be to have the type at the end of the expression. Like so:
val x = "hello": String
Essentially, you're making a type assertion on an expression. Since it's an assignment expression (the value of which would be the assigned variable), it also type-checks the variable.
Most statically typed languages don't even need the type assertion in a case like this, though. A literal has a definite type (hopefully), so the type of x can be inferred.
val x = "hello"
Standalone declarations are the most important problem to solve here.
Yeah, type inference is preferred (and pretty common). I'm just saying that, if I had (or wanted to) set the type, at the end of the expression would be my preferred place.
Your examples have the things on the right and the type on the left, and then you used them to describe why examples with the [name] on the left and [type] on the right make sense...
The point of the list isn't to say "eggs and bacon is a kind of breakfast", it's to say "breakfast is eggs and bacon". It's "name: details about name", not "type: instance".
My intention was solely to counter the idea that using ":" to mean "is" or "is a" is somehow inconsistent with written English. I don't think it's always necessary or even desirable to match the use of symbols in programming with English orthography, in any case.
The use of a declaration with initialization as the example here muddies the water. Whatever syntax you use has to work for uninitialized declaration of variables, function arguments, and structure members:
val breakfast: String
fun serveBreakfast(breakfast: String)
struct MealPlan {
breakfast: String
}
Declaration with initialization just needs to be consistent with these.
Having a keyword for variable initialization makes parsing easier and less ambiguous. "let", "var", etc. are also used for this purpose, but I was going along with the example.
It is presented here as a matter of fact that name-before-type is easier to read. I’m not so sure. In math, or in languages where type info is optional, we often write “x = 5”. When type info is required, it is natural to evolve to “int x = 5”. Readers naturally focus on the latter part. When we write “x: int = 5”, the type info is in the middle. We cannot skip it even when we just want to focus on the name and value.
Many languages allow you to elide the type, which is another nice thing about the type following the identifier.
In Scala, in particular, types are not the assigned type like in C (where they also serve as the storage specification) -- they are assertions, that the compiler will check are compatible with the code.
So `val x: int = "hello"` is no good and the compiler can cut it short right there; this is especially useful as call-site documentation.
Conversely, a lot of languages will infer type from first assignment, so in e.g. TypeScript "let x = 5", x is inferentially typed as 'number' and the type checker will throw if the implicit constraint is later violated. This reduces the need for explicit type annotations, clearing up a lot of the visual and cognitive noise.
There is still a distinction between primitives and object types in Scala, so it's not correct to say that types in Scala aren't used for storage specification.
It's also the case that there are plenty of languages where all types imply storage specification and they support plenty of type elision.
> Sadly it's nowhere near as elegant for string literals.
What do you mean? For string literals you just do
let foo = "bar";
and that's it.
BTW, in Rust you can omit most variable type annotations since the compiler is able to infer them. You have to give type annotations to functions though.
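A small illustration of where the annotation can be elided and where it still earns its keep (my example):

fn main() {
    let x = 5;              // inferred as i32 (the integer default)
    let mut v = Vec::new(); // element type unknown at this point...
    v.push("hello");        // ...so v is inferred as Vec<&str> from use
    let n: u64 = 5;         // explicit annotation to override the default
    println!("{} {} {:?}", x, n, v);
}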
> I’m more fond of using .into(). It requires adding type hints in some cases, but for most cases it is shorter than the alternatives. Especially when passing a string literal to a function that requires String.
> Now that specialization for str::to_string() has landed, we can safely say that to_string() has the same performance as to_owned(), and thus to_string() should be used since it’s more clear
> I now strongly prefer to_owned() for string literals over either of to_string() or into().
You could argue that Rust strings are complex, because they are: having 8 "string-like" types (owned strings vs string slices (references), CStr/CString, Path/PathBuf, OsStr/OsString) is complex, but having three different methods doing exactly the same thing isn't.
Yes it's redundant, but what would you rather have: str as the only easily convertible type, with no to_string method? The only reference type without to_owned? No ability to use into for strings while it works everywhere else? Obviously, redundancy is better than these alternatives.
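Concretely, the three spellings under discussion all build an owned String from a literal; these are standard library methods, shown side by side:

fn main() {
    let a: String = "hello".to_string(); // via ToString (specialized for &str)
    let b: String = "hello".to_owned();  // via ToOwned on str
    let c: String = "hello".into();      // via From/Into; needs the annotation
    assert_eq!(a, b);
    assert_eq!(b, c);
}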
Interestingly, I find ident: Type significantly more difficult to read. Having the type information helps me contextualize what I'm about to read -- it narrows the mental search space I need to explore when parsing the name.
For example, knowing something is a float, double, int, or string can make an ident named "releaseTime" mean different things.
I also find that whitespace is more consistent when using Type ident, you get rivers where the spaces all line up, so all the type declarations AND ident declarations align. Whereas with ident: Type, I find it much more difficult because of the variable length of identifiers. (Yes, one could fix this by using tabs, but if idents vary in length by more than one tab stop, it becomes difficult to read horizontally.)
This feels a little nitpicky/idealistic, I don't think the post does a good job of conveying why it's more beneficial.
> This means that the vertical offset of names stays consistent, regardless of whether a type annotation is present (and how long it is) or not.
Why is this necessarily desirable? Strong typing systems have very expressive types, to the point where if something is typed correctly, most of the time my property names are just an alternative casing of the type. Types can be just as expressive or even more expressive than variable names.
> The i: Int syntax naturally leads to a method syntax where the inputs (parameters) are defined before the output (result type), which in turn leads to more consistency with lambda syntax (whose inputs are also defined before its output).
Maybe this is nice in theory? But `Int` really isn't an output here, and the value being assigned isn't either. Rather this seems more like `f(i, Int, value) -> assignment`. It seems just as arguable that `f(Int, i, value) -> assignment` is appropriate.
It seems like some of these are rooted in a "pure mathematical" approach which I can surely appreciate, but ultimately lambda calculus is as much a language as any other programming language, saying "lambda syntax does it this way" doesn't convince me very much.
I've been using Rust a lot recently, which puts names before types and inputs before outputs, and I will absolutely attest to how much mental work is saved by ordering things this way. Skimming or reading Rust comes twice as easy as reading Java, and I do a lot of both. Sure it's an anecdotal report, but I have a real sense here that I feel compelled to report.
As other posters have stated, this order makes parsing easier. But I also suggest this benefit extends to your own brain's parsing ability as well. The old order is indirect and suboptimal and makes you think harder.
Typescript also orders its parameters this way. Between all the Rust and Typescript vs Java and C++ code I've written, I really haven't found that either is better than other, it just seems like a largely arbitrary choice.
Making parsing easier for the compiler is a convincing benefit, would have been nice to see that mentioned in the article. I think that's a substantially stronger reason to prefer types after names. I'm not sure if I parse either faster or slower though.
> and I will absolutely attest to how much mental work is saved by ordering things this way.
As you yourself noted, personal anecdotes are really not an argument. Someone could say they find Java easier to skim than Rust and we'd be nowhere. Like arguing which end of a boiled egg to crack first.
> As other posters have stated, this order makes parsing easier.
Programming languages don't exist to make themselves easier to parse. They exist to make it easier for programmers to program. Otherwise, we wouldn't have things like syntactic sugar. Hell, we would just write in machine code and do away with assembly and higher-level programming languages. And parsing is a simple and superficial one-time step. Being a tad more difficult is not a convincing argument.
> But I also suggest this benefit extends to your own brain's parsing ability as well.
Based on what evidence?
This is the problem with tech evangelism. It has the same problems as religions, lots of claims, no evidence.
> As you yourself noted, personal anecdotes are really not an argument.
Then what is? If you're looking for a randomized sampling of programmers with sufficient sample size, you're not going to find it here.
> Programming languages don't exist to make itself easier to parse.
No, but a fine example is that of C++: the difficulty in parsing means that if you make a typo, the error message you get might be bizarre and confusing. A compiler for a language that's easier to parse will have a much better idea of the programmer's intent and can provide a much better error message. I find it astounding how often rustc can figure out exactly what I wanted to do and suggest it as a note after the error message.
I would think that more-useful error messages pass your test of "make it easier for programmers to program".
While we're talking about making it easier to program, "name: Type" makes it possible to avoid typing out "Type" at all, and letting the compiler infer it (no, this isn't good and readable in all situations, but often it's fine). If you have the "Type name" style and try to add the ability to infer types, you end up with Java's "var" abomination.
Regardless, I'm in agreement: I find "name: Type = blah" much easier to read. I read it as "name is a Type that is equal to blah". This also is an improvement in parameter lists, when they're lined up vertically:
def foo(bar: String,
baz: Int,
quux: Foo)
I find that much easier to mentally parse to determine parameter order than
void foo(String bar,
int baz,
Foo quux)
Worse, imagine that all three parameters were of the same type, requiring a scan to the right to read the names. The important information to me at a glance is the name of the parameter, not its type.
As someone who cut his teeth on C and later Java, much later learning Scala and Rust, I immediately liked the style of the latter two much better. Lately I've been doing a lot of Java and get constantly annoyed at the "backwards" order.
> This is the problem with tech evangelism. It has the same problems as religions, lots of claims, no evidence.
I suppose you could argue that what I've written above is just personal preference, but I see it as a bit stronger than that.
> Then what is? If you're looking for a randomized sampling of programmers with sufficient sample size, you're not going to find it here.
Evidence. Maybe a study showing programmers have a natural preference? Or scientific evidence? Anything more convincing than "Rust evangelist" anecdotes.
> No, but a fine example is that of C++: the difficulty in parsing means that if you make a typo, the error message you get might be bizarre and confusing.
Difficulty parsing? If it didn't parse and found an error, then it means it didn't have any difficulty parsing. That has more to do with the complexity of the language itself than parsing. Parsing is a very simple matter. Or maybe the compiler for one language is better? Also, I thought we were comparing Rust to Java?
> I would think that more-useful error messages pass your test of "make it easier for programmers to program".
It does, but once again all you've done is provide anecdotes without any examples or evidence.
> Regardless, I'm in agreement: I find "name: Type = blah" much easier to read.
I don't. The most important part of "name: Type = blah" is the Type. So it's nice to have it first. But then again, there are people who love dynamic programming languages. So once again personal preferences and personal anecdotes aren't convincing arguments.
> As someone who cut his teeth on C and later Java, much later learning Scala and Rust
Yeah, I too fanboy over new languages I learn. But then I get over it and move on with my life. My guess is you just wrote toy programs in scala and rust and nothing substantive.
> I immediately liked the style of the latter two much better. Lately I've been doing a lot of Java and get constantly annoyed at the "backwards" order.
So then use Rust? Why are you using Java?
> I suppose you could argue that what I've written above is just personal preference, but I see it as a bit stronger than that.
I don't have to argue it. All you've provided is personal preference. "I find "name: Type = blah" much easier to read. " is personal preference. It's no more a convincing argument of anything than you prefering chocolate over vanilla shows that chocolate is better than vanilla.
> Maybe this is nice in theory? But `Int` really isn't an output here, and the value being assigned isn't either. Rather this seems more like `f(i, Int, value) -> assignment`. It seems just as arguable that `f(Int, i, value) -> assignment` is appropriate.
The point is that you want variable declarations and function signatures to be consistent, so you either write
val i : Int
def f(x: Int) : String
Or
Int i
String f(Int x)
And if you do the latter then you have a confusing syntax because the output type comes before the input type, and it's very hard to do lambdas in a way that looks consistent.
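Rust makes the same choice, and the consistency is easy to see in a sketch (mine, not the parent's): declarations, functions, and closures all put inputs before outputs:

fn f(x: i32) -> String { // parameters first, result type last
    x.to_string()
}

fn main() {
    let i: i32 = 42;                              // name first, type second
    let g = |x: i32| -> String { x.to_string() }; // closure mirrors fn syntax
    assert_eq!(f(i), g(i));
}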
> String or int are very rarely appropriate variable names
I very much agree, but that only undermines the point if they are often appropriate type names. In the sorts of languages the GP was trying to restrict that sentence to, I don't think that's the case. I even have some doubts that it's true in C.
I think the author misses the single biggest advantage of `identifier: Type`.
The moment `Type identifier` syntax encounters higher order functions and types, you end up with messes of parenthesis. Figuring out what a type means then involves bouncing back and forth across the type definition.
With `identifier: Type` complex higher order types still parse linearly left to right.
It's enough of a UI issue that people will end up avoiding higher order functions in `Type identifier` languages simply because they're a mess to express.
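A small Rust example of the difference (names mine): the higher-order parameter type reads straight through after the colon, whereas C's declarator for the same thing, `int (*op)(int, int)`, wraps around the name:

fn add(a: i32, b: i32) -> i32 { a + b }

// The whole function type sits after `op:` and reads left to right.
fn apply(op: fn(i32, i32) -> i32, a: i32, b: i32) -> i32 {
    op(a, b)
}

fn main() {
    assert_eq!(apply(add, 2, 3), 5);
}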
Yup; especially with structural typing as in TypeScript, when you don't have aliases for all your type constraints, having identifier:{complex:mess, of:{nested:stuff}} is easier than the other way around.
Language Design: This stuff doesn't matter that much. Focus on more important things.
Syntax isn't unimportant, but don't waste energy on trivial matters like these. Just pick something and people will get used to it. Focus on the semantics of your language - that's what really matters.
Language design affects how good autocomplete and error messages can be. That is hugely important.
Having said that, this article doesn’t advocate “ident: Type”, it advocates ”marker ident: Type”.
That marker is essential for ease of parsing and thus for autocomplete (it won’t try to autocomplete the ‘ident’ part by looking at variables in scope or function names, for example) and error messages (it could signal when name shadowing occurs, for example)
Sure, but there's no arguing to be done there. Is the grammar context free? Ideally, can it be parsed with small constant lookahead? Yes? Cool, no further discussion needed.
I'd be more open to this kind of discussion if there was an ounce of actual research behind what makes syntax more/less readable. As it is, it's just a bunch of people arguing endlessly about their very specific preferences. Just pick something sensible and move on.
I didn't say "syntax doesn't matter, pick any ridiculous thing you want". That's what I mean by "not unimportant", though I admit it's not exactly clear that's what I meant. My point is that within the space of reasonable, comprehensible syntaxes, there are no demonstrable differences worth arguing about.
>there are no demonstrable differences worth arguing about.
That is a big claim. It is very easy to believe (I believed it before, when I knew just 3 or 4 programming languages; now it's more than 12).
But it is clearly false, and that is easy to show:
async/await
go chan
fn sort<T>(of:list<T>...)
try/catch
match
All the above are just small things that have a HUGE impact on how we develop programs. Also, in the matter of "small" stuff that could look insignificant:
[1, 2, 3] + 1 = [2, 3, 4]
this one is a huge deal in certain niches. Also, another "small" and insignificant thing:
SELECT ... FROM source
source SELECT ...
All these are just small things, not at all obvious at the time. Remember how, in the days of GOTO, the idea of more specialized control flow was unthinkable in the minds of many.
Syntax MATTERS MOST, because it is OUR interface. The space for improvement is not super-big, true, but its impact is huge.
Also, when done correctly, it makes the semantics fit like a glove.
Another obvious example: do concurrency without syntax help (just using threads). Or do performant, safe, concurrency-friendly, zero-GC systems programming without what Rust and other languages have bridged.
I also know many languages (which is hardly some grand accomplishment) and it’s my firm opinion that syntax MATTERS LEAST. You spend some time getting used to it and it never really bothers you again. Semantics matter most - syntax is just an interface to the important stuff.
The difference between Python, C++, Haskell, Common Lisp, Prolog, and SQL isn’t syntax. If it was, everyone would pick their favorite syntax and use it all the time. What matters is how well the semantics (and their potential performance implications) match your problem. The syntax just needs to be a decent enough interface to the semantics. Frankly, it seems to me like most of your “counterexamples” are about language semantics, not syntax.
Here’s the thing. Would I like every language to have a consistent, beautifully designed syntax backed by UX research and testing? Absolutely. But language designers have bigger fish to fry. There’s little value in wasting energy talking about syntax once it reaches a basic state of acceptability.
I do amend my statement - you’re right that it’s a big, unsubstantiated claim. There are no _demonstrated_ differences. I haven’t seen an ounce of evidence that it makes a difference beyond familiarity. Furthermore, even if it did, that wouldn’t make it top priority. It would just make arguments about it sensible.
> The difference between Python, C++, Haskell, Common Lisp, Prolog, and SQL isn’t syntax
Ok, let's try: Do SQL without the SQL syntax.
P.S.: I don't think we are that much in disagreement ("The syntax just needs to be a decent enough interface to the semantics"); it's that the claim "syntax doesn't matter" makes it look like syntax is just an irrelevant aspect of the language. How relevant can be argued, but after years in this trade, go to the C++ community (for example) and tell them to change their syntax to Lisp syntax and see how well that succeeds.
Syntax is 100% tied to paradigms, idioms, and such. It is intrinsic to the language we use.
It's worth reiterating the point of my initial comment (which I admit I may not have conveyed well). I never said "syntax doesn't matter", because that's not my point. My point is that it's almost never worth arguing about. Just pick something (or accept what already exists) and move on. Language designers (and you) have more worthwhile things to do.
My issue is with unproductive, endless debates about syntax minutiae like the original post. Syntax doesn't matter enough to be worth it, and such debates devolve into everyone shouting about their personal preferences anyway (see: many of these comments).
> Do SQL without the SQL syntax
I'm not sure what you're saying here. The syntax of SQL is completely arbitrary - I'm sure you could think of a completely different syntax that works just fine. Let me know if I'm missing something, but it seems extremely obvious to me that the biggest difference between C++ and SQL programs isn't how they look - it's how they behave. One wouldn't dream of replacing one with the other and that has nothing to do with their syntaxes.
> go to the C++ community (for example) and tell them to change the syntax to lisp syntax and see how much it will succeed
Obviously it'll fail - good. Even if lisp syntax was way better, they've gotten used to C++ syntax and have much more important things to spend their time on.
There you go: you have the exact semantics of a traditional SQL query (a 1:1 mapping) and only the syntax is different.
Now, one may argue that the syntax is "ugly", less familiar, that the ` are hard to type or whatever, but this is just taste. One simply gets used to it. The expressiveness and semantics are the same as in SQL.
> Syntax is 100% tied to paradigms, idioms, and such. It is intrinsic to the language we use.
I think then we don't have the same definition of syntax.
The way I understand it is that the syntax is just the way to represent these idioms and paradigms visually.
What the parent is saying is that these paradigms and idioms are what is important, but the exact way they are written, not as much (as long as it is within reason).
I was going to talk about the SQL stuff, but I think it would be wasted as long as we stay blind to the fact that syntax IS semantics.
However this:
> but the exact way they are written, not as much (as long as it is within reason)
Then what is "within reason"? Is it more logical to only have GOTO than IF? Is it better to have ELSEIF or nested IFs? What happens if my language says that null is the same as Option.None? What if generics use [] and not <>?
Does whitespace matter: yes or no?
Allow Unicode?
CamelCase, snake_case, or what? What if all constants are lowercase, types mixed-case, and the rest UPPERCASE?
For some, APL syntax makes more sense than ALGOL's.
Talking about why is the point of this kind of discussion.
It is VERY easy to shrug this kind of stuff off. VERY. I WAS in that camp before. But now I am trying to build my own language (relational), and DAMN, it has become much clearer why syntax matters, even "the exact way they are written", because switch this for that and suddenly my language is ANOTHER paradigm (or worse, will be CONFUSED as one).
Naming is one of the hard things in computer science.
---
I understand why it is easy to dismiss this as irrelevant. Sometimes I don't see why some people are so upset about typography and font selection, or why my professional brother complains about framing in photographs. But go and SEE what the DESIGNERS of languages say about this stuff and you will notice that for them even these apparently less significant things matter. You can even win a prize in the field for showing the importance of syntax (http://www.eecg.toronto.edu/~jzhu/csc326/readings/iverson.pd...)!
If that means most will not notice, GREAT! That is the mark of good design.
One thing that I've found myself doing a lot after using SQL for a while is writing SQL to produce SQL. Maybe the syntax could be more oriented towards that, which would not be just a matter of taste.
You'd have to show me some data that one is easier to work with than the other, because fundamentally types _are_ names, and I find them just as expressive as variable names (many of which are just named after types, lets be honest). Even if I didn't, I don't think my brain struggles to read things in either order (or indeed in languages where types are rarely mentioned).
I mean, most programming languages put some sort of punctuation between names. E.g., function calls are punctuated by parens; imports are punctuated by `::` or `.`.
There’s a difference between “things that are implemented in popular programming languages” and “things that are implemented in most programming languages”
Yes but you’re clearly making some sort of argument by authority and I think that works both ways. I just don’t see any reason to hold firm beliefs either way on such a trivial issue unless someone has a citation that I’m missing that proves one form increases comprehension or productivity, or reduces bugs.
I think it also makes more human sense. The parameter name should in some sense be telling you more than just the type. Like "size: Size" is kind of repetitive.
They are not cryptic. They are trying really hard to come up with good syntaxes and semantics actually. I think that modern programming languages tend to have very clean syntaxes.
Another very nice thing about this is that it is much, much easier to parse, because only one kind of thing can go in each position of the phrase. Simplicity in parsing is something that I think is underrated in language design; the harder it is for a computer to parse, the harder it is for a human to parse, and parsing code is 90% of the programmer's work (the other parts being 9% debugging and 1% authoring new code).
I replied elsewhere that I found the opposite to be true. So I suspect that different people will find different styles to be easier/harder to parse.
> the harder it is for a computer to parse, the harder it is for a human to parse,
I don't think this is true -- assembly (or bytecode) is very easy for the computer to parse, but much, much harder for humans to parse. English is much easier for humans to parse, but pretty difficult for computers to parse.
I disagree. The syntax design should flow from the design of the language itself and whether or not you use prefix or postfix notation for type annotations depends heavily on what makes sense within the semantics of the type system.
Granted that the pathological case of the error you warn against is Perl, and that should be enough of a cautionary tale for anyone. But a language is a user interface for programmers, too. Some affordance is merited, especially in a case like this where prefix vs. postfix may affect ease of parsing, but seems most unlikely to influence how the type system actually behaves.
You're right, I just don't care for the author's notes on language design because they're all on syntax design, which is an impossible task to do in morsels without knowing anything about the rest of the language or how it is supposed to work.
I do prefer postfix because I think it flows very nicely "this is-a thing assigned-to that" is nicer than "thing called this assigned-to that."
In terms of the impact on the language, optional postfix annotation makes it a bit trickier if you want to make the identifier optional, and in languages that support it you tend to see special syntax to deal with that case (which breaks the author's fetish for self consistency).
Personally I think ordering of the trio of "alias" "thing" "value" should be consistent across the language, which extends far past variable assignment, and any one of the trio can be left out.
What are examples of language semantics which are better served by pre/postfix type annotations? Also, what exactly do you mean by making the identifier optional?
To me, this is unclear because it is hard to tell where the type ends and the actual function begins.
Last example.
const foo: { [string]: number} = {"hello": 3}
I believe says that foo is an object with string keys and number values.
What does this look like in a Type ident language?
const { [string]: number} foo = {"hello": 3}
I think all of the Type ident examples are more confusing because it's hard to tell where the type ends and the name begins (this is most clear in the first example). This probably makes syntax highlighting worse, parsing more complicated, and is tougher on the user. With ident: Type, it is very clear that the type starts after the ":" and ends before the "=" sign.
Some of the things have been addressed (`extern crate`).
Many of the issues I disagree with: `Buf` is strictly better than `Buffer` (less typing, like `fn`). I have no issue with mixing `CamelCase::snake_methods`, and actually find it to be quite beautiful. The good parts of being Pythonic.
I would like to see the alternatives to turbofish. What exactly is the author suggesting? And what's wrong with `println!` and `format!` ? It isn't articulated.
`[]` misuse is bad, semicolons aren't consistent, `PathBuf` is inconsistently named, etc. Agree. `io::Result`, ...
Maybe there will be some cleanup in a future language edition.
> This means that the vertical offset of names stays consistent
This is also an argument for using keywords of the same length for introducing a variable and a constant. If that’s desirable, it rules out the obvious choices `var` and `const`.
Possibilities include `var` and `val`, which may be too similar-looking, and `var` and `let` - but are people used to (from JavaScript) `let` being mutable? Any other options?
“Let” is mutable in Basic, too, but the part of the population that is used to that is shrinking.
As to short, equal length options for ‘let’ and ‘val’: one could consider using punctuation. Forth uses colons instead of ‘fun’, and I think, in a concise language, one could get used to using, say, ‘!’ for immutable and ‘~’ for mutable. Unfortunately, they aren’t easy to type. An alternative could be to always assume immutability and only use ~ in the rare cases where one needs to mutate.
So, a simple
foo = 3
or, if one wants to simplify parsing:
= foo 3
introduces a new immutable variable, and
~ foo = 3
or
~ foo 3
a mutable one. If we allow leaving out spaces:
~foo 3
that starts to look like using sigils to indicate mutable state. I think that might be a good option in a mostly immutable language.
I think I would use Forth’s colon instead of ‘=‘. That would make ‘=‘ available for equality testing, allowing us to get rid of ‘==‘.
One downside of the 'ident: Type' approach is the extra colon character.
The major downside of the 'Type ident' approach is that if 'Type' is optional, the parser can't be sure whether it's parsing the 'Type' or the 'ident' when encountering the first token. In practice this isn't too hard to solve, however; it can be handled with some backtracking.
In my language, Winter, I have chosen the 'Type ident' approach, mostly due to similarity with C, C++ and Java. I do sometimes wonder if I made the right choice, however. Maybe it could be an option? :)
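A toy sketch of that lookahead difference (made-up token types, not Winter's actual parser): with a leading keyword the first token settles it, while an optional 'Type' needs a second token or backtracking.

enum Tok { Let, Ident, Eq }

// Keyword style: the very first token decides the production.
fn is_decl_keyword_style(toks: &[Tok]) -> bool {
    matches!(toks.first(), Some(Tok::Let))
}

// Type-first style with optional types: two identifiers in a row signal
// `Type ident ...`; a lone identifier starts an assignment instead.
fn is_decl_type_first_style(toks: &[Tok]) -> bool {
    matches!(toks, [Tok::Ident, Tok::Ident, ..])
}

fn main() {
    assert!(is_decl_keyword_style(&[Tok::Let, Tok::Ident, Tok::Eq]));
    assert!(is_decl_type_first_style(&[Tok::Ident, Tok::Ident, Tok::Eq]));
    assert!(!is_decl_type_first_style(&[Tok::Ident, Tok::Eq]));
}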
I’m surprised this didn’t touch on IDE autocomplete suggesting variable names. In Java you would have something like `LocationBuilder locationBuilder`, which lets users just tab-complete the variable name to quickly have access to a variable. The argument in this article was about names being prioritized, and I think removing autocompletion of the variable name would force the developer to be slightly more descriptive than just a re-cased copy of the class name.
In Kotlin, IntelliJ has no problem with this. As you type a new value: `val id`, and you have `IdentName` defined in scope, the value `identName` is suggested automatically.
Not all IDEs are the same, though, and I'm not sure how sophisticated this feature was to implement.
Maybe a little orthogonal -- I could easily imagine IDEs still doing something like transforming the input ": FooType" into "fooType: FooType" w/ the name selected and ready to be tabbed past.
> The ident: Type syntax lets developers focus on the name by placing it ahead of its type annotation.
If this were true, we'd have to conclude that speakers of name-then-honorific languages like Japanese ("Graham-san") are better at remembering and focusing on people's names than speakers of honorific-then-name languages like English ("Mr. Graham.")
The most important result of this design is that the syntax unambiguously determines whether you are referencing the type or value axis, and enables you to split them accordingly. Having worked with Scala and been forced to return to a C-style language, this is probably one of Scala's most overlooked features.
One additional reason why it is beneficial is that you can then naturally extend typing to any expression, not just identifiers. This can help type inference (and can also serve as documentation), which is (IMHO) a must in a modern programming language.
Yeah I'm pretty sure the real reason for this is that it is way easier to parse types if they are after the name. Especially complex ones like functions.
The first point is the most appealing to my brain at least. Type inference is a really useful feature (when paired with a nice IDE) and having a single standardized prefix to declare variables regardless of what type it is can help the mental model. This is especially true with more complex non-obvious types, where you may not know exactly what type you have without the hint from your environment
If consistency is so important, why do we have: function, func, fun, fn, def, etc... depending on the author?
For clarity, use "function", for simplicity use "fn", other forms are just fancy.
Sorry, but this article starts off with an excellent example of why this is horrible:
val x: String = "hello"
String x = "hello"
The first line reads: "value X is of type String and contains hello"
The second line reads: "String x contains hello"
val and : are fluff and add nothing. Arguments about it being tougher to parse would have some merit if this wasn't all figured out almost 50 years ago.