norir's comments

I consider LuaJIT a much better choice than bash if both maintainability and long-term stability are valued. It compiles from source in about 5 seconds on a seven-year-old laptop and only uses C99, which I expect to last basically indefinitely.

A precomputed lookup table would be about 1MB covering all of the code points. The lookup code would first compute the code point (and could also do validation) and directly look up the class in the table. The lookup table would not need to be directly embedded in Go code and could just be stored in a binary file. But I'd imagine it could also be put in an array literal in its own file that would never be opened by an IDE if the program needs to be distributed as a single binary.
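
A minimal sketch of the table lookup described above, in Go. It assumes one byte per code point (0x110000 entries, roughly 1.06 MiB). The `Class` values, `buildTable`, and `classify` are hypothetical names, and the table is filled at startup from the `unicode` package only to keep the example self-contained, whereas the comment proposes precomputing it and loading it from a file or array literal.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// Class is a hypothetical one-byte character class; real classes
// depend on the application.
type Class uint8

const (
	ClassOther Class = iota
	ClassLetter
	ClassDigit
	ClassSpace
)

// table holds one byte per code point: 0x110000 entries in total.
var table [unicode.MaxRune + 1]Class

// buildTable fills the table. This stands in for loading a
// precomputed binary file (an assumption made for brevity).
func buildTable() {
	for r := rune(0); r <= unicode.MaxRune; r++ {
		switch {
		case unicode.IsLetter(r):
			table[r] = ClassLetter
		case unicode.IsDigit(r):
			table[r] = ClassDigit
		case unicode.IsSpace(r):
			table[r] = ClassSpace
		}
	}
}

// classify decodes the first code point of s, validates it, and
// returns its class together with the number of bytes consumed.
func classify(s string) (Class, int) {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError && size <= 1 {
		// Empty input (size 0) or invalid UTF-8 (size 1).
		return ClassOther, size
	}
	return table[r], size
}

func main() {
	buildTable()
	for _, s := range []string{"\u00e9", "7", " ", "\xff"} {
		c, n := classify(s)
		fmt.Printf("%q -> class %d, %d byte(s)\n", s, c, n)
	}
}
```

After decoding, the lookup itself is a single array index, so the cost per character is dominated by the UTF-8 decode.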

It is depressing that our collective solution to the problem of excess boilerplate keeps moving towards auto-generation of it.

I personally find autocomplete to be detrimental to my workflow so I disagree that it is a universal productivity improvement.

Have you ever worried that by programming in this way, you are methodically giving Anthropic all the information it needs to copy your product? If there is any real value in what you are doing, what is to stop Anthropic or OpenAI or whomever from essentially one-shotting Zed? What happens when the model providers 10x their costs and also use the information you've so enthusiastically given them to clone your product and use the money that you paid them to squash you?


Zed's entire code base is already open source, so Anthropic has a much more straightforward way to see our code:

https://github.com/zed-industries/zed


That's what things like AWS Bedrock are for.

Are you worried about Microsoft stealing your codebase from GitHub?


Isn’t it widely assumed Microsoft used private repos for LLM training?

And even with a narrower definition of stealing, Microsoft’s ability to share your code with US government agencies is a common and very legitimate worry in plenty of threat model scenarios.


Ha, I did not see your post before making mine. You are correct in your assessment of the blame.

Moreover, I view optimization as an anti-pattern in general, especially for a low level language. It is better to directly write the optimal solution and not be dependent on the compiler. If there is a real hotspot that you have identified through profiling and you don't know how to optimize it, then you can run the hotspot through an optimizing compiler and copy what it does.


To a large extent, this problem is primarily due to slow compilation. It is possible to write a direct-to-machine-code compiler that compiles at greater than one million lines per second. That is more code than I am likely to write in my lifetime. A fast compiler with no need for incremental compilation is a superior default and can always be adapted to add incrementalism when truly needed.


Yes, this is fine for basic exploration but, in the long run, I think LLVM taketh at least as much as it giveth. The proliferation of LLVM has created the perception that writing machine code is an extremely difficult endeavor that should not be pursued by mere mortals. In truth, you can get going writing x86_64 assembly in a day. With a few weeks of effort, it is possible to emit all of the basic x86_64 instructions. I have heard aarch64 is even easier but I only have experience with x86_64.
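
To give a sense of what emitting basic x86_64 instructions involves, here is a small sketch in Go. The `Emitter` type and its methods are hypothetical. The encodings used (`B8` for `mov eax, imm32`, `05` for `add eax, imm32`, `C3` for `ret`) are the standard ones, but only these three instructions are covered.

```go
package main

import "fmt"

// Emitter is a hypothetical append-only buffer of x86_64 machine code.
type Emitter struct {
	Code []byte
}

// imm32 appends v in little-endian byte order, as x86 immediates require.
func (e *Emitter) imm32(v uint32) {
	e.Code = append(e.Code, byte(v), byte(v>>8), byte(v>>16), byte(v>>24))
}

// MovEAX emits `mov eax, imm32`: opcode B8 followed by the immediate.
func (e *Emitter) MovEAX(v uint32) {
	e.Code = append(e.Code, 0xB8)
	e.imm32(v)
}

// AddEAX emits `add eax, imm32`: opcode 05 followed by the immediate.
func (e *Emitter) AddEAX(v uint32) {
	e.Code = append(e.Code, 0x05)
	e.imm32(v)
}

// Ret emits a near `ret`: opcode C3.
func (e *Emitter) Ret() {
	e.Code = append(e.Code, 0xC3)
}

func main() {
	var e Emitter
	e.MovEAX(40)
	e.AddEAX(2)
	e.Ret()
	fmt.Printf("% x\n", e.Code) // prints "b8 28 00 00 00 05 02 00 00 00 c3"
}
```

The eleven bytes form a complete function that returns 42 in `eax`. Actually running them requires copying them into executable memory or writing an object file, which this sketch leaves out.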

What you then realize is that it is possible to generate quality machine code much faster than LLVM and using far fewer resources. I believe both that LLVM has been holding back compiler evolution and that it is close to if not already at peak popularity. As LLMs improve, the need for tighter feedback loops will necessitate moving off the bloat of LLVM. Moreover, for all of the magic of LLVM's optimization passes, it does very little to prevent the user from writing incorrect code. I believe we will demand more from a compiler backend than LLVM can ever deliver.

The main selling point of LLVM is that you gain access to all of the targets, but this is for me a weak point in its favor. Firstly, one can write a quality self-hosting compiler with O(20) instructions. Adding new backends should be trivial. Moreover, the more you are thinking about cross-platform portability, the more you are worrying about hypothetical problems as well as the problems of people other than yourself. Get your compiler working well first on your machine and then worry about other machines.


I agree. I've found that, for the languages I'm interested in compiling (strict functional languages), a custom backend is desirable simply because LLVM isn't well suited for various things you might like to do when compiling functional programming languages (particularly related to custom register conventions, split stacks, etc.).

I'm particularly fond of the organisation of the OCaml compiler: it doesn't really follow a classical separation of concerns, but emits good quality code. E.g. its instruction selection is just pattern matching expressed in the language; various liveness properties of the target instructions are expressed for the virtual IR (as they know which one-to-one instruction mapping they'll use later, as opposed to doing register allocation strictly after instruction selection); garbage collection checks are threaded in after the fact (calls to caml_call_gc); and its register allocator is a simple variant of Chow et al.'s priority graph colouring (expressed rather tersely: ~223 lines, ignoring the related infrastructure for spilling, restoring, etc.).

--

As a huge aside, I believe the hobby compiler space could benefit from someone implementing a syntactic subset of LLVM, capable of compiling real programs. You'd get test suites for free and the option to switch to stock LLVM if desired. Projects like Hare are probably a good fit for such an idea: you could switch out the backend for stock LLVM if you want.


>Adding new backends should be trivial.

Sounds like famous last words :-P

And I don't really know about faster once you start to handle all the edge cases that invariably crop up.

Case in point: gcc


It's the classic pattern where you redefine the task as only 80% of the original.


That is why the Hare language uses QBE instead: https://c9x.me/compile/

Sure it can't do all the optimizations LLVM can but it is radically simpler and easier to use.


Hare is a very pleasant language to use, and I like the way the code looks vs something like Zig. I also like that it uses QBE for the reasons they explained.

That said, I suspect it’ll never be more than a small niche if it doesn’t target Mac and Windows.


If only it were just about emitting byte code to a file and then calling the linker... You also have the problem of debug information, optimizer passes, the number of tests required to prove the output byte code is valid, etc.


I suspect this is wrong. If you are correct, that implies to me that LLMs are not intelligent and just are exceptionally well tuned to echo back their training data. It makes no sense to me that a superior intelligence would be unable to trivially learn a new language syntax and apply its semantic knowledge to the new syntax. So I believe that either LLMs will improve to the point that they will easily pick up a new language or we will realize that LLMs themselves are the dead end.


> If you are correct, that implies to me that LLMs are not intelligent and just are exceptionally well tuned to echo back their training data.

Yes.

This is exactly how LLMs work. For a given input, an LLM will output a non-deterministic response that approximates its training data.

LLMs aren’t intelligent. And it isn’t that they don’t learn, they literally cannot learn from their experience in real time.


There is some intelligence. It can figure stuff out and solve problems. It isn't copy-paste. But I agree with your point. They are not intelligent enough to learn during inference, which is the main point here.


> superior intelligence

You are talking about the future. But if we are talking about the future, the bitter lesson applies even more so. The superintelligence doesn't need a special programming language to be more productive. It can use Python for everything and write bug-free, correct code fast.


I don't think your ultimatum holds. Even assuming LLMs are capable of learning beyond their training data, that just leads back to the purpose of practice in education. Even if you provided a full, unambiguous language spec to a model, and the model were capable of intelligently understanding it, should you expect its performance with your new language to match the petabytes of Python "practice" a model comes with?


Further to this, you can trivially observe two further LLM weaknesses: 1. LLMs are bad at weird syntax even with a complete description, e.g. writing Standard ML and similar languages, or any esolangs. 2. Even with lots of training data, LLMs cannot generalise their output to a shape that doesn’t resemble their training, e.g. ask the LLM to write any nontrivial assembler code like an OS bootstrap.

LLMs aren’t a “superior intelligence” because every abstract concept they “learn” is learned emergently. They understand programming concepts within the scope of languages and tasks that easily map back to those things, and due to finite quantisation they can’t generalise those concepts from first principles. I.e. an LLM can map Python to programming concepts, but it can’t map programming concepts to an esoteric language with any amount of reliability. Try doing some prompting and this becomes agonisingly apparent!


This is weirdly confident and defensive at the same time. If C++ is so great, why does it need to be so ardently defended and its very obvious problems handwaved away?

I take a very different view about the trajectory of languages given the current trends in software development. The more people rely upon agentic coding processes, the more they will demand faster compilation, which will increasingly become a significant bottleneck on product velocity. The faster the LLMs get, the more important it is for the tools they use to be fast. Right now, I still think we are in an uncanny valley where LLMs are still slow enough that slow tooling does not seem that bad, but this is likely to change. People will no longer be satisfied asking their agent to make a change and coming back in a minute or an hour. They will expect the result nearly instantaneously. C++ (and Rust) compile times are too slow for the agent to iterate in the human reaction window, so I believe that one of two things will happen over the next few years: LLM progress will stall out or C++ and Rust will nosedive in popularity.


> If [language] is so great, why does it need to be so ardently defended and its very obvious problems handwaved away?

Every language popular enough is like that.

