I use Claude. It's really good, but you should try to use it as Boris suggests. The other thing I do is give it very careful and precisely worded specs for what I want it to do. I have the habit, born from long experience, of never assuming that junior programmers will know what you want the program to do unless you make it explicit. Claude is the same. LLM code generators are terrific, but they can't second-guess unclear communication.
Using carefully written specs, I've found Claude will produce flawless code for quite complex problems. It's magic.
LLMs are parameter-based representations of linguistic representations of the world. Relative to robot predictive-control problems, they are low-dimensional and static. They are batch-trained using supervised learning and are not designed to manage real-time shifts in the external world or the reward space. They work because they operate in abstract, rule-governed spaces like language and mathematics. They are ill-suited to predictive control tasks. They are the IBM 360s of AI. Even so, they are astonishing achievements.
LeCun is right to say that continuous self-supervised (hierarchical) learning is the next frontier, and that means we need world models. I'm not sure that JEPA is the right tool to get us past that frontier, but at the moment there are not a lot of alternatives on the table.
See, I don't get why people say that the world is somehow more complex than the world of mathematics. I think that is because people don't really understand what mathematics is. A computer game, for example, is pure mathematics, minus the players, but the players can also be modelled just by their observed digital inputs/outputs.
So the world of mathematics is really the only world model we need. If we can build a self-supervised entity for that world, we can also deal with the real world.
Now, you may have an argument by saying that the "real" world is simpler and more constrained than the mathematical world, and therefore if we focus on what we can do in the real world, we might make progress quicker. That argument I might buy.
> So the world of mathematics is really the only world model we need. If we can build a self-supervised entity for that world, we can also deal with the real world.
In theory I think you are kind of right, in that you can model a lot of real-world behaviour using maths, but it's an extremely inefficient lens to view much of the world through.
Consider something like playing catch on a windy day. If you wanted to model that mathematically, there is a lot going on: you've got the ball interacting with gravity, the fluid dynamics of the ball moving through the air, the changing wind conditions, etc. Yet this is a very basic task that many humans can do without really thinking about it.
Put more succinctly, there are many things we'd think of as very basic which need very complex maths to approach.
This view of simulation is just wrong and does not correspond at all to human perception.
Firstly, games aren't mathematics. They are low-quality models of physics. Mathematics cannot say what will happen in reality; it can only describe a model and say what happens in the model. Mathematics alone cannot say anything about the real world, so a world model that just does mathematics cannot say anything about the world either.
Secondly, and far worse for your premise, humans do not need these mathematical models. I do not need to understand the extremely complex mechanical problem of opening a door in order to open one. A world model which tries to understand the world based on mathematics has to. This makes any world model based on mathematics strictly inferior and totally unsuited to the goals.
The world of mathematics is only a language. The (Platonic) concepts go from simple to very complex, but at the base stands a (dynamic and evolving) language.
The real world, however, is far more complex. It may be rooted in a universal language too, but in one we don't know (yet) and ultimately try to describe and order through all scientific endeavors combined.
This philosophy is an attempt to point out that you can create worlds from mathematics, but we are far from describing or simulating ‘Our World’ (Platonic concept) in mathematics.
Majored in Philosophy. Started programming in 1973 on mainframes. Became a full-time developer and systems analyst. 72 years old now, with 50 years' experience in IT. Co-founded a couple of start-ups, made a little bit of money. Went back to corporate life for a while. Ended up as a Program Architect at Salesforce. Resigned to start a company which develops and delivers commercial LLM/RAG solutions. Going reasonably well. Simple principles: keep learning, do what you want to do, not just what the man tells you. I saw a note from another Philosophy grad saying that Philosophy is actually useful in that it gives you a framework and a perspective to look at things a little differently. I agree with that.
Could you give an example of how it helps you look at things a little differently?
I found when I talked to my old roommate (he has a bachelor's and a master's degree in philosophy) that my programming experience helped a lot in talking about philosophy.
I've spent a lot of my career doing various types of solution design. One of the insights I gained from thinking a lot about representation, intentionality and the philosophy of language is that the way you represent a problem has a big influence on how easy you will find it to solve the problem. I've found that helps with solution design. Don't just think about the problem. Think about what is the best way to represent the problem.
Also a philosophy grad. Good philosophy programs force you to practice aggressively thinking logically, to the point of teaching symbolic logic which is basically coding.
I'm in the process of actually building LLM based apps at the moment, and Martin Fowler's comments are on the money. The fact is that seemingly insignificant changes to prompts can yield dramatically different outcomes, and the odd new outcomes have all these unpredictable downstream impacts. After working with deterministic systems most of my career it requires a different mindset.
It's also a huge barrier to adoption by mainstream businesses, which are used to working to unambiguous business rules. If it's tricky for us developers it's even more frustrating to end users. Very often they end up just saying, f* it, this is too hard.
I also use LLMs to write code and for that they are a huge productivity boon. Just remember to test! But I'm noticing that use of LLMs in mainstream business applications lags the hype quite a bit. They are touted as panaceas, but like any IT technology they are tricky to implement. People always underestimate the effort necessary to get a real return, even with deterministic apps. With nondeterministic apps it's an even bigger problem.
Some failure modes can be annoying to test for. For example, if you exceed the model's context window, nothing will happen in terms of errors or exceptions, but the observable performance on the task will tank.
Counting tokens is the only reliable defence I've found against this.
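For illustration, a minimal sketch of that defence in Python using the tiktoken library. The token budget, model name and messages below are placeholders I've invented, and the count ignores per-message formatting overhead, so treat it as approximate.

    import tiktoken

    MAX_CONTEXT_TOKENS = 8000  # illustrative budget, not any model's real limit

    def fits_in_context(messages, model="gpt-4o"):
        # tiktoken knows the encodings for OpenAI model names;
        # for other models this is only an approximation
        enc = tiktoken.encoding_for_model(model)
        # rough count: tokens in each message's content, ignoring the
        # small per-message formatting overhead
        total = sum(len(enc.encode(m["content"])) for m in messages)
        return total <= MAX_CONTEXT_TOKENS

    history = [{"role": "user", "content": "earlier turn..."}]
    new_message = {"role": "user", "content": "latest question"}

    if not fits_in_context(history + [new_message]):
        # truncate, summarise, or refuse before the call, rather than
        # letting the endpoint silently degrade or hard-fail
        history = history[-10:]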
If you exceed the context window the remote LLM endpoint will throw you an error, which you probably want to catch, or rather you want to catch that before it happens and deal with it. Either way, it's not usually a silent error that goes unnoticed. What makes you think that?
> If you exceed the context window the remote LLM endpoint will throw you an error which you probably want to catch
Not every endpoint works the same way. I'm pretty sure LM Studio's OpenAI-compatible endpoints will silently (from the client's perspective) truncate the context, rather than throw an error. It's up to the client to make sure the context fits in those cases.
OpenAI's own endpoints do return an error and refuse the request if you exceed the context length, though. I think I've seen others use the "finish_reason" attribute to signal that the context length was exceeded, rather than setting an error status code on the response.
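For what it's worth, a minimal sketch of guarding against that, assuming an OpenAI-compatible chat completions endpoint: check finish_reason on the returned choice and treat "length" as a truncated, incomplete answer rather than a normal completion. The base_url and model name below are placeholders.

    from openai import OpenAI

    # any OpenAI-compatible server; base_url and model name are placeholders
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

    resp = client.chat.completions.create(
        model="some-local-model",
        messages=[{"role": "user", "content": "Summarise this long document ..."}],
    )

    choice = resp.choices[0]
    if choice.finish_reason == "length":
        # the model stopped because it hit a token limit; the text that came
        # back is truncated, so don't treat it as a complete answer
        raise RuntimeError("output truncated: context or max_tokens exceeded")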
Overall, even "OpenAI-compatible" endpoints often aren't 100% faithful reproductions of the OpenAI endpoints, sadly.
That seems like terrible API design to just truncate without telling the caller. Anthropic, Google and OpenAI all will fail very loudly if you exceed the context window, and that's how it should be. But fair enough, this shouldn't happen anyway and the context should be actively handled before it blows up either way.
> That seems like terrible API design to just truncate without telling the caller
Agree, confused me a lot the first time I encountered it.
It would be great if implementations/endpoints could converge, but with OpenAI moving to the Responses API while the rest of the ecosystem seemingly still implements ChatCompletion with various small differences (like how to do structured outputs), it feels like it's getting further away, not closer...
It's complicated. For example, some models (o3) will throw an error if you set temperature.
What do you do if you want to support multiple models in your LLM gateway? Do you throw an error if a user sets temperature for o3, thus dumping the problem on them? Or do you just ignore it, potentially creating confusion because temperature will seem not to work for some models?
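For illustration, a minimal sketch of one middle path a gateway could take: strip the unsupported parameter but log it loudly, so the caller isn't silently confused. The model list and function names here are my own assumptions, not an authoritative capability table.

    import logging

    # illustrative: models assumed to reject a sampling temperature
    MODELS_WITHOUT_TEMPERATURE = {"o3", "o3-mini", "o1"}

    def prepare_request(model: str, params: dict) -> dict:
        """Drop parameters the target model rejects, but record the fact."""
        cleaned = dict(params)
        if model in MODELS_WITHOUT_TEMPERATURE and "temperature" in cleaned:
            logging.warning(
                "Model %s does not accept 'temperature'; dropping value %s",
                model, cleaned.pop("temperature"),
            )
        return cleaned

    # usage: the warning lands in the gateway logs and, ideally, is also
    # surfaced back to the caller in a response header or metadata field
    request_params = prepare_request("o3", {"temperature": 0.2, "max_tokens": 512})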
Me too, and I'm always battling the LLM's obsession with lazily writing reams of ridiculously defensive code and masking errors in the code it generates and calls, instead of finding the root cause and solving that.
(Yes, I'm referring to the code LLMs generate, not the API for generating code itself, but "fail early and spectacularly" should apply to all code and APIs.)
But you have to draw the line at failures that happen in the real world, or in code you can't control. I'm a huge fan of Dave Ackley's "Robust First" computing architecture, and his Moveable Feast Machine.
His "Robust First" philosophy is extremely relevant and has a lot of applications to programming with LLMs, not just hardware design.
Robust First | A conversation with Dave Ackley (T2 Tile Project) | Functionally Imperative Podcast
DonHopkins, Oct 26, 2017, on: Cryptography with Cellular Automata (1985) [pdf]:
A "Moveable Feast Machine" is a "Robust First" asynchronous distributed fault tolerant cellular-automata-like computer architecture. It's similar to a Cellular Automata, but it different in several important ways, for the sake of "Robust First Computing". These differences give some insight into what CA really are, and what their limitations are.
Cellular automata are synchronous and deterministic, and can only modify the current cell: all cells are evaluated at once (so the evaluation order doesn't matter), which makes it necessary to double-buffer the "before" and "after" cells, and the rule can only change the value of the current (center) cell. Moveable Feast Machines are like asynchronous, non-deterministic cellular automata with large windows that can modify adjacent cells.
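To make the contrast concrete, here is a toy Python sketch (my illustration, with placeholder rules, not Ackley's actual machinery): a synchronous, double-buffered CA step next to an asynchronous update that fires random sites in place and may write to a neighbour.

    import random

    SIZE = 16
    grid = [[random.randint(0, 1) for _ in range(SIZE)] for _ in range(SIZE)]

    def ca_step(g):
        # synchronous CA: every cell is evaluated against the old grid,
        # so we need a second buffer; each rule writes only its own cell
        new = [[0] * SIZE for _ in range(SIZE)]
        for y in range(SIZE):
            for x in range(SIZE):
                neighbours = sum(
                    g[(y + dy) % SIZE][(x + dx) % SIZE]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0)
                )
                new[y][x] = 1 if neighbours in (2, 3) else 0  # placeholder rule
        return new

    def mfm_like_step(g, events=256):
        # MFM-flavoured update: sites fire asynchronously in random order,
        # in place, and an event may rewrite a neighbouring cell too
        for _ in range(events):
            y, x = random.randrange(SIZE), random.randrange(SIZE)
            dy, dx = random.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
            ny, nx = (y + dy) % SIZE, (x + dx) % SIZE
            g[ny][nx] = g[y][x]  # placeholder "diffusion" rule
        return g

    grid_sync = ca_step(grid)         # whole lattice advances in lockstep
    grid_async = mfm_like_step(grid)  # sites fire one at a time, in place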
Here's a great example with an amazing demo and explanation, and some stuff I posted about it earlier:
Interesting, the completion return object is documented but there's no error or exception field. In practice the only errors I've seen so far have been at the HTTP transport layer.
It would make sense to me for the chat context to raise an exception. Maybe I should read the docs further…
I knew a tech founder once who spent an hour lecturing us (his employees) on business ethics. He'd even written a little red book (like Mao) to codify his thoughts on how we should all behave.
Fast forward a few years and the guy flees the US after being charged with securities fraud. Spends the rest of his life living on his millions in a foreign country with no US extradition treaty.
Look, I love LLMs and even implement them for customers, but I am very sceptical about them 'replacing' ERP and CRP systems. What some AI folks don't seem to understand is that traditional ERP and CRP apps are completely driven by auditable business rules because they have to be. If you're running a company, there's no discretion at all about how money and other assets and liabilities are accounted for. It all has to be strictly according to the rules. This goes for most everything else: management are responsible for the business rules implemented in the system, and they need to be precisely spelled out. Sure, AI can and should be used extensively for the human UI piece of it, to simplify getting data into and out of the system, for example. But the engine inside and the database are all strictly rule-governed, and I definitely don't expect that to change anytime soon.
I'm surprised not to see anything about David Cope, who is the real master of AI-produced music. Cope works mostly within the classical tradition, using AI to produce work in the style of great composers like Bach, Mozart, etc. To my ear, it lacks the brilliance of the originals, but some of it is quite good. Certainly several orders of magnitude better than Suno's AI slop.
Gettier cases tell us something interesting about truth and knowledge: a factual claim should depict the event that was the effective cause of the claim being made. Depiction is a picturing relationship: a correspondence between the words and a possible event (e.g. a cow in a field). Knowledge is when the depicted event was the effective cause of the belief. Since the papier-mâché cow was the cause of the belief, not a real cow, our intuitions tell us this is not normal knowledge. Therefore, true statements must have both a causal and a depictional relationship with something in the world. Put another way, true statements implicitly describe a part of their own causal history.
Mathematicians already explored exactly what you describe: this is the difference between classical logic and intuitionistic logic:
In classical logic, statements can be true in and of themselves even if there is no proof of them, but in intuitionistic logic, statements are true only if there is a proof of them: the proof is the cause of the statement being true.
In intuitionistic logic, things are not as simple as "either there is a cow in the field, or there is none" because, as you said, for the knowledge "a cow is in the field" to be true, you need a proof of it. It brings in lots of nuance; for example, "there isn't no cow in the field" is weaker knowledge than "there is a cow in the field".
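As a small illustration of that asymmetry (my sketch, in Lean, not part of the original comment): the direction from P to not-not-P goes through constructively, while recovering P from not-not-P requires reaching for a classical axiom.

    -- Constructively fine: from a proof of P we can refute ¬P
    theorem p_implies_nnp (P : Prop) (hp : P) : ¬¬P :=
      fun hnp => hnp hp

    -- The converse is not provable intuitionistically;
    -- here we have to reach for a classical axiom
    theorem nnp_implies_p (P : Prop) (h : ¬¬P) : P :=
      Classical.byContradiction h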
It is a fascinating topic. I spent a few hours on it once. I vaguely remember that the logic is very configurable and you have a lot of choices, like whether you keep the law of excluded middle or not, and things like that, depending on your taste or problem. I might be wrong; it was 8 years ago and I spent a couple of weeks reading about it.
Also no surprise the rabbit hole came from Haskell, where those types (huh) are attracted to this more foundational theory of computation.
I think you are asking the wrong question. Before you decide what to bring in-house, you need an IT strategy that sets out which components of your desired solution should be built and which should be bought. You need to have a view about which technologies you will use, and why. What is your end-state IT architecture? Have you prioritised your objectives and requirements?
I understand your current systems are holding you back and there may be no appetite to get a bunch of expensive consultants in to answer all these questions. But just hiring a bunch of devs without a plan for what you want them to do is a recipe for disaster.
I know Orwell's work pretty well, and I read that sentence and thought to myself: "Can't remember where he said anything like that, but what the hell, I haven't read everything Orwell ever wrote". So I just rolled with it.
The cognitive load to fact check everything is too great, so we decide which sources we think are reliable and just accept them. The solution is not to disbelieve everything you are told, but to accept that some of the facts you have not checked might be wrong, and be prepared to re-evaluate when contrary evidence appears.