
My 2 cents:

1. Learn basic NNs at a simple level. Build from scratch (no frameworks) a feed-forward neural network with backpropagation, and train it against MNIST or something similarly simple. Understand every part of it. Just use your favorite programming language.

2. Learn (without having to implement the code, or to understand the finer points of the implementations) how the main NN architectures work and why they work. What is an encoder-decoder? Why does the first part produce an embedding? How does a transformer work? What are the logits in the output of an LLM, and how does sampling work? Why is attention quadratic in the sequence length? What is reinforcement learning? What are ResNets, and how do they work? Basically: you need a solid qualitative understanding of all that.

3. Learn the higher-level layer, both from the POV of open source models (how to interface with llama.cpp / ollama / ..., how to set the context window, what quantization is and how it affects the performance/quality of the output) and of popular provider APIs like DeepSeek, OpenAI, Anthropic, ... and which model is good for what.

4. Learn prompt engineering techniques that influence the quality of the output when using LLMs programmatically (as a bag of algorithms). This takes patience and practice.

5. Learn how to use AI effectively for coding. This is absolutely non-trivial, and a lot of good programmers are terrible LLM users (and end up believing LLMs are not useful for coding).

6. Don't get trapped into the idea that the news of the day (RAG, MCP, ...) is what you should spend all your energy on. This is just some useful technology surrounded by a lot of hype from all the people who want to get rich with AI and understand they can't compete with the LLMs themselves. So they pump the part that can be kinda "productized". Never forget that the product is the neural network itself, for the most part.
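To make point 1 concrete, here is a minimal sketch in plain NumPy: a two-layer network trained with handwritten backpropagation. It uses XOR as a stand-in dataset so the snippet stays self-contained; the structure is the same for MNIST, just swap the data and layer sizes. Hyperparameters (hidden width, learning rate, epochs) are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR (replace with MNIST images/labels for the real exercise).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Parameters of a 2 -> 8 -> 1 network.
W1 = rng.normal(0, 1, (2, 8))
b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for epoch in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)        # hidden activations
    out = sigmoid(h @ W2 + b2)      # predictions

    # Backward pass for a squared-error loss (constants folded into lr).
    d_out = (out - y) * out * (1 - out)     # grad w.r.t. output pre-activation
    d_h = (d_out @ W2.T) * h * (1 - h)      # grad w.r.t. hidden pre-activation

    # Gradient descent step, averaging over the batch.
    W2 -= lr * h.T @ d_out / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X)
    b1 -= lr * d_h.mean(axis=0)

preds = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(preds.ravel())
```

Once every line of something like this makes sense (why the sigmoid derivative appears, why gradients flow backward through `W2.T`), the framework versions stop being magic.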



This, 100%. A full-stack engineer will likely have at least a solid understanding of the HTTP protocol, HTTPS, WebSockets, the interface layer between the frontend server and their chosen web dev stack, and so on. Then a more general understanding of networking protocols, TCP vs UDP, DNS, routing, etc. In general, you need to have a solid understanding of the layer below where you're working, some understanding of the layer below that, and so on, with less and less detail needed for each layer down.

(That's not to say that you shouldn't bother with learning more -- more knowledge is always good -- or that the OP specifically only knows that. It's more a sensible minimum.)

My own "curriculum" for that has been Jeremy Howard's Fast AI course and Sebastian Raschka's book "Build a Large Language Model (From Scratch)". Still working through it, but once I'm done I think I'll be solid on your point 2 above. My guess is that I'll want to learn more, but that's out of interest more than because I think it's necessary.


As someone whom I both respect a lot and know is really knowledgeable about the latest in AI and LLMs: can you clarify one thing for me? Are all these points based on preparing for a future where LLMs are even better? Or do you think they're good enough now that they will transform the way software is built and software engineers work, with just better tooling?

I've tried to keep up with them somewhat, and dabble with Claude Code and have personal subscriptions to Gemini and ChatGPT as well. They're impressive and almost magical, but I can't help but feel they're not quite there yet. My company is making a big AI push, as are so many companies, and it feels like no one wants to be "left behind" when they "really take off". Or is that people think what we have is already enough for the revolution?


I think that LLMs have already changed the way we code, mostly, but I believe that agentic coding (vibe coding) is right now able to produce only bad results, and that the better approach is to use LLMs only to augment the programmer's work (however, it should be noted that I'm all for vibe coding for people who can't code, or who can't find the right motivation; I just believe that excellence in the field is human+LLM). So failing to learn LLMs right now is not yet catastrophic, but it creates a disadvantage, because certain things become more explorable / faster with the help of 200 yet-not-so-smart PhDs in all the human disciplines. Other than that, there is the fact that this is the biggest technology emerging to date, so I can't find a good reason for not learning it.


> Learn how to use AI effectively for coding. This is absolutely non-trivial, and a lot of good programmers are terrible LLM users (and end up believing LLMs are not useful for coding).

I've been asking this on every AI coding thread. Are there good YouTube videos of people using AI on complex codebases? I see tons of "build tic-tac-toe in 5 minutes" type videos, but none on bigger, established codebases.


You may want to check my channel perhaps. There are videos of coding with LLMs doing real world things. Just search for "Salvatore Sanfilippo" on YouTube. The videos about coding+LLM are mostly in English.


IIRC the guy who makes Aider (Paul Gauthier) has some videos along these lines, of him working on Aider while using Aider (how meta).


My problem with 5. is that there are many unknowns, especially when it comes to agents. They have wildly different system prompts that are optimized on a daily basis. I’ve noticed that Gemini 2.5 Pro seems way dumber when used in the Copilot agent, vs me just running all the required context through OpenRouter in Continue.dev. The former doesn’t produce usable iOS tests, while the latter was almost perfect. On the surface, it looks like those should be doing the same thing; but internally, it seems that they are not. And I guess that means I should just use Continue, but they broke something and my workflow doesn’t work anymore.

And people keep saying you need to make a plan first, and then let the agent implement it. Well, I did, and had a few markdown files that described the task well. But Copilot's agent didn't manage to write this Swift code in a way that actually worked: everything was subtly off and wrong, and untangling it would have taken longer than rewriting it.

Is Copilot just bad, and I need to use Claude Code and/or Cursor?


> Is Copilot just bad, and I need to use Claude Code and/or Cursor?

I haven't used Claude Code much, so I cannot really speak to it. But Copilot and Cursor tend to make me waste more time than I get out of them. Aider running locally with a mix-and-match of models depending on the problem (lots of DeepSeek Reasoner/Chat since it's so cheap), and Codex, are both miles ahead of at least Copilot and Cursor.

Also, most of these things seem to run with temperature > 0.0, so doing multiple runs, even better with multiple different models, tends to give you better results. My own homegrown agent runs Aider multiple times with a combination of models and gives me a list of candidates to choose between; then I either merge the best one straight away, or iterate on the best one, sometimes inspired by the others.
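The multi-run idea can be sketched model-agnostically. In the snippet below, `fake_model` and `fake_score` are placeholders I made up for illustration; in a real setup `generate` would call an LLM API or spawn an Aider run, and `score` might run the test suite or ask a judge model to rank the diffs.

```python
import random

def best_of_n(generate, score, prompt, n=5):
    # Sample the same prompt n times; with temperature > 0 each run can
    # differ. Keep the candidate the scoring function rates highest.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Stand-ins for a real model call and a real ranking step.
def fake_model(prompt):
    return f"{prompt} (attempt {random.randint(1, 100)})"

def fake_score(candidate):
    return len(candidate)  # placeholder heuristic

print(best_of_n(fake_model, fake_score, "refactor parser", n=8))
```

The same skeleton works across providers, which is why mixing models per run is cheap to try.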


I never ever use agents for coding. Just the web interface of Gemini, Claude, ... You are perfectly right that agentic coding just adds a layer of indeterminacy and chaos.


Agreed with most of this except the last point. You are never going to make a foundational model, although you may contribute to one. Those foundational models are the product, yes, but if I could use an analogy: foundational models are like the state of the art 3D renderers in games. You still need to build the game. Some 3D renderers are used/licensed for many games.

Even the basic chat UI is a structure built around a foundational model; the model itself has no capability to maintain a chat thread. The model takes context and outputs a response, every time.
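A minimal sketch of that wrapping, with `call_model` as a made-up stand-in for a real API call: the "thread" is just a list of messages that the UI replays in full on every turn, because the model itself remembers nothing between calls.

```python
def call_model(messages):
    # Pretend model: just reports how much context it was handed.
    # A real client would send `messages` to a chat completions API.
    return f"(reply to {len(messages)} messages)"

def chat_turn(history, user_text):
    history.append({"role": "user", "content": user_text})
    reply = call_model(history)   # the FULL history goes in every time
    history.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "You are a helpful assistant."}]
print(chat_turn(history, "hello"))
print(chat_turn(history, "and another question"))
```

Each turn the context grows, which is also why long chats eventually hit the context window: the "memory" is nothing but this replayed list.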

For more complex processes, you need to carefully curate what context to give the model and when. There are many applications where you can say "oh, chatgpt can analyze your business data and tell you how to optimize different processes", but good luck actually doing that. That requires complex prompts and sequences of LLM calls (or other ML models), mixed with well-defined tools that enable the AI to return a useful result.

This forms the basis of AI engineering - which is different from developing AI models - and this is what most software engineers will be doing in the next 5-10 years. This isn't some kind of hype that will die down as soon as the money gets spent, a la crypto. People will create agents that automate many processes, even within software development itself. This kind of utility is a no-brainer for anyone running a business, and hits deeply in consumer markets as well. Much of what OpenAI is currently working on is building agents around their own models to break into consumer markets.

I recommend anyone interested in this to read this book: https://www.amazon.com/AI-Engineering-Building-Applications-...


I agree that instrumenting the model is useful in many contexts, but I don't believe it is unique enough to justify Cursor's valuation, or all the attention RAG, memory, and MCP get. If people say LLMs are going to be commodities (we will see), imagine the layer above: RAG, tool usage, memory...

The progress we are seeing in agents is 99% due to new LLMs being semantically more powerful.


Thanks for this breakdown, I guess I'm largely in the window of points 3-6.

Any suggestion on where to start with point 1? (Also a SWE).


Thank you for sharing. Do you recommend any courses or books for following that path?


For SWEs interested in "AI Engineering" (either getting involved in how models work, or building applications on them), there's a critical paradigm shift in that using "AI" requires more of an experimental mindset than software engineering typically does.

- I strongly recommend Chip Huyen's books ("Designing Machine Learning Systems" and "AI Engineering") and blog (https://huyenchip.com/blog/).

- Andreessen Horowitz' "AI Canon" is a good reference listicle (https://a16z.com/ai-canon/).

- "12 factor agents" (https://github.com/humanlayer/12-factor-agents)



