"Agent 1, refactor that method to be more efficient. Agent 5, tighten up the graphics on level 3!"
I'm not sure it's even that. His description of his role in this is:
"You are a Product Manager, and Gas Town is an Idea Compiler. You just make up features, design them, file the implementation plans, and then sling the work around to your polecats and crew. Opus 4.5 can handle any reasonably sized task, so your job is to make tasks for it. That’s it."
And he says he isn't reviewing the code; he lets agents review each other's code, from the look of it. I am interested to see the specs/feature definitions he's giving them, as that seems to be one interesting part of his flow.
Yeah maybe the refactoring was a bad example because it implies looking at the code. It's more like "Agent 1, change the color of this widget. Agent 9, add a red dot whenever there's a new message. Agent 67, send out a marketing email blast advertising our new feature."
Assuming both agents are using the same model, what value could the reviewer agent add for the agent writing the code? It feels like "thinking mode" but with extra steps, and more chance of getting them stuck in a loop trying to overcorrect some small, inane detail.
"I implemented a formula for Jeffrey Emanuel’s “Rule of Five”, which is the observation that if you make an LLM review something five times, with different focus areas each time though, it generates superior outcomes and artifacts. So you can take any workflow, cook it with the Rule of Five, and it will make each step get reviewed 4 times (the implementation counts as the first review)."
And I guess more generally, there is a level of non-determinism in there anyway.
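For what it's worth, the "Rule of Five" as quoted could be sketched roughly like this. This is only a guess at the shape of the loop, not how Gas Town actually implements it: the focus areas, the function names, and the `fake_implement`/`fake_review` stand-ins (which replace real LLM calls) are all made up for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical focus areas. The quote only says each pass has a different
# focus; it does not say what the focus areas are.
FOCUS_AREAS = ("correctness", "edge cases", "readability", "performance")


def rule_of_five(
    task: str,
    implement: Callable[[str], str],
    review: Callable[[str, str], str],
    focus_areas: Sequence[str] = FOCUS_AREAS,
) -> List[str]:
    """Return every version of the artifact, oldest first.

    The implementation counts as pass one, then each focus area adds one
    review pass that revises the previous version.
    """
    versions = [implement(task)]
    for focus in focus_areas:
        versions.append(review(versions[-1], focus))
    return versions


# Stand-ins for LLM calls (assumptions, so the sketch runs offline).
def fake_implement(task: str) -> str:
    return f"draft for: {task}"


def fake_review(artifact: str, focus: str) -> str:
    return f"{artifact} [reviewed: {focus}]"


history = rule_of_five(
    "add a red dot for new messages", fake_implement, fake_review
)
for number, version in enumerate(history, start=1):
    print(number, version)
```

With four focus areas you get five versions in total, which matches "reviewed 4 times (the implementation counts as the first review)". Whether a same-model reviewer adds anything beyond a changed prompt is exactly the open question; the sketch only shows that the different focus per pass is the one thing varying between passes.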
Still the same. "Hey look, I got these crappy developers (LLMs) to actually produce working code! This is a game-changer!" When the working code is a very small, limited thing.
I don't know, you're talking about an incredibly talented engineer saying:
"In the past week, just prompting, and inspecting the code to provide guidance from time to time, in a few hours I did the following four tasks, in hours instead of weeks"
It's up to you to decide how to behave, but I can't see any reason to completely dismiss this. It ends with good guidance on what to do if you can't replicate it, though.
The fact that he's an extremely talented developer actually supports my overall understanding that AI producing code is way overhyped. Sure, a master of his field can get a boost out of it, after spending (unaccounted for) time learning how best to coax good code out of it. Neat?
Any sort of evidence! I see none! It's not really a new thing to have no evidence of productivity gains when it comes to software development tools. Some feel like Vim is a huge productivity boost and some don't. Some believe Rust is amazing, some hate it. It's really hard to measure these things.
Here’s a professional developer who built a product used across the planet daily saying they built a feature for said product, and then asked AI to build the same feature based on the design doc: the AI succeeded and did the same work in minutes: https://antirez.com/news/158
You could always test this yourself. Draw up a design doc for a problem, and implement it. Then ask an AI to do the same thing. Compare your time.
Yeah, and it's not a big focus of the posts, which is interesting. I'd have thought he'd spend a lot more time talking about the workflow he's using, the specs/feature definitions he's writing, and so on.
I would assume there are open source solutions to the problem that it could have been trained on. If so, it would be interesting to see how they influenced what Claude produced here.
> Also some people just find it fun to go through their Anki deck instead of doomscrolling while on the subway or waiting in line. Whether there's any real benefit for that person is debatable. It's “fun” in the same way going to the gym, or drinking kale smoothie is fun.
I'm probably one of those people, but commuting is one of those examples where you have a small (hopefully) amount of relatively low value time, time that is somewhat interrupted. What else of value would you do in it? Maybe listen to a podcast, catch up on blogs. All fine, reasonable choices, but doing a bit of Anki is a reasonable alternative.
The only time I feel like I've wasted those periods is when I end up just scrolling through social media or random videos. Anything else is, I think, a reasonable choice.
I haven't found that at all. I'm well past my twenties but find Anki is one of the things I can fit in, mainly because even with kids and responsibilities you can often find small periods (say 15 mins) of time through the day. It's not enough time to sit down and start into something really complex, especially as the time is sometimes interrupted, but it is enough time to try a few questions.
> If an approach is so boring that you don't do it, then what does it matter how effective it might be?
Yeah true, but an obvious argument is that this is where discipline comes in. If you are one of the people Anki works for, then you have to find the level of discipline required to stick with it.
> Spaced repetition is not meant for conceptual things or skills. It's meant for facts.
> It has little to no relevance in math, physics, and engineering.
That's one bit I disagree with. Engineering is full of facts/concepts, things you often need to know inherently to be able to apply them, or even to know to google them at the right time. So I think SRS can apply there too.
It's doable, I guess. I think it's more productive to learn a concept and be able to derive everything about the concept from first principles instead, though.
Obviously it depends on your memory. In the past I read voraciously and spent a lot of time tinkering. That was good and fun, but I sometimes found I'd forgotten the stuff by the time it would have been useful, particularly when learning about topics I wasn't using day to day. Anki and SRS partially solve that.
It's a trade-off though: I now read less and tinker less. Do I regret that? You bet. But still, Anki/SRS works for me, especially because I often do it at times when I wouldn't be able to effectively read/tinker (perhaps tired, or getting kids to sleep). That's a long way of saying: do what's effective for you, but there's no point in being so dismissive of what others are doing.