sinatra's comments

Piggybacking on this post. Codex is not only finding much higher quality issues, it’s also writing code that usually doesn’t leave quality issues behind. Claude is much faster but it definitely leaves serious quality issues behind.

So much so that now I rely completely on Codex for code reviews and actual coding. I will pick higher quality over speed every day. Please don’t change it, OpenAI team!


Every plan Opus creates in Planning mode gets run through ChatGPT 5.2. It catches at least 3 or 4 serious issues that Claude didn’t think of. It typically takes 2 or 3 back-and-forths for Claude to ultimately get it right.

I’m in Claude Code so often (x20 Max) and I’m so comfortable with my environment setup with hooks (for guardrails and context) that I haven’t given Codex a serious shot yet.


The same thing can be said about Opus running through Opus.

It's often not that a different model is better (well, it still has to be a good model). It's that the different chat has a different objective - and will identify different things.


My (admittedly one person's anecdotal) experience has been that when I ask Codex and Claude to make a plan/fix and then ask them both to review it, they both agree that Codex's version is better quality. This is on a 140K LOC codebase with an unreasonable amount of time spent on rules (lint, format, commit, etc), on specifying coding patterns, on documenting per workspace README.md, etc.


That's a fair point and yet I deeply believe Codex is better here. After finishing a big task, I used two fresh instances of Claude and Codex to review it. Codex finds more issues in ~9 out of 10 cases.

While I prefer the way Claude speaks and writes code, there is no doubt that whatever Codex does is more thorough.


Every time Claude Code finishes a task, I plan a full review of its own work with a very detailed plan, and it catches many things it didn’t see before. It works well and it’s part of the process of refinement. We all know it’s almost never a 100% hit on the first try for big chunks of generated code.


How exactly do you plan/initiate a review from the terminal? Open up a new shell/instance of Claude and initiate the review with fresh context?


It depends on the task, but I have different Claude commands that have this role; usually I launch them from the same session. The command has the goal of doing an analysis and generating an md file that I can execute with a specific command and the md as a parameter. It works quite well. The generated file is a thorough analysis, hundreds of lines long, with specific coded content. It’s more precise than my few-line prompt and helps Claude stay on rails.


Yeah. It dumps context into various .md files, like TODO.md.


Thanks for the tip. I was dubious, so I tried GPT 5.2 for a start on a large plan, and it was way better than reviewing it with Claude itself or Gemini. I then used it to help me with a feature I was reviewing, and it caught real discrepancies between the plan and the actual integration!


This makes me think: are there any "pair-programming" vibecoding tools that would use two different models and have them check each other?
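
A minimal sketch of what such a cross-check loop could look like, assuming two hypothetical callables, `author` and `reviewer`, that wrap whichever two models you use. Neither name corresponds to a real tool or API, and how each model is actually invoked is deliberately left out.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures (assumptions, not a real API):
# `author` drafts or revises text given a list of reviewer issues,
# `reviewer` returns a list of issues (an empty list means it found none).
Author = Callable[[str, List[str]], str]
Reviewer = Callable[[str], List[str]]


def cross_review(task: str, author: Author, reviewer: Reviewer,
                 max_rounds: int = 3) -> Tuple[str, int, List[str]]:
    """Alternate between one model drafting and a different model reviewing.

    Returns the final plan, the number of review rounds used, and the
    issues the reviewer still reported for that final plan.
    """
    plan = author(task, [])              # first draft, no feedback yet
    issues: List[str] = []
    for round_no in range(1, max_rounds + 1):
        issues = reviewer(plan)          # second model critiques the draft
        if not issues:
            return plan, round_no, []    # reviewer is satisfied
        if round_no < max_rounds:
            plan = author(plan, issues)  # first model revises using the critique
    return plan, max_rounds, issues
```

Capping the rounds matches the "2 or 3 back-and-forths" described above and keeps two models from arguing indefinitely; anything still open after the cap is handed back for a human to judge.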


Have you tried telling Claude not to leave serious quality issues behind?


Let’s call it JoyScript so it still shortens to JS. And so at least the name has some joy in it even if the language doesn’t.


Hah. It can’t be “I need to spend more time to figure out how to use these tools better.” It is always “I’m just smarter than other people and have a higher standard.”


Show us your repos.


My stack is React/Express/Drizzle/Postgres/Node/Tailwind. It's built on Hetzner/AWS, which I terraformed with AI.

It's a private repo, and I won't make it open source just to prove it was written with AI, but I'd be happy to share the prompts. You can also visit the site, if you'd like: https://chipscompo.com/



Spot on.


The tools produce mediocre code that usually works in the most technical sense of the word, and most developers are pretty shit at writing code that doesn't suck (myself included).

I think it's safe to say that people singularly focused on the business value of software are going to produce acceptable slop with AI.


I currently use GPT‑5.1-Codex High and have a workflow that works well with the 5-hour/weekly limits, credits, et al. If I use GPT‑5.1-Codex-Max Medium or GPT‑5.1-Codex-Max High, how will that compare cost / credits / limits wise to GPT‑5.1-Codex High? I don't think that's clear. "Reduced tokens" makes me think it'll be priced similarly / lower. But, "Max" makes me think it'll be priced higher.


In my AGENTS.md (which CLAUDE.md et al soft link to), I instruct them to "On phase completion, explicitly write that you followed these guidelines." This text always shows up on Codex and very rarely on Claude Code (TBF, Claude Code is showing it more often lately).
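
A small sketch of the soft-link setup mentioned above, assuming a repo with a single `AGENTS.md` at its root. The helper name `link_agent_files` and its parameters are made up for illustration; only `CLAUDE.md` and `AGENTS.md` come from the comment itself.

```python
import os
from pathlib import Path
from typing import Iterable, List


def link_agent_files(repo: Path, aliases: Iterable[str] = ("CLAUDE.md",),
                     source: str = "AGENTS.md") -> List[Path]:
    """Create relative symlinks in `repo` that point each alias at `source`."""
    target = repo / source
    if not target.is_file():
        raise FileNotFoundError(f"{target} does not exist")
    created = []
    for name in aliases:
        link = repo / name
        if link.is_symlink() or link.exists():
            continue  # leave existing files or links untouched
        os.symlink(source, link)  # relative target, so the repo can be moved
        created.append(link)
    return created
```

Calling `link_agent_files(Path("."))` from the repo root would create `CLAUDE.md` as a link to `AGENTS.md`, so both tools read the same guidelines. On Windows, creating symlinks may require extra privileges.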


I stopped having the same issue of 100s of tabs of "math videos that I was going to watch one day" when I started saving them in my private playlists. Now I just have 100s of videos in playlists that I just look at longingly but never watch.


lol I tried that once.

What works best for me now is to do my best at putting tabs in the correct group. TBH, most gather while debugging, and then I can just kill the group when I'm done.

Problem is the ADHD, and groups get contaminated. Mostly, a few casualties are actually fine, but sometimes the group gets too mixed. Eventually I nuke it all.


Have you documented how you built this project using Kiro? Your learnings may help us get the best out of Kiro as we experiment with it for our medium+ size projects.


I've got a longer personal blogpost coming soon!

But in the meantime I'm also the author of the "Learn by Playing" guide in the Kiro docs. It goes step by step through using Kiro on this codebase, in the `challenge` branch. You can see how Kiro performs on a series of tasks starting with light things like basic vibe coding to update an HTML page, then slightly deeper things like fixing some bugs that I deliberately left in the code, then even deeper to a full fledged project to add email verification and password reset across client, server, and infrastructure as code. There is also an intro to using hooks, MCP, and steering files to completely customize the behavior of Kiro.

Guide link here: https://kiro.dev/docs/guides/learn-by-playing/


And the d20 rolled a 12 when you checked it for duration to hold? Man, lucky you! Give the dice a kiss!


Your comment seems unfair to me. We can say the exact same thing for the artist / IP creator:

Tough luck, then. You don’t have the right to shit on and harm everyone else just because you’re a greedy asshole who wants all the money and is unwilling to come up with solutions to problems caused by your business model.

Once the IP is on the internet, you can't complain about a human or a machine learning from it. You made your IP available on the internet. Now, you can't stop humanity benefiting from it.


Talk about victim blaming. That’s not how intellectual property or copyright work. You’re conveniently ignoring all the paywalled and pirated content OpenAI trained on.

https://www.legaldive.com/news/Chabon-OpenAI-class-action-co...

Those authors didn’t “make their IP available on the internet”, did they?


First, “Plaintiffs ACCUSE the generative AI company.” Let’s not assume OpenAI is guilty just yet. Second, assuming OpenAI didn’t access the books illegally, my point still remains. If you write a book, can you really complain about a human (or in my humble opinion, a machine) learning from it?


Usually it’s much easier to be liberal when doing so doesn’t cost you meaningfully. I’d encourage you to evaluate for yourself if your stances are truly fair and if you’re truly liberal considering how painful it is for an H1B to lose their job vs you. It’s also easy to say “but H1Bs get exploited!” Considering how many H1Bs come here, maybe they’d rather face this exploitation vs staying in their own country?


In a free market, there's no such thing as a shortage. This isn't a 1980s Soviet grocery store. The market for programmers is not centrally planned. It's one of the least-regulated markets extant today.

So anybody complaining about a "shortage of programmers" is just a cheapskate.

In a free market, what signals to us that more of something should be produced? Bueller? Bueller?


I spent much of 2020 trying to find things like bread in US supermarkets. It's funny how people harken back to Russia 40 years ago as if I was not walking through empty supermarkets four years ago.


There was only a shortage because it wasn't a free market, i.e. nobody wanted to make the dick move of raising the price of bread or toilet paper, because it would cause hardship.

If the prices had been allowed to rise, supply would have equaled demand very quickly, and the shelves would have been stocked as ever. Of course, some people wouldn't have been able to afford them, so we needed some external, non-market mechanism (rationing) to keep prices lower.


There is no such thing in the US as a truly free market.

While rising prices for the toilet paper would have quickly solved the shortage situation it would have elicited the wrath of local and national authorities. And those authorities can make life hell for anyone trying to charge whatever the market will bear.


> There is no such thing in the US as a truly free market.

Just like there are no circles in the US which are exactly 1.234 meters in diameter. Yes, the concept of a "free market" is an ideal, like the concept of a circle is. That doesn't mean that there aren't instantiations of either one which are closer to the ideal than others.

The market for programmers is one of the freest there is. We don't have guilds limiting how many people can be programmers (like, e.g. the American Medical Association does for doctors.) And we don't have unions forcing arbitrary seniority rules, or uniform pay scales.

And government regulation varies from state to state, but most states are "at will" states--you can either quit or be fired at any time for any reason. You don't have to provide any minimum amount of vacation.

The market in programmers is way more free than, say, the market in automobiles or airplanes, where there are all kinds of regulations about safety, etc. But if you can't afford a Ferrari, or a private jet, that doesn't mean there is a Ferrari shortage, or a private jet shortage.

And if you can't afford to pay market wages for programmers, that doesn't mean there is a shortage of programmers either.


Toilet paper ran out because inventories are kept to a bare minimum. Big box stores maintain a one day supply to keep inventory turnover tight. It had nothing to do with manufacturing capacity (Russian example).


Household toilet paper ran out (commercial did not, but it's made for very different dispensers) because the supply chains are hyperspecialized and cannot adapt on any reasonable timetable. It absolutely did have to do with manufacturing capacity (otherwise it would have resolved much more quickly), and a rapid demand realignment of where people were using restrooms. The absence of price gouging laws would not have dealt with the fundamental problem, or even with the hoarding response once the supply problems became visible; it would just have shifted which hoarders cleaned out the stocks to the richest rather than merely the fastest, and would have put a lot more money in the hands of sellers.


I didn't claim that the cause of the problem was with manufacturing capacity.

But if the US didn't have implicit price controls ("just try raising prices 3x at this time of national need, you will regret it" from politicians), the deficits would have resolved in a week. My 2c.


Sure, if there were not price controls, the shelves would have been full of toilet paper.....but a large segment of the population wouldn't have been able to afford to buy it.

I don't know if you've ever been so poor that you couldn't buy toilet paper. But I sure have, and let me tell you, sneaking napkins from starbucks, and getting ink all over your hands from using newspapers goes from being inconvenient to being massively depressing real quick.

What kills me are these "sunshine capitalists" who just loooove the free market when they are making money, but who are the first to cry "shortages!!!" and complain about the market value of engineering talent when it comes to spending money.


Heh, I have grown up not having the toilet paper -- workers paradise, stores carrying mostly the necessities, and luxuries like the toilet paper are only for the few big cities. Using scraps of paper does not kill you. And dental work without Novocaine does not kill you either (although I sure prefer it done with Novocaine now).

But living in this workers paradise I have seen real people suffer from the lack of medication that was available to anyone in the West. The party leadership did not find it necessary (or easy) to produce it locally, so it was only available to those with the right connections. And so on.

I am now a well-off, spoiled American (and the above reads like an O. Henry? story about two rich gentlemen arguing in a restaurant over who had it harder during their youth), but first impressions linger and I will take capitalism over socialism any day. Yes, capitalism has many failings, but replacing the guidance of money with the wise rule of the elite will always lead to a Venezuela-type mismanagement. My 2c.


I did not know you can magically start bread making factories at a whim


The shelves wouldn't be stocked because stores and factories magically appeared; they would be stocked because the price was so high that nobody could afford to hoard bread or toilet paper.


You can if the price is right.


Is this pedantic or pragmatic? If a commodity that is needed/wanted by all people, and that used to be available at a price point X, is now unaffordable for a large percentage of its erstwhile consumers, that is a shortage.

If that commodity satisfies a basic need, its unavailability is just even more fucked up.


Surely, there are many different senses for the word "shortage", so, even if you are pragmatic, it's a good idea to be pedantic as to which one you are using.

When I claim a free market has no shortages, I'm using "shortage" in the sense that demand does not exceed supply. "Demand" and "Supply" are also very carefully defined by economists. It's a theorem that under these definitions, in the condition of a free market, there are no shortages.

The market for programmers is certainly not a completely free market, but it's close enough that if somebody says they can't find any programmers, it means they are not willing or able to pay market wages for programmers.
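
For reference, the sense of "shortage" used here can be written out as a short sketch, assuming the standard textbook setup: a quantity demanded $Q_d(p)$ that falls as the price $p$ rises and a quantity supplied $Q_s(p)$ that rises with it. These symbols are introduced only for illustration and do not appear in the comment above.

```latex
\[
\text{shortage}(p) \;=\; \max\bigl(0,\; Q_d(p) - Q_s(p)\bigr),
\qquad
Q_d(p^{*}) = Q_s(p^{*}) \;\Longrightarrow\; \text{shortage}(p^{*}) = 0 .
\]
```

Under those assumptions, a shortage in this sense can only persist while the price is held below the market-clearing price $p^{*}$, which is the situation the earlier comments about price controls describe.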


>it’s much easier to be liberal when doing so doesn’t cost you meaningfully

I’d go so far as to say that is part and parcel of what a “liberal” almost always is.


That applies across the board, and I suspect is a personality trait independent of political alignment. I've witnessed people on the right who were against handouts or abortion until they were personally impacted.

When there is a real personal cost, a good chunk of people become surprisingly flexible about their politics, or spectacularly fail to resolve the cognitive dissonance and resort to "My circumstances are different."


We all hold beliefs that were never really tested. You never know how strong your principles are until they are tested by circumstances.


I have seen people on the left preach about banning guns until they got into an argument with their neighbor and ran over to my house to borrow a shotgun. I still don't know why I let them have it. I never got it back.

