The problematic part is this: "I asked each of them to fix the error, specifying that I wanted completed code only, without commentary."
GPT-5 has been trained to adhere to instructions more strictly than GPT-4. It's a known issue that if it's given nonsensical or contradictory instructions, it will produce unreliable results.
A more realistic scenario would have been for him to request a plan or proposal for how the model might fix the problem.
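For example, something along these lines (hypothetical wording, not a quote from the article):

    Here's the compiler error I'm getting. Before writing any code,
    outline a plan for how you would fix it, and flag anything in the
    existing code that makes the fix ambiguous.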
This is done so that application developers whose systems depend upon specific model snapshots don't have to worry about unexpected changes in behaviour.
You can access these snapshots through OpenRouter too, I believe.
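A minimal sketch of what pinning looks like in practice, assuming the OpenAI Python SDK (the snapshot name here is just illustrative; check the current models list for what's actually available):

    # Pin a dated snapshot instead of a floating alias so the model's
    # behaviour doesn't change underneath your application.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o-2024-08-06",  # dated snapshot, not the "gpt-4o" alias
        messages=[{"role": "user", "content": "Fix the error in this code ..."}],
    )
    print(resp.choices[0].message.content)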
Use review bots (CodeRabbit, Sourcery and CodeScene together work for me). This is for my own projects outside of work, of course.

I use Terragon for this. 10 parallel Rust builds would kill my computer. Got a Threadripper on its way though, so superset sounds like something I need to give a go.
Well, apparently 3 years later they did a thing. I asked about it so many times that I didn't even bother to check if they'd added it.

Though I'm not sure they didn't sneak it in as part of an A/B test, because the last time I checked was in October and I'm pretty sure it wasn't there.