
As a personal anecdote, I had a fairly involved application that built up a context with a lot of custom prompting and produced a ~1000-word output. I could run the application over and over to inspect the results, and they were fairly reproducible.

I was getting really nice results from the o4-mini model with high thinking. A little while after GPT-5 came out, I revisited the application and tried to pick up where I left off. The o4-mini results were unusable, while the GPT-5 results were similar to what I had before. I'm not sure what happened to the model in the ~4-5 months the project sat idle, but there was real degradation.


