I've been underwhelmed by dedicated tools like Windsurf and Cursor; they are usually more annoying than just using ChatGPT. They have their niche, but they are so incredibly flow-destroying that it's hard to use them for long periods of time.
I just started using Codex casually a few days ago and already have 3 PRs. While different tools for different purposes make sense, Codex's fully async nature is so much nicer. It handles simple things like improving consistency and making small improvements quite well, which is really nice. Finally we have something that operates more like an appliance for a certain class of problems. Previously it felt more like a teenager with a learner's license.
Have you tried Claude Code? I'm surprised it's not in this analysis, but in my personal experience the competition doesn't even touch it. I've tried them all in earnest. My toolkit has been (neo)vim and tmux for at least a decade now, so I understand the apprehension from less terminal-inclined folks who prefer other stuff, but it's my jam and it just crushes it.
Right, after the Sonnet 4 release it was the first time I could tell an agent something and just let it run comfortably. As for the tool itself, I think a large part of its ability comes from how it writes recursive todo-lists for itself, which are shown to the user, so you can intervene early on the occasions it goes full Monkey's Paw.
OpenAI nailed the UX/DX with Codex. This completely obsoletes Cursor and similar IDEs. I don't need AI in my tools; I just need somebody to work on my code in parallel with me. I'm happy to interact via pull requests and branches.
I found out on Thursday that I have access to Codex with my Plus subscription. I've created and merged about a dozen PRs with it on my OSS projects since then. It's not flawless, but it's pretty good. I've done some tedious work I had been deferring, got it to complete a few FIXMEs I hadn't gotten around to fixing, had it write some API documentation, got it to update a README, etc. It's pretty easy to review the PRs.
What I like is that it creates and works on its own branch. I can actually check that branch out, fix a few things myself, push it, and then get it to do PRs against that branch. I had to fix a few small compilation issues; in one case, the fix was just removing a single import that it somehow got wrong, after which everything built and the tests passed. Overall it's pretty impressive. Very usable.
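The branch round-trip described above can be sketched roughly like this. The branch name `codex/fix-imports` is hypothetical for illustration; Codex picks its own branch names:

```shell
# Hypothetical branch name; Codex names its branches itself.
git fetch origin
git checkout codex/fix-imports        # check out the agent's branch locally

# ...fix things by hand, e.g. remove the bad import...
git commit -am "Remove incorrect import"
git push origin codex/fix-imports     # push the manual fix back

# Then ask Codex for follow-up PRs targeting this branch,
# so its next round of work builds on top of your fixes.
```

The nice part of this loop is that review and repair happen on the same branch, so the agent's follow-up work incorporates your corrections instead of diverging from them.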
I wonder how it performs on larger code bases. I expect some issues there. I'm going to give that a try next.
I think there are basically three kinds of uses for AI: 1) "Out of loop" - e.g. Codex - it does things while you work on something else. Today it can handle basic things on its own like an appliance. 2) "In the loop" - e.g. Windsurf / Cursor. Here, you know what you are doing but are trying to use AI to essentially type at super human speeds. 3) "Coach mode" - you need to learn something in order to progress. You are using ChatGPT (usually), but possibly other tools as a way to help you get the right context faster.
Of these, "in the loop" seems to be the one that doesn't work that well (yet). The main problem, in my opinion, is latency.
"In the loop" is not really a problem I have. I use IntelliJ, so I'm usually not limited by my ability to type fast; I don't actually type a lot of code.
Building a better auto-complete than the one that already comes with the IDE is actually hard, and most of the AI code-completion approaches I've seen conflict with the built-in auto-complete without actually doing better. I've tried a few and usually end up disabling the auto-complete features they offer because they are pointless for me. What happens is that I get a lot of suggestions for code I definitely don't want, drowning out the completions I do want and messing up my editing flow. On top of that, I have to constantly read through code that is a combination of not what I'm looking for and probably wrong. It's extra work I don't need in my life; a bit of an anti-feature as far as I'm concerned.
But I actually have been using ChatGPT quite a bit. It works for me because it connects to the IDE (instead of interfering with it) and lets me easily ask questions about my code. This is much more useful to me than an AI second-guessing me on every keystroke.
Codex adds to this by being more like a teammate I can delegate simple things to. It would be nice if it could notify me when it's done or when it needs my input, but otherwise it's nice.
I suspect the Codex and ChatGPT desktop UIs will merge soon. There's no good reason to have two modalities here other than that they were probably created by two different teams; Conway's law might be an issue. But I like what OpenAI has done with their desktop client, and they seem to be on top of it.
I love using Codex to just explore code instead of searching. It's a great tool for learning or researching what's happening in a codebase, leaving useful breadcrumbs that lead you to what you need to know.