
I just need their API to be faster. 15-30 seconds per request using 4o-mini isn't good enough for responsive applications.


You should try Azure: it offers dedicated capacity, which with OpenAI is typically a very expensive "call our sales team" feature.


The new Realtime WebSocket API appears to send back responses in under a second. It might be just what you want.


Yes, and you can use it in text-to-text mode if you want. A key benefit for turn-based usage (where the conversation runs back and forth between user and assistant) is that you only need to send the incremental new input message for each generation. This beats "prompt caching" on the chat completions API, which is basically a pricing optimization: here it's an actual technical advantage that uses less upstream bandwidth.
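A minimal sketch of that incremental pattern, assuming the event names documented for OpenAI's Realtime API (`session.update`, `conversation.item.create`, `response.create`); the WebSocket connection and send calls are omitted, so this only shows the payloads each turn would transmit:

```python
import json

def session_config_event() -> str:
    """Sent once after connecting: restrict the session to text output."""
    return json.dumps({
        "type": "session.update",
        "session": {"modalities": ["text"]},
    })

def user_turn_events(text: str) -> list[str]:
    """Events for one incremental user turn: append only the NEW
    message to the server-side conversation, then request a response.
    The earlier history is not re-sent."""
    return [
        json.dumps({
            "type": "conversation.item.create",
            "item": {
                "type": "message",
                "role": "user",
                "content": [{"type": "input_text", "text": text}],
            },
        }),
        json.dumps({"type": "response.create"}),
    ]

# Each turn uploads just the new message. With the chat completions API,
# by contrast, every request re-sends the whole prompt; prompt caching
# discounts those tokens but the bytes still travel upstream.
events = user_turn_events("What's the weather like today?")
```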


That is odd. Longest I’ve experienced in my use of it is a few seconds.


That doesn’t match my experience at all, and I use it a lot.



