OpenAI just announced a major upgrade to its flagship product. Judging from demo videos, GPT-4o ('o' for omni) will bring unprecedented levels of natural conversation to ChatGPT. While previous versions of the app's voice mode featured a noticeable lag, the new model takes an average of 320 milliseconds to respond to audio inputs, approaching the speed of real-time human conversation. It's also able to react to video input, express emotion, crack jokes and interpret tone and sentiment in a speaker's voice.
As OpenAI CEO Sam Altman notes in a blog post following the announcement, "... the new voice (and video) mode is the best computer interface I've ever used. It feels like AI from the movies; and it's still a bit surprising to me that it's real. Getting to human-level response times and expressiveness turns out to be a big change. The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different."
Hardly a coincidence, then, that the female voice in OpenAI's demos bears a striking resemblance to Scarlett Johansson...