Earlier this month, OpenAI wowed the tech world with their live demo of ChatGPT 4o (the "o" stands for "omni"). While the session began with an overlong and repetitive intro from OpenAI's CTO Mira Murati, who seemed in search of rehabilitation after her disastrous recent interview with the Wall Street Journal, once the demo began, it was thrilling. Real-time language translation, skillfully guiding a person through a math problem, and the first truly thoughtful and amiable AI were all on display. Steve Jobs would have loved the focus on the user and on usefulness.
Basically, if all OpenAI CEO Sam Altman told his team to achieve was something akin to the futuristic AI experience we saw in the 2013 Joaquin Phoenix film, Her, they accomplished it with aplomb.
The demo of a voice-responsive ChatGPT was friendly, personable, and charming, its infectious personality winning over most critics even in the few glitchy moments. A day later, Google's stiff and numbers-driven presentation looked outdated and boring by comparison. More power, blah blah. While there were moments of Google's presentation that rivaled the magical quality of the OpenAI session, Google didn't focus on highly desirable use cases in the same way, and thus it was all far less compelling.
A Scarlett By Any Other Voice
Yet, something struck me right away: the voice ChatGPT spoke in sounded an awful lot like Scarlett Johansson, the voice of the AI in the previously mentioned film. I remarked on it almost immediately as I watched the demo with my wife and son. I had put on that Spike Jonze-directed film just a couple of weeks ago while thinking about how prescient it was. Her has lost none of its power or incisiveness a decade later, as we watch the very technology the film predicted now entering our lives for real.
I wondered aloud if OpenAI had handed Ms. Johansson a massive pot of money to use her lilting, distinctive voice. As it was, Ms. Johansson had been brought in late to Her, replacing another actress who hadn't quite captured the feel the filmmakers sought. Johansson delivered big time, making it one of cinema's most compelling voice-only performances, right up there with Douglas Rain's HAL 9000 in 2001: A Space Odyssey, and maybe just a notch below James Earl Jones as Darth Vader (Geek Cred: Check).
Spending some of those big Microsoft bucks on hiring "ScarJo," as she's affectionately known to fans, made sense anyway, because Sam Altman has noted more than once that Her is his favorite film. Even if that seems like a canned answer for any AI exec to give (how would we feel if he said The Terminator?), it all made sense. I expected the demo team to note the connection somehow, after they were done dazzling us with all the promise Apple made about Siri ages ago. Remember the clumsy Martin Scorsese commercial, with him feeding Siri a few narrowly scripted questions so it looked much better than it was? ChatGPT 4o is the AI assistant that Siri never was and Alexa only pretends to be.
But ScarJo wasn’t mentioned during the demo, even if her melodious presence was in the room. Surely, OpenAI would do the right thing and hire her for the voice rather than use some soundalike actress, right? After all, Bette Midler won a case like this against the Ford Motor Company after it used a soundalike singer to perform one of her signature songs when she turned them down. That ruling from the late 1980s may be key to the modern interpretation of this kind of theft of an actor's distinctive voice and persona.
Isn’t it enough that Johansson had already been the subject of serious disrespect from Disney, when her solo Black Widow film got a wonky release during the pandemic that cost her millions? Despite the reasonable excuses of a COVID-era movie release schedule, Disney's short-lived CEO Bob Chapek paid the price for that bungle (alongside other missteps) with his job. OpenAI surely wouldn't risk doing the same thing to an actress who is one of the most consistently excellent and likable people in film.
Except that they did. Shortly after the demo, Ms. Johansson released a letter noting that Sam Altman had approached her nine months earlier to be the voice of ChatGPT and that she turned him down. She cited personal reasons for the decision, and it makes sense. Why give your voice away and be forever in people's ears as an AI? The voice they hear every morning telling them whatever, the voice they yell at, and the voice that maybe they, like Joaquin Phoenix's introverted writer in the film, fall in love with?
Those concerns reared their heads when friends contacted Johansson to say that the voice option in the ChatGPT 4o demo sounded an awful lot like her.
Just to make sure the gun was completely engulfed in smoke, OpenAI even asked Ms. Johansson two days before the big demo whether she had changed her mind. With no answer, OpenAI proceeded with the demo using the voice called "Sky," which might as well be called SkyJo since it sounds so much like her. They could have used one of the other ChatGPT voices, but they knew Sky was the one that would help them kill on their demo. And it did.
Denials Rise
Despite Ms. Johansson’s clear letter outlining how Altman pursued a deal with her, and the last-minute outreach, OpenAI is now denying everything rather than simply admitting to a mistake.
In rather short order, OpenAI agreed to ‘pause’ use of the voice, but they followed up their initial denials with ‘proof’ that the company had never sought to reproduce Ms. Johansson’s voice at all. They called on the Washington Post to look at their evidence and excuses, and the paper promptly issued a rather naive assessment based on the story OpenAI chose to advance.
First, OpenAI says it hired the ‘actual’ actress before reaching out to Ms. Johansson’s people. Sure, that's possible, but it doesn't mean they didn't want Johansson instead. OpenAI then trotted out the fact that the “Sky” actress, whose name has been kept private for her safety, claims she was never directed to sound like Johansson or the character from Her. Furthermore, she said in a statement that nobody ‘who knows her closely’ compares her voice to Johansson’s. Is it a coincidence that she used the same ‘knows her closely’ phrasing that appears in Johansson’s public letter? Seems a little convenient.
The company backed that up by saying Joanne Jang, who leads model behavior at OpenAI, doesn’t think the “Sky” voice “sounds at all like Johansson.” Really? She also noted that CTO Mira Murati was in charge of the ‘artistic choices’ about the voices and that Altman had nothing to do with it. Really?
The Trumpian feel of these denials (never admitting wrongdoing, always doubling down), coupled with the effort to distance Sam Altman from the decision-making, feels suspicious to these ears. OpenAI insisting that all the AI voice decisions were made by women at the company falls into the category of ‘methinks the (CEO) doth protest too much.’
Let me repeat: Altman literally reached out to Johansson again two days before the demo. Why would he do that if nobody at OpenAI thought any of the ChatGPT voices sounded like her? AI moves pretty fast, but it's not as if they could suddenly train a model to sound like Johansson in 48 hours had she changed her mind on the spot. Did they, maybe, do just a little bit of training to get the “Sky” voice closer to Ms. Johansson’s? I'm not sure they denied that possibility amid the flurry of excuses.
Crisis Management 101 is Not That Hard
NYU professor and podcaster Scott Galloway often mentions on his podcasts and in his books that there is a sensible approach to effective crisis management that few companies use when they encounter a serious problem. It’s pretty simple:
- Admit to the mistake
- Have the person in charge take responsibility
- Overcorrect on the specific issue
Altman could have said that he wanted Johansson for the part, but when she said no, they tried to find a voice that captured the feel without copying her directly. He could have noted that signing Johansson was his dream for all the reasons he mentioned in his appeal to her…but that they now see they should have been more cautious about using a voice people could find familiar.
The public would have accepted that it was a mistake. Then, Altman could have committed OpenAI to doing a better job of respecting the wishes of creators when they declare they are not interested in being part of AI models. The public and the industry would have respected him and appreciated the redoubled effort. Johansson would have had no real grounds for a lawsuit, since all she could have done was send a cease-and-desist letter that they had essentially already acted on. We would have moved on.
But they didn’t. Instead, OpenAI is choosing denials longer than the river…well, I don’t think I can make that pun in good conscience. When I spoke to a data scientist friend this week, he was surprised that Altman even paused the “Sky” voice, given the brazen parade of excuses they trotted out. I believe they did so because if Altman wants to keep holding talks to get Hollywood studios to use the company's text-to-video tool Sora, he's probably better off not rattling too many people in Tinseltown.
We Need Credibility and Honesty from Big AI
If this is how Big AI is going to treat the industry’s iconic creators, how will they treat requests from social media creators, self-published writers, independent artists, and musicians on Bandcamp who ask them not to train on their work? If a creator files a ‘Do Not Train’ request with the company, will OpenAI respect it? How can we trust them to respect copyright, enforce guidelines on their AI tools, or disclose when they find safety issues with their Large Language Models if they will just barrel ahead to get what they want?
Rather than trusting in their better angels, we created Credtent.org. We’re working directly with creators and AI tools to make sure:
- Creators are paid for their content in a subscription model
- AI respects requests for exclusion from training data
- Audits of guardrails are conducted regularly to ensure both safety and copyright protections are in place
Credtent will make sure AI companies pay the people whose content powers the incredible value their products generate, and that they aren't stealing what they were told they cannot have. With our Ethical Sourcing badge system, we will help customers pick the right AI tools to use and help the AI companies who play by the rules market effectively to companies and organizations that care about credible content. As an independent, third-party utility, Credtent’s goal is to help bridge the technology and creative worlds for the better on both sides.
Billionaires may not like the word ‘no’, but if Credtent can orchestrate creators saying the word collectively, we believe we can get them to accept it and make better choices about how to construct this transformative technology for the good of humanity, not just certain humans.
Learn more about how Credtent helps you opt your work out of AI or make money from licensing your content.