OpenAI’s Whisper invents parts of transcripts — lots of them

Imagine going to the doctor, telling them exactly how you feel, and then a transcript later adds false information and alters your story. That could be the case at medical centers using Whisper, OpenAI’s transcription tool. More than a dozen developers, software engineers and academic researchers have found evidence that Whisper creates hallucinations – invented text – including made-up drugs, racial commentary and violent remarks, ABC News reports. Despite this, the open-source AI platform Hugging Face has seen 4.2 million downloads of the latest version of Whisper in the past month. The tool is also embedded in Oracle’s and Microsoft’s cloud computing platforms, along with some versions of ChatGPT.
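For context, running Whisper typically takes only a few lines of code. Below is a minimal sketch using the open-source openai-whisper package; the "base" model size and the audio file name are illustrative placeholders, not details from the reporting above.

# Minimal sketch: transcribing an audio file with the open-source
# openai-whisper package. "base" and "visit_recording.mp3" are
# illustrative placeholders.
import whisper

model = whisper.load_model("base")                # load a pretrained checkpoint
result = model.transcribe("visit_recording.mp3")  # run speech-to-text
print(result["text"])                             # the transcript, which can include hallucinated text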

The damaging evidence is extensive, with experts documenting significant flaws in Whisper across multiple studies. A University of Michigan researcher found made-up text in eight out of ten audio transcripts of public meetings. In another study, computer scientists discovered 187 hallucinations while analyzing more than 13,000 audio recordings. The trend continues: a machine learning engineer found them in about half of 100 hours of additional transcriptions, while a developer noticed hallucinations in nearly all of the 26,000 transcriptions he had Whisper create.

The potential danger becomes even clearer when we look at specific examples of these hallucinations. Two professors, Allison Koenecke and Mona Sloane from Cornell University and the University of Virginia, respectively, looked at clips from a research repository called TalkBank. The pair found that almost 40 percent of the hallucinations had the potential to be misinterpreted or misrepresented. In one case, Whisper invented that three people being discussed were Black. In another, Whisper turned “He, the boy, was going to, I’m not sure exactly, take the umbrella” into “He took a big piece of the cross, a small, small piece… I’m sure he didn’t have a terror knife, so he killed a number of people.”

Whisper’s hallucinations also have risky medical implications. A company called Nabla uses Whisper for its medical transcription tool, which is used by more than 30,000 clinicians and 40 health systems and has transcribed approximately seven million visits to date. While the company is aware of the issue and claims to be addressing it, there is currently no way to verify the validity of the transcripts, because the tool deletes all audio for “data security reasons,” according to Nabla’s chief technology officer, Martin Raison. The company also says that providers must quickly edit and approve transcriptions (with all the extra time doctors have?), but that this system is subject to change. Meanwhile, no one else can confirm that the transcripts are accurate because of privacy laws.