Google’s new GenAI podcast tool is exciting, but how useful is it for business?

The real disruption potential for this technology lies not in replacing podcasters, but in adding a new way to assimilate information.

Summary

  • A feature added to NotebookLM, an online research tool that Alphabet released last year, can now turn documents into an eerily human-sounding podcast. With the value of GenAI being questioned, Google must find ways to make this tool useful for business users.

Google is having its own ChatGPT moment. Technologists, scientists and OpenAI founder Sam Altman have been praising a feature added to NotebookLM, a free online research tool that Alphabet released last year.

Uploading documents to the site lets users ask questions about their content or have it synthesized into summaries, briefing notes and more. Now it can also turn that content into an eerily human-sounding podcast.

The male and female AI-generated hosts not only have sonorous, FM-radio voices but punctuate their conversations with ums, pauses and catchy phrases like “get this.” The banter sounds so seamless that you’d be forgiven for thinking the conversation was between people.

I’ve used the tool to generate a 15-minute podcast about a 208-page presentation, which would have taken an hour or more to read, while others have used it to generate deep dives into research papers or their own diaries. NotebookLM has inspired a burst of viral experimentation similar to the kind that first met ChatGPT.

The system runs on Google’s flagship AI model Gemini 1.5, which also powers the ‘AI overviews’ that are now replacing the top results of many Google searches, but it also has its own secret sauce to make the voices sound so human.

“There’s some new audio technology in there that is, I don’t think, fully public,” Steven Johnson, Google’s editorial director of NotebookLM, tells me. “It’s the most realistic conversation that a computer has ever generated.” He added that there had been a “huge spike” in NotebookLM’s usage since it added the podcast-generator.

Commentators have called the feature mind-blowing, while Andrej Karpathy, a co-founder of OpenAI and former head of AI at Tesla, said it was “now my favourite podcast.” Presumably, this is how Karpathy consumes much of his content now.

That indeed may be where the real disruption potential for this technology lies: not in replacing podcasters, but in adding a new way to assimilate information. Wireless earbud shipments will grow 11% this year and 16% in 2025, according to market-research firm Canalys.

My own take: The voices are extraordinary and display a level of realism above any other AI-generated audio I’ve heard before. But the user interface for NotebookLM is infuriating to navigate, and after listening to several of its AI podcasts, I also found it difficult to pay full attention to some of the conversations. 

Perhaps there’s an intangible connection that humans have through voice that naturally keeps us attentive. During my early years in radio, a veteran told me that the secret to great news reading wasn’t any vocal inflection, but to simply pay attention to what you were reading. For some reason, listeners found themselves more engaged. It’s hard to see how a computer could replicate that.

The bigger question for Google is whether it will turn its magical feature into something useful for business. The company has a history of failing to execute on its own innovations. 

Its researchers, for instance, famously invented a key algorithm called the Transformer—the T in ChatGPT—but OpenAI capitalized on the tech. 

Perhaps we should expect as much from a conglomerate cobbled together through acquisitions like DeepMind, Android, YouTube and DoubleClick, and which has been hamstrung by the innovator’s dilemma: Make AI searches too good and Google risks cannibalizing its lucrative search business.

The ‘wow factor’ in AI can also lead to hype and overspending, which means investors should be cautious about novel hits. Wall Street is already becoming wary of the gap between the awe-inspiring experiences people first had with ChatGPT and generative AI’s business utility.

Google will add other voices to its podcast generator in time, and Johnson tells me the company will eventually sell a premium version, including one aimed at businesses.

In that sense, the audio overviews may simply act as a neat marketing trick for NotebookLM, whose utility is far more obvious: a straightforward tool for using Google’s AI model on your own documents and data. 

That grounding process, known in the industry as RAG (retrieval-augmented generation), is typically more costly and complex when carried out as part of an official subscription to Google’s Gemini or other AI models.

If lifelike AI voices get more people using NotebookLM and Gemini, Google will have turned its magic into revenue. But businesses are grappling with the true return on investment of GenAI, and one of the field’s big sceptics, Daron Acemoglu, just won a Nobel prize for economics, lending credibility to looming questions about AI’s real utility. For Google, that spells an uphill battle. ©bloomberg
