Summarizing Books as Podcasts – O’Reilly

Like almost everyone, we were impressed by the ability of NotebookLM to generate podcasts: two virtual people holding a discussion. You can give it some links, and it will generate a podcast based on them. The podcasts were interesting and engaging. But they also had some limitations.

The problem with NotebookLM is that, while you can give it a prompt, it largely does what it’s going to do. It generates a podcast with two voices, one male and one female, and gives you little control over the result. There’s an optional prompt to customize the conversation, but that single prompt doesn’t allow you to do much. Specifically, you can’t tell it which topics to discuss or in what order to discuss them. You can try, but it won’t listen. It also isn’t conversational, which is something of a surprise now that we’ve all gotten used to chatting with AIs. You can’t tell it to iterate by saying “That was good, but please generate a new version changing these details” like you can with ChatGPT or Gemini.

Can we do better? Can we integrate our knowledge of books and technology with AI’s ability to summarize? We’ve argued (and will continue to argue) that simply learning how to use AI isn’t enough; you need to learn how to do something with AI that’s better than what the AI could do on its own. You need to integrate artificial intelligence with human intelligence. To see what that would look like in practice, we built our own toolchain that gives us much more control over the results. It’s a multistage pipeline:

We use AI to generate a summary for each chapter of a book, making sure that all the important topics are covered.
We use AI to assemble the chapter summaries into a single summary. This step essentially gives us an extended outline.
We use AI to generate a two-person dialogue that becomes the podcast script.
We edit the script by hand, again making sure that the summaries cover the right topics in the right order. This is also an opportunity to correct errors and hallucinations.
We use Google’s text-to-speech multispeaker API (still in preview) to generate a summary podcast with two participants.
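The first four stages can be sketched roughly as follows. This is a minimal illustration, not the toolchain we actually built: the `llm` callable, the prompt wording, the `A:`/`B:` speaker labels, and every function name are assumptions made for the example. The final speech synthesis stage is left out because it depends on the specific API in use.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for any text-generation model: prompt in, text out.
LLM = Callable[[str], str]


def summarize_chapters(chapters: List[str], llm: LLM) -> List[str]:
    """Stage 1: one summary per chapter."""
    return [
        llm("Summarize this chapter, covering all important topics:\n\n" + ch)
        for ch in chapters
    ]


def assemble_outline(summaries: List[str], llm: LLM) -> str:
    """Stage 2: merge chapter summaries into one extended outline."""
    joined = "\n\n".join(
        f"Chapter {i}: {s}" for i, s in enumerate(summaries, start=1)
    )
    return llm("Combine these chapter summaries into a single summary:\n\n" + joined)


def draft_script(outline: str, llm: LLM) -> str:
    """Stage 3: turn the outline into a two-person dialogue."""
    return llm(
        "Write a two-person dialogue that follows this outline in order. "
        "Start each line with 'A:' or 'B:'.\n\n" + outline
    )


def parse_turns(script: str) -> List[Tuple[str, str]]:
    """Split a script into (speaker, text) turns, skipping malformed lines."""
    turns = []
    for line in script.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() in {"A", "B"} and text.strip():
            turns.append((speaker.strip(), text.strip()))
    return turns


def build_script(
    chapters: List[str],
    llm: LLM,
    edit: Callable[[str], str] = lambda script: script,
) -> List[Tuple[str, str]]:
    """Run stages 1 to 4; `edit` is where the human review step plugs in."""
    summaries = summarize_chapters(chapters, llm)
    outline = assemble_outline(summaries, llm)
    script = edit(draft_script(outline, llm))
    return parse_turns(script)
```

The `edit` hook is the important design choice here: the script is ordinary text that a person can correct before any audio is generated, and the resulting list of speaker turns is what a multispeaker speech synthesis service would consume.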

Why are we focusing on summaries? Summaries interest us for several reasons. First, let’s face it: Having two nonexistent people discuss something you wrote is fascinating, especially since they sound genuinely interested and excited. Hearing the voices of nonexistent cyberpeople discuss your work makes you feel like you’re living in a sci-fi fantasy. More practically: Generative AI is definitely good at summarization. There are few errors and almost no outright hallucinations. Finally, our users want summarization. On O’Reilly Answers, our customers frequently ask for summaries: summarize this book, summarize this chapter. They want to find the information they need. They want to find out whether they really need to read the book and, if so, which parts. A summary helps them do that while saving time. It lets them discover quickly whether the book will be helpful, and does so better than the back cover copy or a blurb on Amazon.

With that in mind, we had to think through what the most useful summary would be for our members. Should there be a single speaker or two? When a single synthesized voice summarized the book, my eyes (ears?) glazed over quickly. It was much easier to listen to a podcast-style summary where the virtual participants were excited and enthusiastic, like the ones on NotebookLM, than to a lecture. The give and take of a discussion, even if simulated, gave the podcasts energy that a single speaker didn’t have.

How long should the summary be? That’s an important question. At some point, the listener loses interest. We could feed a book’s entire text into a speech synthesis model and get an audio version (we may yet do that; it’s a product some people want). But on the whole, we expect summaries to be minutes long rather than hours. I might listen for 10 minutes, maybe 30 if it’s a topic or a speaker that I find fascinating. But I’m notably impatient when I listen to podcasts, and I don’t have a commute or other downtime for listening. Your preferences and your situation may be very different.

What exactly do listeners expect from these podcasts? Do users expect to learn, or do they only want to find out whether the book has what they’re looking for? That depends on the topic. I can’t see someone learning Go from a summary; maybe more to the point, I don’t see someone who’s fluent in Go learning how to program with AI. Summaries are useful for presenting the key ideas in the book: For example, the summaries of Cloud Native Go gave an overview of how Go could be used to address the issues faced by people writing software that runs in the cloud. But really learning this material requires looking at examples, writing code, and practicing, all of which are out of bounds in a medium that’s limited to audio. I’ve heard AIs read out source code listings in Python; it’s awful and useless. Learning is more likely with a book like Facilitating Software Architecture, which is more about concepts and ideas than code. Someone could come away from the discussion with some useful ideas and possibly put them into practice. But again, the podcast summary is only an overview. To get all the value and detail, you need the book. In a recent article, Ethan Mollick writes, “Asking for a summary is not the same as reading for yourself. Asking AI to solve a problem for you is not an effective way to learn, even if it feels like it should be. To learn something new, you’re going to have to do the reading and thinking yourself.”

Another difference between the NotebookLM podcasts and ours may be more important. The podcasts we generated from our toolchain are all about six minutes long. The podcasts generated by NotebookLM are in the 10- to 25-minute range. The longer length could allow the NotebookLM podcasts to be more detailed, but in reality that’s not what happens. Rather than discussing the book itself, NotebookLM tends to use the book as a jumping-off point for a broader discussion. The O’Reilly-generated podcasts are more directed. They follow the book’s structure because we provided a plan, an outline, for the AI to follow. The virtual podcasters still express enthusiasm, still bring in ideas from other sources, but they’re headed in a direction. The longer NotebookLM podcasts, in contrast, can seem aimless, looping back around to pick up ideas they’ve already covered. To me, at least, that seems like an important point. Granted, using the book as the jumping-off point for a broader discussion is also useful, and there’s a balance that needs to be maintained. You don’t want it to feel like you’re listening to the table of contents. But you also don’t want it to feel unfocused. And if you want a discussion of a book, you should get a discussion of the book.

None of these AI-generated podcasts are without limitations. An AI-generated summary isn’t good at detecting and reflecting on nuances in the original writing. With NotebookLM, that obviously wasn’t under our control. With our own toolchain, we could certainly edit the script to reflect whatever we wanted, but the voices themselves weren’t under our control and wouldn’t necessarily follow the text’s lead. (It’s arguable that reflecting the nuances of a 250-page book in a six-minute podcast is a losing proposition.) Bias, a kind of implied nuance, is a bigger issue. Our first experiments with NotebookLM tended to have the female voice asking the questions, with the male voice providing the answers, though that seemed to improve over time. Our toolchain gave us control, because we provided the script. We won’t claim that we were unbiased (nobody should make claims like that), but at least we controlled how our virtual people presented themselves.

Our experiments are finished; it’s time to show you what we created. We’ve taken five books, generated short podcasts summarizing each with both NotebookLM and our toolchain, and posted both sets on oreilly.com. We’ll be adding more books in 2025. Listen to them and see what works for you. And please let us know what you think!
