This Crazy Writing Life: AI And Indie Pubbing—Is This The End Of The World As We Know It?
By Steven Womack
Want to read a book that’ll scare the bejeezus out of you? Grab a copy of If Anyone Builds It, Everyone Dies—Why Superhuman AI Would Kill Us All. The authors—Eliezer Yudkowsky and Nate Soares—have studied artificial intelligence for decades and have reached the conclusion that if we keep going the way we’re going, AI will soon be smarter than we are. The next step is for it to become sentient and when AI is able to perceive, feel, and outsmart us, it will ultimately get into conflict with all us mere humans.
Then guess what? We’re toast…
Is that the way this is all going to play out? Who knows? As Yogi Berra once said: “It’s tough to make predictions, especially about the future.”
One thing I do know is that the whole AI thing is taking up more and more of our bandwidth each day. Major corporations are laying off tens of thousands of workers and replacing them with AI. From driverless taxis to robotic DoorDash deliveries and fast-food cooks, AI seems to be on everybody’s mind. Try calling a large corporation, hospital, or customer service center, hoping to reach a human. It’s harder than ever.
It’s no different in the publishing world, especially in the indie-pub space.
I’ve been lucky in that I’ve been able to attend the last four Novelists Inc. annual conferences. At each one of those conferences, the issue of AI in indie-pubbing—especially AI-narrated audiobooks—has been front and center. Is AI going to put human audiobook narrators out of business? Do we need a new army of Luddites smashing the machines to protect the paychecks and lifelines of the modern-day equivalent of textile workers?
Again, I’ve given up prognosticating. I’m usually wrong anyway.
But I can make some observations, and you can draw your own conclusions from them. Let’s start with AI-narrated audiobooks.
First, a brief history. In 1976, Ray Kurzweil unveiled the Kurzweil Reading Machine, the first modern text-to-speech synthesizer. He originally envisioned the machine as a way for blind people to have access to text (Stevie Wonder bought the first one). By 1988, the Apple Macintosh had an effective TTS (text-to-speech) capability, and development has continued to this day.
By far the biggest hurdle to creating audiobooks—especially for indie authors—is the production cost. Costs vary widely, depending on a number of factors, but professional audiobook narrators with credits typically charge $100-300 or more per finished hour. Studio rental, editing, and mastering the files can add significantly to the cost.
For audiobook producers who choose to have multiple narrators, sound effects, etc., costs can double or even triple.
Not only are the costs out of reach for many indie authors, but the ROI is often simply not there. A ten-book series by an indie author can easily cost $35-50K to produce and publish. Obviously, in a competitive marketplace where discoverability is also an issue, one must sell an enormous number of audiobooks just to make back the production costs.
The evolution of digitally narrated audiobooks has rocketed into high gear in the past few years. In March 2021, Hume AI began developing AI platforms that analyzed vocal inflections and facial expressions to better gauge human emotional states and to create more human-sounding voices (and AI characters).
In 2022, a machine learning engineer and an ex-Palantir deployment strategist—both from Poland—created ElevenLabs, motivated by what they felt were American films badly dubbed into Polish. In January 2023, ElevenLabs’s beta platform went public. Since then, a number of versions have been developed and deployed.
Today, ElevenLabs is leading the charge on realistic digital voice narration for audiobooks. They have a library of hundreds of voice samples. You can even create an ElevenLabs account and upload a sample of your own voice. It goes in the library, and if someone likes your voice, they can choose you. Only you won’t actually narrate the book. ElevenLabs will synthesize your voice based on the sample and narrate the whole book, and you’ll get a small licensing fee.
In November 2023, Amazon rolled out an invitation-only KDP Beta test for digitally narrated audiobooks. Many considered the early results problematic. The only appealing thing about it was that it was actually free (but you could only sell your audiobook on Amazon).
At this year’s NINC conference, I had the chance to sit in on a panel presented by Dr. Phil Marshall, the founder and CEO of a company called Spoken, which is the latest contender in the digital narration sphere. Marshall—who’s an M.D. and a surgeon who left the field of medicine to pursue a career in AI development—founded Spoken two years ago with a mission to make the most realistic and effective AI narration available to authors at a reasonable cost.
“Listening is the new reading,” he explained. “Half of all Americans listen to spoken word media every day.”
Marshall then demonstrated the Spoken platform, which works on multiple levels. Authors can choose totally digital narrator voices, or they can use voices of real actors, whose voice samples are then synthesized and replicated by the AI platform to speak the text in the audiobook.
He emphasized the editing capabilities of the platform, which enable authors to manipulate voices at a single-line level. If an author doesn’t like the inflection or pacing of a delivered line of dialogue, for example, he or she can go so far as to record the line the way it should be delivered. The Spoken app then analyzes the author’s reading of the line and regurgitates it in the digital voice.
Marshall then outlined his company’s strategic partnership with ElevenLabs and Hume AI, in which authors using the Spoken platform can have access to literally hundreds (if not thousands by now) of voices available on those platforms.
This flexibility, combined with the pricing structure, even makes multi-voiced cast recordings accessible and affordable. Marshall noted that this represents one of the greatest opportunities for indie audiobook producers. He demonstrated a project he’s working on now—his own novel Taming the Perilous Skies—which will contain over 100 voices.
Spoken’s pricing structure offers two different options. Authors can work on a per-project basis, which offers an unlimited number of voices, custom voices, full access to the Spoken studio, project download, and audio mastering at a price of $10 per 5,000 words. For multiple projects, authors can subscribe for $50/month, with 50% off all narration costs.
So there you have it, folks. A human-narrated audiobook can easily cost $3500-$5000 to produce. A 100,000-word digitally narrated audiobook will cost a couple hundred to get out there. When you take into account that the digitally narrated audiobook will sound about 90 percent human, that’s not a bad compromise. And I don’t think we’re too far away from a place where you’ll almost have to be an audio expert to tell the difference.
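For anyone who wants to check the math, here is a rough sketch of the arithmetic behind that "couple hundred" figure, using the Spoken prices quoted above. The function names are made up for illustration, and the rounding of partial 5,000-word blocks and the one-month subscription are my assumptions, not anything the company has confirmed.

```python
import math

# Prices as quoted in this column; everything else is an assumption.
PRICE_PER_BLOCK = 10.00      # dollars per 5,000 words, per-project plan
WORDS_PER_BLOCK = 5_000
SUBSCRIPTION_FEE = 50.00     # dollars per month
SUBSCRIBER_DISCOUNT = 0.50   # 50% off narration costs


def per_project_cost(word_count):
    """Narration cost on the per-project plan.

    Assumption: a partial 5,000-word block is billed as a full block.
    """
    blocks = math.ceil(word_count / WORDS_PER_BLOCK)
    return blocks * PRICE_PER_BLOCK


def subscriber_cost(word_count, months=1):
    """Narration cost for a subscriber, including the monthly fee.

    Assumption: the book is finished within `months` billing periods.
    """
    narration = per_project_cost(word_count) * (1 - SUBSCRIBER_DISCOUNT)
    return months * SUBSCRIPTION_FEE + narration


book_words = 100_000
print(per_project_cost(book_words))  # 200.0
print(subscriber_cost(book_words))   # 150.0
```

On these assumptions, a single 100,000-word book runs about $200 on the per-project plan, and the subscription only starts to pay for itself once you are narrating more than that in a month.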
The question that remains for many people is whether or not this is morally and ethically right. If you look at technical revolutions throughout history, they have always disrupted the status quo. In the 19th century, the Luddites were textile workers rebelling against the automation in mills. Did that stop the process?
No, but it created a whole new segment of industrial jobs. Somebody had to operate those mills. Textile workers became machine operators in a factory rather than sitting at home with a traditional loom. And while Henry Ford did put a lot of blacksmiths and buggy whip makers out of business, in the end I think it’s safe to say he created more jobs than the ones he eliminated.
Besides, blacksmiths are still around, and I’d speculate that they’re making more than ever.
Another way to look at it: if I produce an AI-narrated audiobook, have I caused an audiobook narrator’s children to go hungry? No, because I can’t afford the human narrator in the first place. I drive a Kia; that doesn’t mean I took a Cadillac worker’s job. I can’t afford a Cadillac to begin with, not to mention I wouldn’t be caught dead in one.
Nearly twenty years ago, many gurus railed that the advent of the eBook industry spelled doom for print books. But are print books dead? No, they’re more popular than ever before.
So if you’re a human audiobook narrator and voice-over artist, do you need to be looking for a new career? I don’t think so. Human voices are always going to be needed, even in audiobook narration.
Two years ago at the NINC conference, I had a conversation with USA Today best-selling author Sylvia McDaniel, a hybrid author who’s penned over 100 romance novels. She’s very successful and a delightful person to be around. I’m genuinely fond of her.
She told me that her approach is to produce two audiobook versions of her novels. The human-narrated version is priced as a traditional audiobook—roughly the $10.99-on-up range—and the digitally narrated version sells for as little as $.99 with an Audible membership.
So if you’re an audiobook consumer and want the joy of hearing Tom Hanks narrate the latest best-seller, then you can shell out a little more for that privilege.
But if you’re just looking for somebody to read you the dang book while you’re driving to and from work, then that option comes a lot cheaper.
Does any of this sound familiar?
During the Great Depression, a lot of people couldn’t afford food and clothes, let alone expensive hardbound books. In 1935, a London publisher named Allen Lane came up with an idea to make books more accessible and affordable. He created a universal format that was cheap to produce and would easily fit into standardized wire racks that could be placed in any retail space, not just bookstores.
He founded a company—Penguin Books—to move this idea forward and the mass-market paperback was born. For the next seventy years—until the advent of the eBook that replaced it—the mass-market paperback was the chief medium for both fiction and nonfiction sales.
I think we may see something very similar to that in audiobooks.
***
But it’s not just audiobooks. What else is expensive to produce for an indie author?
Foreign translations…
With Amazon.com in practically every corner of the globe, marketing eBook translations can be a lucrative revenue stream. Only it costs a boatload of money to hire a translator and there’s no guarantee you’ll ever see a decent ROI.
While at the NINC conference, I met indie authors who are using a company called ScribeShadow to produce AI-translated foreign editions. I spoke with a few authors who have used this service and have been very happy, especially given the 90 percent-plus savings in creating the foreign work.
What about the quality? Idioms and inflection? The nuances of slang and regional dialect? I once had to explain to a Japanese translator that my use of the Southern idiom “slicker than snot on a doorknob” didn’t mean there was literally mucus on the door handle. Yes, I agreed, that would be very unhygienic.
One author explained to me that when you produce a foreign language eBook, if the translation sucks, readers will beat you to death in the reviews. She’s done a number of German translations—without speaking a word of German—and so far, her reviews have been positive.
This author doesn’t even use German proofreaders to check the translation. She told me she feeds an English manuscript into the ScribeShadow AI platform, and a German translation pops out the other end. Then she feeds the German translation into ChatGPT for a final check.
There you have it, folks: a foreign edition of your English masterpiece that’s entirely untouched by human hands.
As I’ve said so many times over the last year-and-a-half of writing these columns, it’s a whole new world out there.
As always, thanks for playing along.
***
Wait! Stop the presses! The day after I turned this column in, Amazon announced via Publishers Lunch that they’ve launched an AI translation service for indie authors publishing through KDP. It’s currently in Beta and will convert books from English and Spanish and from German to English (not sure exactly what that means), with more languages to be added soon.
To quote from Amazon’s announcement: “With less than 5% of titles on Amazon.com available in more than one language, Kindle Translate creates opportunities for authors to reach new audiences and earn more… Within a few days, authors can publish fully formatted translations of their books. All translations are automatically evaluated for accuracy before publication, and authors can choose whether to preview or automatically publish completed translations.”
And, like KDP’s digital audiobooks option, the service is free.
See what I mean, folks? Things are changing so you have to update columns before they’re even published. I’ll do some more digging and report back in next month’s edition. Best guidance going forward—jump in and hang on!