Phil Marshall joins us to talk about Spoken, a platform that empowers authors to create immersive single, dual, and multi-voice audiobooks with digital narration.
Phil Marshall is a technologist, entrepreneur, and storyteller who thrives at the intersection of imagination and execution. He is the founder and CEO of Spoken, a platform that empowers authors to create immersive single, dual, and multi-voice audiobooks.
//Draft2Digital is where you start your Indie Author Career//
Looking for your path to self-publishing success? Draft2Digital is the leading ebook publisher and distributor worldwide. We’ll convert your manuscript, distribute it online, and support you the whole way—and we won’t charge you a dime.
We take a small percentage of the royalties for each sale you make through us, so we only make money when you make money. That's the best kind of business plan.
• Get started now: https://draft2digital.com/
• Learn the ins, the outs, and the all-arounds of indie publishing from the industry experts on the D2D Blog: https://Draft2Digital.com/blog
• Promote your books with our Universal Book Links from Books2Read: https://books2read.com
Make sure you bookmark https://D2DLive.com for links to live events, and to catch back episodes of the Self Publishing Insiders Podcast.
Kevin Tumlinson [00:00:02]:
You just tuned into the hippest way to start and grow your indie author career. Learn the ins, the outs and all the all arounds of self publishing with the team from D2D and their industry influencing guests. You're listening to Self publishing insiders with Draft2Digital.
Jim Azevedo [00:00:28]:
Welcome everybody to another rousing edition of Self Publishing Insiders. I am Jim Azevedo. I lead marketing here at Draft2Digital, and today we're going to have a lot of fun because we are welcoming the CEO and founder of Spoken, Mr. Phil Marshall. Phil, it's good to see you again, buddy.
Phil Marshall [00:00:45]:
Jim, it's great to see you as well. It was great to see you at Novelists, Inc. in Elvis costume. Today we coordinated our apparel, which is nice, and so good to see you.
Jim Azevedo [00:00:57]:
Did I forget to tell you that me being Elvis was not public information yet? Thanks a lot, man.
Phil Marshall [00:01:02]:
Oh, there are photos. So I'm not the first to expose this.
Jim Azevedo [00:01:06]:
All right, take it easy, take it easy. We'll get to that later. Hey, I don't always read bios for our guests as our viewers know, but I'm going to read Phil's because listen to this everybody. This man has quite a varied background and it is remarkable. So let me just go through Phil's background here. So Phil Marshall is a technologist, entrepreneur and storyteller who thrives at the intersection of imagination and execution. He is the founder and CEO of Spoken, a platform transforming how authors and readers connect through AI powered audio storytelling. Spoken empowers authors to create immersive, single, dual and multi voice audiobooks.
Jim Azevedo [00:01:48]:
Now get this: before Spoken, Phil co-founded Conversa Health, a pioneering conversational AI platform that redefined the relationship between patients and care teams. Conversa became one of the most trusted names in digital health and was acquired in 2021. Congratulations for that. His earlier work includes developing personalization technologies at WebMD, collaborative video editing tools at JumperCut, and content based health record threading at WellMed. Phil is also a hard sci-fi author. His debut novel, Taming the Perilous Skies, was released just last month. Again, congratulations. It explores the consequences of his original theory of persistence, which is a speculative physics framework linking quantum mechanics, gravity and time.
Jim Azevedo [00:02:38]:
A native of Wanamaker, Indiana, did I pronounce that right? Wanamaker. That sounds perfect for you. Like you want to make things. And of course that's where you're from. Wanamaker, Indiana. Philip earned his MD from Indiana University School of Medicine and a master's in public health from Oregon Health and Science University before pivoting from surgery to technology. He now splits his time between Sauvie Island. I probably butchered that name.
Jim Azevedo [00:03:10]:
Feel free to correct me. And Oceanside, Oregon. Whether building AI systems, exploring physics in his fiction, or amplifying the voices of independent authors, Phil is driven by the belief that the most fantastic world is the one we dare to imagine and create together. Where do you find the time?
Phil Marshall [00:03:31]:
Well, I mean I just had to find like 20 minutes to write that out. I mean that's really, you know, I didn't have to find much time for that.
Jim Azevedo [00:03:39]:
Well, all right, so we're going to jump right into this, because your background is so varied, but I want to spend a lot of time learning more about Spoken and introducing it to our viewers here. So for listeners who may not be familiar, can you explain what Spoken does and how it's different from audiobooks or text-to-speech type tools or other tools that they may already have come across?
Phil Marshall [00:04:04]:
Yeah, yeah, absolutely. So you're already on the right track. You're in the right department, you've walked into the right room. We are the AI audiobook company. When I sold my company in 2021 and went to go finish my novel, the bug bit again. I met some very influential people that really opened my eyes to what was possible, because I am an audio-only reader myself, and I love good, compelling, immersive works, and multi-voice works in particular if they're done really well and seamlessly. In trying to create that kind of work for my own writing, a small story to start with, I found that the cost, the workflow, all of it was just very, very difficult. And so I predicted that AI narration was not only going to help writers more affordably bring high-quality works to audio, which is the fastest-growing area of story consumption, but that the part that's going to grow most rapidly is the part that's been completely inaccessible before. And that is things like duet narration and multi-voice narration.
Phil Marshall [00:05:25]:
And so that's panning out, and audiences, readers, are really resonating well with that and this newfound accessibility of those kinds of narrations. So that's what Spoken is. We help authors to be able to create great high-quality audiobooks so that they can delight their readers.
Jim Azevedo [00:05:56]:
Okay, did you know from the beginning, and I'll give our viewers out there a quick demo from one of the videos that your company has produced in a moment here, so they can just get an idea. But I wanted to ask you, did you know from the very onset of the company, Phil, that multi-voice, that capability, would set you apart, would be a key differentiator for your company? Because that became clear in speaking to you and hearing you speak when we were at Novelists, Inc., that that has really set you apart.
Phil Marshall [00:06:28]:
That was the origin of the company. I had a little story called Kill Your Darlings that I wrote at Taos Toolbox, one of the preeminent speculative fiction workshops. And I wrote it in 48 hours for the second week of critique, right. 18 of us sitting around a table critiquing each other's works. And the little story was about six people who gather at a chalet and get caught by the weather. And they get led through this process by this scam artist. He's feigning being this great British writer, but he's actually faking his British accent. And there's an elderly Norwegian woman, there's a Russian man and his daughter, there's a gay couple from America.
Phil Marshall [00:07:04]:
And so there's this blend of six people with accents and all, and they start dying one by one. And so I knew I needed to bring that to life through the voices. And so in creating Spoken, that was the genesis. And I think whenever you create something, if you're a product person like I am, you have to have an inherent sense of what the value is. And I lived and breathed it. I lived and breathed the challenge and how inaccessible that kind of delivery was. And then as things went on, people started to talk more and more about variations in how stories are delivered.
Phil Marshall [00:07:45]:
And certainly single narrator is still the tried and true, the gold standard. And single narrator is reasonably accessible, to hire a voice actor, which is great and we encourage that. We think that's awesome. And yet sometimes audiobooks are just brought to life better, especially if they're very character driven, by having that more dynamic exchange. And historically, a multi-voice work, if you're talking about a standard novel, could have been $15,000 or $20,000 to produce. And now add on top of that the fact that the romance community in audiobooks, the listeners, are now demanding duet narration. Well, so you went from somewhere in the range of $3,000 to $5,000 to produce a single narrator to now twice that to produce duet narration. And so accessibility of romance works with duet narration is now far, far greater for writers by using Spoken.
Jim Azevedo [00:08:46]:
Okay, while I'm thinking about it: is there a limit to the number of voices that you can have in a single work? I know that you experimented. I heard at the conference that you had a hundred voices in one of your books.
Phil Marshall [00:08:59]:
My only book.
Jim Azevedo [00:09:01]:
Okay.
Phil Marshall [00:09:02]:
I have over 100 speaking characters in that. All of them have been generated automatically by Spoken around how I wrote the character. And so what we do is we will analyze every character. What are their vocal characteristics? And then the user can just push a button and we'll automatically generate that voice from either ElevenLabs or Hume. And Hume
Phil Marshall [00:09:35]:
AI and ElevenLabs are both great partners of ours. I have over 100 of them in my book. Even if it's just the person that comes from the back of the crowd that says, excuse me. We generate a voice for the guy who comes from the back of the crowd that says, excuse me.
Jim Azevedo [00:09:49]:
All right, let's. Let's give them a quick taste of what we're talking about here. I'm going to show everybody a quick video here. Hang on while I'm going to bring it right up. Down with AI voices, guys.
Phil Marshall [00:10:08]:
You all have AI voices.
Jim Azevedo [00:10:13]:
Huh?
Phil Marshall [00:10:21]:
Wait, I can speak Spanish?
Jim Azevedo [00:10:24]:
Pretty great sounding voice too.
Phil Marshall [00:10:26]:
Do I know kung fu?
Jim Azevedo [00:10:28]:
Sorry, no.
Phil Marshall [00:10:30]:
Oh, by the way, I'm AI too.
Jim Azevedo [00:10:33]:
Down with human voices. I'm an actual person. It's all good.
Phil Marshall [00:10:43]:
Come join the party. AI voices, voice actor voices, custom voices, even your own personal voice clone. Every voice you can imagine, ready to bring your stories to life. Write it, hear it. Any story, any size. Professional audiobooks in a click. Spoken.
Jim Azevedo [00:11:09]:
All right, that's. That's pretty amazing. So it's, it's not just AI Voices. You can bring your own voice. Then you'll take that author's voice and use it in the book as well. Yep.
Phil Marshall [00:11:22]:
We just have a short little script you can read. It's about a one-minute read. And we'll go through a range of emotion in that script. And so you just store it as your personal voice, and that's available exclusively to you from that point forward. Yeah. Or if you have your voice on ElevenLabs, you can bring it on over and use it with us as well.
Jim Azevedo [00:11:43]:
Okay, I'm going to click over and get to a comment from our audience here real quick. Then I'm going to go through all my questions, because I'm so selfish when it comes to questions. Quick question here from William. Thanks for the question, William. William asks, is there reader demand for AI audiobooks?
Phil Marshall [00:12:01]:
Well, it's, it's really more about market demand for audio. That's really what it's about. It's not, it's not about AI audiobooks. It's about story and stories from authors, they love that. And audio is their preferred method of consuming that story. So just to give you some context, we say listening is the new reading. Because audio, just traditional audiobooks alone have been growing by more than 25% year over year. Now think about that.
Phil Marshall [00:12:28]:
That's a lot of growth. Ebooks are flat. Print is flat. Audio is racing forward. It's surpassed ebooks in popularity. It's not up to print yet, but print is tried and true. It's going to be a while before anything catches up to print.
Phil Marshall [00:12:42]:
My point being that, especially for younger folks or for people like myself, audio is the only way that they will consume that story. And authors oftentimes have 50 works in their backlist, or 100 or 150; we have one of our users who has 400 books in their backlist. Being able to bring those to life in audio simply meets readers where they are. And so it's not about being an AI audiobook. I wouldn't be in this book business if I felt like, oh, good enough, right? AI audio, it'll get by. And for some genres I think it will get you by, because in some genres it's easy to get the reader into that, and even when it's not all that perfect, it still does the job, right? And single-narrator AI audiobooks, if you look at Joanna Penn and the works that she puts out, she does a great job and talks about it all the time. And yet, as good as those are, for me and my work that's actually not good enough. I have to have perfect delivery.
Phil Marshall [00:13:54]:
So for me and in the world of Spoken, it is not about demand for AI audiobooks. It's about demand for audiobooks and being able to deliver on that effectively and very cost-efficiently, in order to meet readers where they are with great quality, indistinguishable from human narrators. The reason we focus on multi-voice and duet narration is because it's been so historically inaccessible. And when my book, Taming the Perilous Skies, comes out on audio in a variety of different places, like Spotify, for example, for the holiday season, I'm banking on this. And so you can mark my words: it's not only going to be good, it's going to be Dungeon Crawler Carl good. It's going to be Project Hail Mary good. That's how good it is. And I'm not talking about the story, judge the story on your own, but just the delivery of that story. It is incredible.
Phil Marshall [00:14:53]:
And, and so it's, again, it's not about. Is there demand for AI audiobooks? There's demand for story through audio.
Jim Azevedo [00:15:01]:
Okay, good, good question. I'm going to bring up another quick comment here from Tom. Tom says, "Through D2D, all of my books are with Apple audiobooks." That's great, Tom. And so we're talking about the accessibility and the demand for audiobooks. I know that some retailers put a price cap on digitally narrated audiobooks. So that's another point, because not all readers have the good fortune to purchase audiobooks that are $20, $25 and up. And I believe Apple sometimes caps prices at $6.99.
Jim Azevedo [00:15:34]:
So it just provides another level of accessibility to the reader community as well. Yeah, go ahead. No, go ahead. I was gonna ask you another question, but please.
Phil Marshall [00:15:46]:
Yeah, it's, it, you know, the, the, the landscape is evolving rapidly. If you look at Spotify's announcement August 1, they allow all AI narrated, digitally narrated audiobooks. And you just, the author has to designate that. This is, you know, digital narration. And that's, that's all you need to do. And so that really kind of opened up in a mainstream way the availability of channels for delivering this to readers. Of course, there's a number of them that are wide open for you. You know, if you, if you as an author are using any of the different mechanisms.
Phil Marshall [00:16:23]:
Most of which Draft2Digital, you know, will feed into. I would say now maybe about half of those are fine with digital narration and another half aren't yet, but they will be. The 800-pound gorilla in the room, of course, is Audible, and we talk to them regularly. They're tired of seeing us at the conferences, you know, bending their ear. And someday even them, probably through their KDP angle, because they're already using Virtual Voice. But they require that it be Amazon's tech. I won't talk about that tech. But the point is that the market's really opening up, and these different channels are very much now available for digital narration.
Jim Azevedo [00:17:05]:
Cool. Cool. Why don't I take a step back, Phil, and just ask you, given your varied background, what inspired you to create Spoken in the first place? Was there at one point a personal experience that really sparked the idea for you?
Phil Marshall [00:17:21]:
Yeah, it was all that back at the Taos Toolbox that generated that story with those six characters, really the desire to bring that to life. And then I would go into the existing tools and I would just be, oh my God, like, this is impossible. Right? You have to use something like Final Cut Pro or Audacity or something to, to load in every single passage, get the timing just right, get the inflections just right. It was all distinct, you know, it was all. There was no, there was no narrative cohesion, there was no narrative context. And. And so it was. It.
Phil Marshall [00:17:51]:
The writing was on the wall to me, both on where the technology is going and how good it will be, and on the opportunity for multi-voice and how that will expand so rapidly now that it's accessible, and just generally how good the quality will be. That will just change the landscape on how writers can bring their works to the market. And so that was it, though, the Taos Toolbox story that I shared.
Jim Azevedo [00:18:18]:
Okay, can you, can you kind of walk us through the technology? Like, how does the platform turn written content into audio that feels natural and engaging? And I know that the technology has got to be very robust to make it natural and engaging.
Phil Marshall [00:18:33]:
Yeah, yeah, yeah. Well, there's a lot that we do to make that possible. And then there's a lot that our partners, ElevenLabs and Hume, have done over the course of years, although not that many years. You think of ElevenLabs as this sort of almost household name now, you know, $3 billion valuation and hundreds of millions of dollars in investment by Andreessen Horowitz and blah, blah, blah. They've actually only been around since 2022, which is, what? What are you talking about? And so they've done an amazing job. Hume, also an amazing job.
Phil Marshall [00:19:04]:
I find Hume's voices to be more natural sounding. I've chosen, by and large, I've chosen Hume's voices for my own work because of that naturalness of delivery. But it is harder to get right. What do we do to make that better, to make that easier? We take the story and if you're doing duet or dual or multi voice, we will parse that work and identify what is the speaking character for every different passage. Now if you think about normal character driven stories, like a, you know, my science fiction work or whatever, and you have multiple Multiple characters speaking. You, you have, you know, the speaking, the dialog tag, maybe some exposition, then a speaking again and then different person chimes in. We do all that automatically. We identify what the speaking characters are.
Phil Marshall [00:19:50]:
We attribute that character to that passage. We know exactly what they said. We do all the spacing automatically. And then we also layer in emotional cues that will help to make sure that out of the box it comes out right. And especially when it comes to whispering, shouting, crying, laughing while the person's doing it, or even singing, we pass that in as an emotional cue that will drive that. That's not to say that it just comes out of the box just like this. If you're talking ElevenLabs and you're talking about single narrator, yeah, pretty much it'll come out of the box without much need for change. You should always proof it, but there's not a lot of need for change there.
Phil Marshall [00:20:30]:
Duet or dual narration, which just alternates chapter by chapter on who's narrating, also same thing, really. Out of the box, ElevenLabs does a great job. Hume also does a great job; it just takes more hand-holding for every passage, especially on the multi-voice. So this is what I like to tell folks. Depending on how much time you want to put into it, it will deliver a great end result. Like a very great end result.
Phil Marshall [00:20:56]:
And on top of that, this is the worst it will ever be. So what Spoken does is provide that author-focused workflow, for authors; that's the only market we serve. And so we've put that framework around creating that great audio. And then when you're done with your work, you've proofed it, you hit the button, and we master it to ACX levels on bitrate control, floor and ceiling sound control, and all the requirements of, like, Spotify for Authors, and you guys have audio standards, and Voices by INaudio, et cetera. And so we do all that work for you so that you don't have to worry about all those things. So again, it's the worst it's ever going to be. It's only going to get better and better, and we love it. And our authors love it.
Jim Azevedo [00:21:49]:
It's pretty fascinating. A quick question from Tom and then I have another follow up question. This is a good question. Does the narrator actually speak the words at some point or is it all computer generated or can it be mixed and matched?
Phil Marshall [00:22:04]:
Interesting. Well, in our world, narration is text to speech. It's AI narration. The voices that you use may be custom, they may be your personal voice, they may be a voice actor from our library. Our library is nothing but, I think now, 130 or 150 voice actor voices, including some of the top voices out there. These are actual people that actually get paid for their narration. Right? You pay Spoken for the narration for the work, and then we pay them. And so all of our narration is digital narration.
Phil Marshall [00:22:44]:
It's just a matter of what voice you want to do it. And so if it's in your own voice, it can be. When I say narrator, if it's multi-voice, the narrator is the independent voice that you have voicing that. In my book, I have alternating chapters: half of them are my main character in first-person POV. The other half are in rotating third-person POV. And so those are all done by a separate independent narrator that I designed around the characteristics that I want. And the other ones are spoken by Jack, my main character, in first-person POV.
Phil Marshall [00:23:18]:
So that's, those are the kinds of things that you can now do when you have this available to you.
Jim Azevedo [00:23:24]:
Very cool. Now, you mentioned editing before. You said that the first draft you get back is going to be almost there, like almost perfectly there. When it comes to narrating, if I'm the author and I get that file back from you, do I have the opportunity and the capability to make changes? How does that work?
Phil Marshall [00:23:41]:
Oh, yeah, yeah, of course. And that's, that's one of the really transformative things here. Think about audiobooks historically and you identify something that isn't quite right.
Jim Azevedo [00:23:53]:
Right.
Phil Marshall [00:23:54]:
What do you do?
Jim Azevedo [00:23:55]:
Panic first.
Phil Marshall [00:23:56]:
Get panic first.
Jim Azevedo [00:23:58]:
Right.
Phil Marshall [00:23:59]:
Gotta get you back into the studio. I can't, I can't have that like spoken. I can't have that pronounced that way that, you know, I'm sorry, I didn't catch that in the pre. You know, but. But yeah. Whoa, man. It's right there in paragraph three.
Jim Azevedo [00:24:12]:
And anybody who's ever done podcasts or any kind of recording, you know, if you record something and then you come back and you try to make that sound match up perfectly, it's impossible.
Phil Marshall [00:24:21]:
You will not.
Jim Azevedo [00:24:23]:
Yeah, you won't.
Phil Marshall [00:24:24]:
And so, think about this. Now it's just as if you were to go into your ebook, right, your EPUB file, and change that couple of words because you found something. For instance, just yesterday: my book is on the market, right? It has been since September 12th. In chapter 10, it's Jack's POV, it's first-person POV, and I accidentally said "Jack, blah, blah, blah, blah," as if it was third, right, instead of "I." And I'm like, whoa, right.
Phil Marshall [00:24:55]:
So what do I do? I change the word from Jack to I. And I regenerated that passage. Done. So the workflow, the workflow of audio becomes just as iterative, organic and dynamic as if you just go in and, and update your epub file, you know, and that changes everything.
Jim Azevedo [00:25:15]:
That changes.
Phil Marshall [00:25:16]:
Yeah, that changes everything. And so as far as editing, yeah, you just see them all arrayed right out in front of you, every single passage, especially if it's multi-voice. Right. And so you have your spacing, you have your emotional cue for every single passage. You can do volume change, speed change, emotional cue change.
Phil Marshall [00:25:36]:
Yeah, and just regenerate. Sometimes the AI, especially if it's Hume (Hume, are you listening to me right now?), has trouble over and over getting particular words right. And so we give you the tricks on what to do there, and you will end up with the right one. Just sometimes it takes a few tries. But yeah, editing: what we call Studio, Spoken Studio, that is the editing studio.
Phil Marshall [00:26:01]:
That's where you do that.
Jim Azevedo [00:26:02]:
That's so interesting. Yeah. I remember in the early days of self publishing with ebooks, you know, authors were reporting that their readers were, were reporting to them that hey, I, I found some typos in this section or something's just out of whack here. Oh, how cool it is that I can go into my, My Word doc, make those changes, upload the revised edition and then bang, it would be done. That ebook would be proliferated out there. And you can do that with audiobooks. Now that's pretty mind blowing.
Phil Marshall [00:26:29]:
So that's, you know, when people talk about the change, it's not just about making audio accessible for your backlist or what have you. And it's not just about enabling a more dynamic and compelling listening experience with things like duet and multi-voice narration. The entire workflow of audio changes. And if you want, we can talk about the AI sentiment and anti-AI sentiment that's out there, because that's a very real issue. And we address that day in, day out. But it just changes everything, you know?
Jim Azevedo [00:27:07]:
Yeah, that's super cool. Another question here from one of our viewers, from Antisocial Butterfly. Thank you for this question. She asks, do we have the ability with Spoken to dual narrate with our voice and AI combined? So if I'm reading this correctly, with a human narrator and also an AI-generated voice, is that possible? Yeah.
Phil Marshall [00:27:29]:
So again, remember, at Spoken, it's all digital narration. But remember what I said about personal voice.
Jim Azevedo [00:27:37]:
Yeah.
Phil Marshall [00:27:38]:
You upload your voice or you read into our microphone just once, your little script right there. And your personal voice is now available to use on dual narration. And we're coming out soon with. Well, we offer duet narration now, but you really got to know how to do it. We're going to automate duet narration coming out very, very soon in the next couple weeks. And, and so if you want your banter back and forth, let's say you're the male narrator, you upload your voice, personal voice, you use that for the one character and choose, you know, maybe one of our female voice actor voices out of the library for the other one. But you want multi voice. In other words, every character has their own voice.
Phil Marshall [00:28:16]:
You can do that too. And so, yeah, all of that, very possible.
Jim Azevedo [00:28:21]:
Interesting. Okay, another quick question here. Like, how do you see Spoken fitting into the broader publishing ecosystem? What I mean by that is, let's say I'm interested in making my audiobook, but I want to make a trailer or I want to do other things. Like, what else is... It's not just books, right? Like, I know you guys are working on some other stuff.
Phil Marshall [00:28:42]:
Yeah, yeah, absolutely. So it's what we call motion. And so, yeah, you're absolutely right. That is a very big deal. And so we do that. In fact, I don't know, Jim, if I shared that with you or not, but my trailer for Taming the Perilous Skies is just downright, you know, stunning. Yeah.
Phil Marshall [00:29:14]:
Yeah. And so it's pretty amazing. So, yeah, we're going into motion assets as well. You saw the ad, right? My team produced that. We didn't use an outside firm to do that. My team has those skills. I mean, Joshua Pull Pavato on our team leads that effort. He's just fantastic.
Phil Marshall [00:29:32]:
And, and so trailers are coming. We're doing, going to be doing some other big things on the motion side as well, but none of that is, is going to get in front of continuous perfection of the, the audiobook and audio workflow. So we're, we're going to be maniacally focused on that. Because I am maniacally focused on that.
Jim Azevedo [00:29:56]:
Hey, I, I've got to ask you. So this week, if anybody read the news or is watching TV or whatever, there has been a lot of, like, anger coming out of Hollywood regarding the AI generated actor Tilly Norwood. So a creator is seeking representation. So I'm sure you read the news stories too. So I want to ask you, you know what, what is your take on that? Because, you know, we believe, and you've already said earlier in this conversation that you believe human narration is the gold standard. So if there are any, if there's anybody watching or listening who is saying, well, these guys are going to take work away from human actors. Do you have a response to that?
Phil Marshall [00:30:34]:
Oh, sure, sure. Yeah, absolutely. So I do think on single narrator, you know, if you're talking about $3,000, $5,000 for, for your audiobook, that's, that's somewhat accessible. It's not that accessible, but I mean, you know, it's somewhat accessible. But if you've got a bunch in your backlist and, and you've got a number of works that you need to get out there in audio, you. You're really up against the wall, right? And so this is really all about writers bringing their stories, authors bringing their stories to readers. And, and so, yeah, of course we deal, we deal with this every day. And when people, you know, I'll, I'll share a story with you.
Phil Marshall [00:31:11]:
So when I launched my book and was handing out advance reader copies, I was at Worldcon in Seattle. This was just back in mid-August. And this is the biggest science fiction and fantasy con in the world. And it happened to be in Seattle. I'm in Portland, so easy. And so we had a booth with Spoken, and I shared it with Taming the Perilous Skies, my book. And so as I was standing there and the Spoken banner was, you know, several feet away, there were protesters that came up and, you know, everybody had their phone filming, and these two people were pasting "No AI" stickers all over the banner.
Phil Marshall [00:31:54]:
And I asked them, you know, what are you doing? They didn't know who I was. And they said, well, Worldcon said that, you know, they wouldn't ever contract with unethical AI. I said, oh, well, what makes you think that that's unethical AI? And they said, well, they steal, you know, voices from people, and then, you know, they don't get paid. I'm like, well, actually, Spoken's given, you know, thousands of authors the ability to bring their works to life that they wouldn't have been able to otherwise, because it's affordable. And not only that, but every voice in that library is a voice actor that gets paid with every use. And they're like, oh, well, they should put that in their marketing material. And I said, should I? Should I put that in my marketing material? Really? Yeah.
Phil Marshall [00:32:39]:
You just put stickers all over a banner I paid for out of my checking account. Right. So, I mean, I think when people get into how the tech uses patterns of sound to written word, you become a lot more comfortable that voices were not stolen to do that. And when you realize that voice actors monetize this with passive income also. And I have a prop here from that. I made a lion's mane out of the "No AI" stickers that they put all over it.
Phil Marshall [00:33:10]:
I wore this to the Hugo Awards. So that's my lion's mane. Hear me roar. But yeah, so.
Jim Azevedo [00:33:17]:
So was that the idea behind the Voice Unleashed commercial that we saw?
Phil Marshall [00:33:22]:
Oh, you know, here's the thing, right? When people realize that we're there, we're authors for authors, this is all we do. Right. And it's all about bringing your story to life and your story to a market in a very compelling way. And you realize there's real people behind it. Like, we're real. We're real people. We're a small team, but we're real people. And I think when you realize we're real people doing it for the right reason, and when you also think about other comparable efforts, like, for example, do you use AI to look for typos? If you use Grammarly, you do.
Phil Marshall [00:33:57]:
And if you use any of the tools that are available, Gemini or otherwise, you do. And so that's taking work away from editors, you know. And yet, of course, you don't rely on something going out to an editor and a week later finding that you had, you know, a comma instead of a period at the end of that. Right? I mean, you're not gonna pay that amount of money. You're not gonna wait for that kind of workflow and that kind of time.
Phil Marshall [00:34:24]:
Listen, it's all part and parcel with what we're talking about here: bringing your story to audiences in a compelling way, in a very high-quality way that is affordable and accessible. And so it's what we do. So.
Jim Azevedo [00:34:41]:
Yeah. Okay. Speaking of affordability, a question from Antisocial Butterfly: how much does it cost to use Spoken?
Phil Marshall [00:34:50]:
So how does it work? Yeah, if you're a subscriber, it's $5 for every 5,000 words. And so if you have a hundred-thousand-word work, that's a hundred dollars.
Jim Azevedo [00:35:01]:
What do you mean by "as a subscriber"?
Phil Marshall [00:35:04]:
So, $50 per month as a subscriber, but if you don't subscribe, that's fine too. It's just, instead, it's $10 for every 5,000 words. And so if you're a subscriber, that hundred-thousand-word work is a hundred dollars; as a non-subscriber, 200. So if you're producing enough volume to justify the $50 a month, you get 50% off narration. So that's how it works. We currently have the paywall before you actually can hear the sound, you know, which was just a bad decision on my part. You know, I make mistakes. And so here in the next couple weeks, we're moving the paywall to where you can actually proof your entire work.
Phil Marshall [00:35:40]:
So, you know, the work that you've proofed and gotten ready is going to be a hundred dollars for that hundred thousand words, or a hundred dollars if you're a non-subscriber for that 50,000-word work. And so there's not going to be any question. We're here to make it as easy as possible to create great quality works.
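The pricing Phil describes can be worked through as a short calculation. This is only a sketch of the arithmetic as stated in the conversation ($5 per 5,000 words for subscribers, $10 per 5,000 words otherwise, $50 per month to subscribe). The function names are hypothetical, and rounding a partial 5,000-word block up to a full block is an assumption, since the conversation does not say how partial blocks are billed.

```python
import math

WORDS_PER_BLOCK = 5_000       # pricing unit quoted in the conversation
SUBSCRIBER_RATE = 5           # dollars per block for subscribers
NON_SUBSCRIBER_RATE = 10      # dollars per block for non-subscribers
SUBSCRIPTION_FEE = 50         # dollars per month


def narration_cost(word_count: int, subscriber: bool) -> int:
    """Return the narration cost in dollars for a work of `word_count` words."""
    # Assumption: a partial block is billed as a full block.
    blocks = math.ceil(word_count / WORDS_PER_BLOCK)
    rate = SUBSCRIBER_RATE if subscriber else NON_SUBSCRIBER_RATE
    return blocks * rate


def break_even_words_per_month() -> int:
    """Monthly word volume at which the per-block savings equal the subscription fee."""
    savings_per_block = NON_SUBSCRIBER_RATE - SUBSCRIBER_RATE
    return SUBSCRIPTION_FEE // savings_per_block * WORDS_PER_BLOCK


if __name__ == "__main__":
    print(narration_cost(100_000, subscriber=True))
    print(narration_cost(100_000, subscriber=False))
    print(narration_cost(50_000, subscriber=False))
    print(break_even_words_per_month())
```

Under these assumptions, the subscription fee is covered by the narration discount once an author narrates about 50,000 words in a month, which matches the "enough volume to justify" framing above.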
Jim Azevedo [00:35:56]:
Awesome. Phil, what has been your biggest challenge in scaling the business to date?
Phil Marshall [00:36:03]:
I always seem to hit trends pretty early, which on the one hand is great. You know, I'm not late to the game. And so we've been able to build some brand recognition, some brand trust with a lot of great authors, thousands of them. And we have a number of ambassadors now who are going out and singing our praises. But I would say that has been one of the biggest challenges: being on the cusp of this. You know, I bootstrapped the company, right, the team and all that. And so we've had to keep things very, very tight. But now that we're growing, now that we're generating revenue, that's becoming easier and easier.
Phil Marshall [00:36:44]:
And yet being on the early side of the equation means that it's still harder than it will be. It's still harder than it will be. And that's changing so rapidly. So we are focused, just absolutely maniacally focused on making it as easy and high quality as you can so you can push a button and out comes just a great, compelling story that your readers are going to love. And so being early is probably, probably one of the biggest challenges.
Jim Azevedo [00:37:13]:
Absolutely. I remember the early days of Smashwords and what happened next.
Phil Marshall [00:37:17]:
Yeah. Right. When it hits, it hits.
Jim Azevedo [00:37:20]:
Right. When it hits, it hits. And then you're on a rocket ship, and when you want to make changes, it gets a little harder to make those changes.
Phil Marshall [00:37:28]:
Yeah.
Jim Azevedo [00:37:28]:
Where do you see Spoken headed in the next three to five years? I know that's not always an easy question to answer, but.
Phil Marshall [00:37:36]:
Yeah, yeah. Well, you know, it's funny, I get that question a lot, and I always remind people that I wouldn't have known a year ago that what I now have in the product would have been there. You know, I did not predict that that was there. Like the other day, you know, I was proofing my own work. And in chapter three, I have an older Italian man who is the person who runs the largest anti-gravity sculpture in the year 2076 in Milan in Parkes. And towards the end of the chapter, he sings. And so the emotional cue went in automatically.
Phil Marshall [00:38:17]:
It says "singing." And what came out was a perfect sung passage. And I took that back to Hume and they did not know it would do that.
Jim Azevedo [00:38:29]:
Oh, is that right? That's freaky.
Phil Marshall [00:38:31]:
And so when it comes to three or five years, yeah, I mean, you know, check with me in five months. I bet half of my predictions are going to be gone.
Jim Azevedo [00:38:43]:
Let me ask it another way. Let me ask it another way. Are most of these changes being generated by you and the team, or do you feel that now that you're reaching sort of a critical mass, are you having a lot of feedback come from the actual users of the technology?
Phil Marshall [00:38:57]:
Well, there's no more power user than me, my team will regret to inform.
Jim Azevedo [00:39:02]:
Because sometimes we get ideas that we're not expecting from our authors and publishers or something like, oh, my God, that's a killer application for whatever that particular tool has to be.
Phil Marshall [00:39:11]:
Yeah. So I'm the hardest customer. And so, but, but yes, our authors all the time are coming back with needs and, and 95% of the time I'm like, yeah, I know it's on the roadmap because I found the exact same thing. And, and so, yeah, author feedback all the time now, which is great. We love that, we love our author users and, and they're like this, this part here, it's just not easy enough. It's just not good enough. And I'm like, I know, I agree. And it will be, I promise.
Phil Marshall [00:39:43]:
We're gonna be the ones to bring it to you.
Jim Azevedo [00:39:45]:
Right. Do you think we're moving toward an audio-first publishing industry? Because, you know, we've got podcasts and we've got audiobooks, and I remember, well, I guess it was several years ago now, someone said, I can't remember who it was, that podcasts are the gateway drug to audiobooks. And I was like, yeah, they totally are. And now we've seen, what, it's been like 10 years now, probably 12 years now, with audiobooks growing by double digits year over year over year. And granted, you're in the thick of audiobooks, but still, I want to get your take. Do you think we're moving to become an audio-first publishing culture?
Phil Marshall [00:40:25]:
Well, I'm in the thick of audiobooks because of that reason. I'm in the thick of audiobooks because I'm an audio-only reader. I'm in the thick of audiobooks because of what I've seen from a trend perspective and what I observe in younger demographics. And so, audio first? I mean, look, print is king, right? Those paperbacks, they're king. And yet audio just slid past ebooks earlier this year, as best all of our stats show on sales, right? And so are we going to become audio first? It might be a while before it surpasses print, if it ever surpasses print, because there's that experience, right? Sitting and reading and all. But with the way our lives are, right.
Phil Marshall [00:41:12]:
Always on the go, right?
Jim Azevedo [00:41:13]:
Right.
Phil Marshall [00:41:13]:
We're in the Uber. You know, we're in the car, we're listening, we're on our commute, we're doing this and that. The younger demographics are the indicator. And audio can no longer be an afterthought. It's always been an afterthought. It's always been a second-class citizen because of its cost and because the drivers haven't been there of it being so in demand. Now it can no longer be an afterthought.
Jim Azevedo [00:41:42]:
Yeah, it's funny, I remember the early days of Smashwords when, like, things really started taking off and some of the pundits out there were like, oh, ebooks are going to overtake print. It's going to be an ebook-centric industry. No, they went to a point and then they just leveled out, and it's been, like, flat. Yeah, it is flat, but it's still king. Yeah, there's.
Phil Marshall [00:42:06]:
There's a big difference. And so the big difference is that ebook, while accessible and, and straightforward from just a material standpoint. You got one little device here. Right. That's still sitting and reading.
Jim Azevedo [00:42:19]:
Absolutely.
Phil Marshall [00:42:20]:
And experiencing that story in a particular way. Audio is reading, by the way. And you will not convince me otherwise. Right. Audiobooks are reading. It's not listening, it's reading. And that is going to continue to rise so dramatically because it's reading in a different way. It's reading at a different time. It's reading in a different circumstance that is more and more in line with how we live our lives.
Phil Marshall [00:42:47]:
And so I won't go as far as to say that audio will outstrip reading something, but it's certainly going to be an interesting time because it's growing so rapidly. And I do think it fits a different need.
Jim Azevedo [00:43:05]:
I'm one of those nerds who will get the audiobook and the print book. I read a lot of nonfiction, and so it works almost like, gosh, like a one-two punch in a way. Like, I could be out walking the dog and then I'll finish that, and if I have time, I'll pick up the book. Or if I'm listening to the audiobook, I'll think to myself, didn't I read that section? I could picture myself reading the book earlier. It's weird. And it just really, really helps it sink in. Maybe it's just my brain, because I need to read things over and over and over again in order to absorb them.
Phil Marshall [00:43:39]:
Yeah. From our experience, the number of people who like the, you know, the bimodal story delivery is pretty small. That's a pretty small percentage. So you're in pretty rarefied air there. But, you know, it's usually one or the other. It's usually either you're audio, which generally trends towards the younger demographics, you know, like me. Right.
Phil Marshall [00:44:00]:
I mean, you know, I'm so young, I was ahead of the curve. Or you're print, or you're reading, right? Whether it be ebook or print. So, yeah, I think generally the audiences are evenly split. I will also say, when it comes to fiction readers, because we've done a lot of work around this, those who desire multi-voice works are split with those who prefer single-voice works.
Phil Marshall [00:44:23]:
Of course they have to be really compelling. You have to be lost in story. They can't be distracting. Right. And that's the problem sometimes with multi-voice or with sound effects, which we don't do: it can be distracting. And so the goal is narrative cohesion, telling that story in a way that allows that reader to be lost in story. And that's what it's all about. And audio is racing ahead.
Jim Azevedo [00:44:48]:
Okay, I know you got to get going pretty soon. I want to bring up maybe one or two more questions. Just real quick here, Phil, if you have time.
Phil Marshall [00:44:54]:
Yeah.
Jim Azevedo [00:44:55]:
From Antisocial Butterfly, she asks, do we have the ability to let others use our voice for their books?
Phil Marshall [00:45:02]:
So, interesting. We've actually been asked that by a number of folks. If your voice is on ElevenLabs, we can use it and it can be used on Spoken for whoever you want. You'll just be in the library. Okay. As far as allowing a particular user to use your voice, I think that's a great feature enhancement. We've actually been asked that a number of times. And so I think that's going to come at some point.
Phil Marshall [00:45:24]:
Or you can just put your voice on ElevenLabs and we can put it in the library and have it available for everybody if you want.
Jim Azevedo [00:45:30]:
Very cool. Okay, final question, because I know you have to jump and go to another meeting here pretty soon. What excites you the most about the future of spoken word technology for authors and publishers?
Phil Marshall [00:45:41]:
I think what excites me the most is how this breathes life into indie authors' works, and their backlist in particular, those things which wouldn't have otherwise been able to meet this audience, this growing audience. And so being able to finally allow them to meet their audience with this modality of story delivery, that's what excites me the most. And when you kind of build on that, the sort of multimodal aspect of the visual eventually as well, that excites me. It's all very exciting. And for me personally, with my own work coming out in the holiday season in audio, I keep saying it's going to be Dungeon Crawler Carl good. And that's from an audio quality perspective. I'll let you judge the story. I think the story is good too.
Jim Azevedo [00:46:27]:
But.
Phil Marshall [00:46:30]:
What it does for me personally is tremendous. And so I'm excited to help all authors be able to experience that and being able to deliver their story that they've always wanted to deliver in a way that doesn't break the bank.
Jim Azevedo [00:46:44]:
Right. That's very cool. And again, everybody, we're talking here about Spoken, and that's where you can find more information about the company. Just real quickly, Phil, do accents, do foreign accents create any kind of a barrier, any kind of a special challenge?
Phil Marshall [00:47:02]:
That is the opportunity. That is not a barrier or a challenge. All of the ElevenLabs voices speak 32 languages. Hume's voices speak 10. You're going to be covered on your language.
Phil Marshall [00:47:15]:
So, over 100 speaking characters in my book. I have so many different... I just naturally write with different accents for whatever reason. I guess I make it hard on myself. But it's also what drove Spoken: that desire to have the dynamic nature of the sound of those voices be so distinct. In fact, what I'll...
Phil Marshall [00:47:36]:
I don't want to take up too much of your time here, but in going through my work, I changed voices because the mix wasn't what I wanted, right? If I had three female characters there, you know, one sort of younger, military, a little bit gravelly sounding, and the other is more like, you know, blue collar, like a trucker, and a grandmother. And if their voices are too similar in the same scene, I just go and I redesign one of them and I say, no, this is going to be more of a soothing grandmother with a Bostonian accent, right? And all of a sudden, boom. It's immediately distinctive. And so changing the mix of voices is great fun, first of all, to hear your characters come alive.
Phil Marshall [00:48:16]:
But then you're doing things like getting rid of dialogue tags because you know who's talking. So you just get rid of the dialogue tag. So audio becomes... Now, I talked earlier about doing first POV for half my chapters and third POV with an independent narrator for the other half. That, getting rid of dialogue tags, changing your voice mix to make sure that they're all mixed: these are the kinds of things that I now think about on a regular basis with my work. And our authors are now thinking about them for their work, because it's now possible.
Jim Azevedo [00:48:44]:
This is fascinating. I mean, this has been a fascinating conversation, even for those folks who are, like, anti-AI. I wanted to have this conversation with you just to introduce the topic and the technology so that everyone here in the indie community knows what's happening in the industry and can keep on top of it. So Phil, I'm so glad that you had the time to spend with us today.
Phil Marshall [00:49:05]:
It's a great pleasure.
Jim Azevedo [00:49:06]:
Thank you. It's been my pleasure. Going to run through just a few housekeeping items here, folks. If you enjoyed this episode, please share it with your friends. We really, really appreciate that because it helps us get the word out about Self Publishing Insiders and about companies like Spoken and about Draft2Digital as well. If you want to know what our topics are in future episodes or who our guests are going to be, bookmark D2DLive.com and you can keep an eye on that. Finally, anybody who is watching today or listening who may be new to the self-publishing world, you can sign up for a free Draft2Digital account simply by going to draft2digital.com. I want to thank our guest Phil once again for hanging out with me. Phil, if you've got like 30 seconds, hang out backstage.
Jim Azevedo [00:49:53]:
We'll grab a snack back there. We'll talk shop. For everybody else, I want to thank you sincerely for joining us week in and week out. And for our new viewers out there, welcome. We hope to see you again here soon. Take care, everybody. We'll see you next week. Thanks again, Phil.
Kevin Tumlinson [00:50:11]:
Ebooks are great, but there's just something about having your words in print. Something you can hold in your hands, put on a shelf, sign for a reader. That's why we created D2D Print, a print-on-demand service that was built for you. We have free, beautiful templates to give your book a pro look, and we can even convert your ebook cover into a full wraparound cover for print. So many options for you and your books, and you can get started right now at draft2digital.com. That's it for this week's Self Publishing Insiders with Draft2Digital. Be sure to subscribe to us wherever you listen to podcasts and share the show with your will-be author friends. And start, build, and grow your own self-publishing career right now at draft2digital.com.