This is not a niche format — it is a mainstream one in rapid ascent.
On 19 February 2026, Madrid hosted the third annual Parix Audio Day – a conference convened by the Germán Sánchez Ruipérez Foundation in collaboration with the Spanish Ministry of Culture, and organised by Luis González and Javier Celaya.
The event drew professionals from across the audiobook, audio content, technology, and publishing sectors. Erin Cox of Publishing Perspectives filed a must-read report. What follows uses that reportage as a springboard for a more extended examination of the issues it raises – issues that touch on data, strategy, technology, labour, market development, and the evidently complicated question of how publishers should feel about AI.
The short answer to that last question is: less anxious and more strategic. Luddite Fringe, move along. There’s nothing for you here. Everyone else: Keep reading.
The audio boom is real, the AI tools are maturing faster than most in the industry recognise, and the window for proactive positioning is open now. Publishers who wait for certainty will find themselves following markets that others helped create.
Yeah, nothing new there, true. But AI is transformatively different from any other tech challenge this past century and a half. Not least in its acceleration. No tech transition has ever moved this fast. And the momentum is still building. We can choose to ride the tsunami of opportunity unfolding, or be left stranded by it.
The Market That Should Not Surprise Anyone
Let us begin with some numbers that ought to concentrate minds. In 2024, US audiobook sales revenue reached $2.22 billion – a 13% increase year-on-year and a return to double-digit growth after a brief moderation. Digital audio now accounts for 99% of that revenue.
The Audio Publishers Association’s 2025 Consumer Survey found that 51% of Americans aged 18 and over – an estimated 134 million people – have listened to an audiobook. Meanwhile, interest among non-listeners has risen markedly: 38% now say they are interested in trying audiobooks, up from 32% the previous year, with the proportion describing themselves as “very interested” nearly doubling from 10% to 18%.
The picture in the UK is similarly striking. Audiobook revenue reached £268 million ($361.6 million) in 2024, a record high representing a 31% increase from 2023. UK publisher Bloomsbury reported audio sales growing 57% in its 2025 fiscal year; Lagardère noted a 38% increase in digital audio across 2024; HarperCollins recorded a 13% year-on-year rise in audiobook sales in the fourth quarter of 2024. Spotify, since launching audiobooks within its premium subscription in 2023, has seen its English-language catalogue nearly triple to over 400,000 titles, with listening hours rising 35% year-on-year globally.
These are not niche figures. This is a format in structural ascent. And yet, across much of the publishing industry, the response has been cautious at best – a reactive watchfulness rather than the confident forward-leaning investment the data would appear to justify.
Some of that caution has a reasonable basis. Production costs are significant, catalogue is finite, and the relationship between audio and other formats is not fully understood. But the Parix Audio Day offered some important reassurance on that last point – reassurance that, if borne out by harder data, changes the investment calculus considerably.
The Cannibalisation Question – and Why It May Be Settled
For years, one of the most persistent anxieties among publishers has been that audiobooks might cannibalise print sales – that a listener is a reader foregone. The Audible survey data presented at Parix by Barbara Knabe, the company’s Head of Content Acquisition for the EU, challenges this directly.
Among European audiobook consumers surveyed: 80% also read print books; 68% reported reading more print or digital books since beginning to listen to audiobooks; and around 60% said they subsequently purchased the physical edition of a title after hearing the audio version.
These figures are significant, but they come with an important caveat: they are survey data, not sales data. Surveys capture intention and self-report; they do not track actual purchase behaviour at scale. For a conservative publisher making a board-level investment decision, survey data showing that 60% of audiobook listeners claim to buy the print edition is provocative but not definitive.
What would make this definitive is longitudinal purchase data – tracking actual buying behaviour across individuals and formats over time. Some of that evidence does exist at the platform level, particularly within Amazon’s integrated ecosystem where Audible and print purchasing patterns can be observed together.
The fact that this data has not been widely published may say more about competitive sensitivity than about what it shows. We all know Amazon does not share data as a rule, but from its business actions and the survey data it has chosen to share, we can infer that audio is good for business.
Publishers seeking to make the investment case internally would do well to push the major platforms for this kind of hard evidence, rather than relying solely on consumer survey responses.
What we can say with greater confidence is that the macro-level numbers do not support cannibalisation. Print book sales in the US totalled 782 million units in 2024, growing 23% over the preceding decade – a period during which digital audiobooks rose from effectively zero to a $2.22 billion industry. If audiobooks were systematically substituting for print, we would expect to see corresponding decline. We do not.
Duncan Bruce of Spotify reinforced the point at Parix simply and directly: audiobook listeners are readers. Features like Page Match – Spotify’s tool that allows seamless switching between audio and print editions via Bookshop.org – are premised on exactly this insight, that listeners and readers are often the same people in different moments.
A TNPS deep dive into the convergence of formats here.
The Platform Landscape: Rules, Restrictions, and Strategic Implications
One of the more practically important discussions at Parix concerned the distribution landscape for AI-narrated audio. George Walkley, an AI and audio publishing expert, flagged a significant constraint for publishers considering AI narration: platform policies vary considerably, and the dominant player – Audible – has historically applied the most restrictive approach.
The picture as of early 2026 is this: Audible operates a two-track system. Its consumer-facing platform has introduced “Virtual Voice” – an AI narration feature available on a select and growing number of titles, labelled as such for listeners. However, Audible’s publishing arm, ACX (Audiobook Creation Exchange), which is the route for most self-published and independently produced titles, still prohibits AI-generated narration at the point of submission.
The exception is a programme Audible announced in May 2025, inviting selected publishers to work with Audible’s own proprietary AI narration tools, with the AI translation service extended to Spanish, French, Italian and German in beta later that year.
The strategic implication is clear: publishers wishing to place AI-narrated titles on Audible – which controls roughly 63% of the US market and a similarly dominant share of the UK market – must use Audible’s own AI tools (which requires being a selected partner).
This is a significant market access constraint that the industry has not fully grappled with, and that goes beyond market share and AI-narration acceptance or otherwise. Bear with me.
The landscape outside Amazon is considerably more open. Google Play Books officially supports AI-narrated audiobooks. Kobo Writing Life accepts AI narration without restriction. Spotify (via its former Findaway Voices/INaudio distribution arm) and Apple Books accept AI-narrated content, typically requiring disclosure. ElevenLabs, which provides the technology platform used by publishers including AkooBooks, imposes no distribution restrictions on the narration it produces. Storytel, a major subscription platform particularly strong in Scandinavia and India, has taken a broadly open approach to AI narration with disclosure requirements.
Platform Summary (as of early 2026)
Audible/ACX: Restricts AI narration except via its own proprietary tool for selected partners; Virtual Voice available to consumers on select titles.
Google Play Books: Supports AI narration officially.
Kobo Writing Life: Accepts AI narration, no restrictions.
Spotify/Findaway: Accepts AI narration with written disclosure.
Apple Books: Accepts AI narration, disclosure required.
ElevenLabs direct distribution: No restrictions on AI narration.
The practical effect of this landscape is that publishers opting for AI narration must make a distribution decision: pursue the open platforms and sacrifice Audible’s dominant market position, or invest in human narration for Audible while using AI for wider distribution. A hybrid model – human narration for the Audible-exclusive premium market, AI narration for global distribution and backlist – may well emerge as the industry norm.
Allow me to explore this further, because there is more to consider here.
Here’s the thing: The Audible lock creates a specific kind of investment disincentive that isn’t just about market size. It’s about promotional gravity. Audible’s dominance means it’s where the reviews accumulate, where the algorithmic recommendation engines are most developed, where listener habits are most entrenched, and where the credibility signal of being available carries the most weight.
The fragmentation tax
A publisher who produces AI-narrated audio and distributes it across Spotify, Google Play, Kobo, Apple and the independents has real reach in aggregate – but they’re doing five times the promotional work for, arguably, less than half the impact. The fragmentation tax on marketing effort is severe.
One industry observer has framed this rather amusingly thus: human narration is the hardcover limited edition; AI narration is the trade paperback. Both serve the market. Both have a place. The question is whether publishers are thinking clearly about which titles belong in which tier.
Personally I don’t rate the framing. A hardcover is a binding; the content of the hardcover and the trade paperback is the same. Premium audio (Audible’s new Harry Potter, for example) is not the same as even the best AI (not yet, anyway). But as we’ll see below, the lines between human and AI production quality are blurring.
The Lean-In / Lean-Back Distinction – and Its Limits
George Walkley’s framing of AI narration suitability – “lean-in” (news, information, reference content) versus “lean-back” (entertainment, immersive fiction) – has a certain intuitive logic. AI voices, the argument goes, are perfectly adequate for content where the listener is actively processing information; they are more exposed when the listener’s attention is given over to the sensory experience of storytelling, character voice, and emotional performance.
This is a reasonable starting point, but it is not where the conversation should end. The lean-in/lean-back distinction rests on an assumption about AI’s current capabilities, not about its ceiling. And the key variable determining where that ceiling lies is not the technology per se but the quality of the data on which it has been trained.
AI narration systems, like AI text systems, have to date been trained overwhelmingly on content produced by news, information, and data publishers – precisely the “lean-in” content they are said to handle well. This is not coincidental. News and data publishers engaged with AI developers early and substantively, providing curated, quality-controlled audio training data. Trade publishers – fiction, narrative non-fiction, memoir, literary writing – were slower to engage. And much (not all, but much) of the trade content that has made its way into AI training sets has done so via pirated datasets, which carry no quality control and no editorial curation.
AI narration handles ‘lean-in’ content well because that is what it has been trained on. The quality of AI voice for trade fiction will rise in proportion to the quality and quantity of curated trade audio data it can access.
The conclusion that follows from this is important, and it runs counter to the conventional narrative: the relative weakness of AI narration in trade fiction is not an inherent limitation of the technology. It is a training data problem – specifically, a problem of insufficient access to high-quality, curated trade content.
The Luddite Fringe will cheer gleefully at this, but for those wanting to be still relevant this time next decade, consider that this makes trade publishers’ own audio catalogues and future audio productions extraordinarily valuable as training assets, not just as consumer products. Publishers who understand this dynamic are not simply making audiobooks – they are producing data that will shape the next generation of AI narration quality.
Research published in late 2025 found that readers actually prefer outputs from AI systems trained on copyrighted books over those from expert human writers, when evaluated blind. The implication is uncomfortable for some but commercially significant: the quality gap between AI and human narration in trade genres will close as trade publishers supply the curated data needed to close it. The question is whether they do this proactively, on their own terms and with appropriate compensation, or reactively, after the fact.
Africa, AkooBooks, and the Argument from Access
Among the most compelling presentations at Parix was that of Ama Dadson, founder and CEO of AkooBooks Audio and FBM Audio Ambassador, whose company is building an audiobook market across Africa using AI narration – in large part via ElevenLabs. The case Dadson made is not simply that AI is useful in Africa; it is that AI is market-creating in contexts where no market previously existed.
This distinction matters enormously. Much of the debate about AI in publishing takes place within the frame of established markets – the US, the UK, Western Europe – where the relevant question is whether AI might replace or diminish human production activity. In emerging markets, the question is different: can audio exist at all without AI? And in much of sub-Saharan Africa, the honest answer is that human-narrated audio production at meaningful scale is simply not viable. The infrastructure, the narrator ecosystem, and the economics do not support it.
Dadson acknowledged the limitations AI currently faces in this context: poor emotional intelligence in narration, and particularly poor performance with non-English languages and accents. Both of these limitations trace back, again, to training data. Africa accounts for over 2,000 languages and linguistic varieties. AI systems have minimal curated audio training data in most of them. The result is that AI narration in, say, Twi, Hausa, Igbo or Yoruba – let alone the hundreds of smaller languages across the continent – is not yet at commercial standard.
But this is a solvable problem, and the mechanism for solving it is precisely the one that improved AI narration for English-language content: deliberate, curated data engagement.
Publishers and producers working in African languages who wish to improve AI performance in their languages have an incentive to engage AI developers with quality audio content. The challenge is that the commercial incentive for AI developers to prioritise African language training is currently limited – which is one of the stronger arguments for development organisations, governments, and foundations (like the Germán Sánchez Ruipérez Foundation itself) to treat multilingual AI audio training as a cultural and educational infrastructure investment.
Creating Demand, Not Following It
There is a wider point here about publishing’s relationship to market development that warrants attention. No survey ever identified a latent Swedish desire for digital audiobooks before the Storytel model unlocked that desire. No focus group predicted the Norwegian market’s extraordinary per-capita audio consumption. Ghana did not wait for listener demand surveys before AkooBooks began building. The market was made, not discovered.
This is not unique to audio. No one ran a survey to see if television was wanted. No one demanded the iPad. The publishing industry has a consistent tendency to follow consumer behaviour rather than shape it – a tendency that reflects a conservative commercial culture, not bad in itself, but one which, in conditions of rapid technological change, becomes a competitive liability. And change does not get much more rapid than now.
As an industry, we need to understand the following point above all, as we bask in rising audiobook demand and profits: the digital audiobook market in every country where it has flourished was developed by supply-side and platform investment that preceded and created the demand it now serves.
AI makes this argument more accessible than ever. Producing an audiobook once required a studio, a narrator, and a production budget running into thousands of dollars – conditions that made publishers highly selective about which titles warranted audio investment. AI changes the viability threshold.
A backlist of 500 titles that was previously an audiobook project requiring a decade of investment is now, potentially, a project with a shorter timeline and a fraction of the cost. The catalogue can be made available; demand can be created from what is already on the shelf.
AI in Production: Time Saved Is Not Simply Money Saved
Erin Cox reports that the panel on AI innovation in licensing, marketing, production and distribution – featuring John Ruhrmann of Bookwire GmbH, Ralf Biesemeier of Zebralution, Robert Holmström of Earselect, and Rickard Lundberg of Aniara – explored one of the more practically urgent questions in publishing: what does AI actually do to workflows, and what does that mean for people?
The consensus, and it is one that deserves amplification, is that time saved on routine tasks does not translate automatically into headcount reduction. Rather, it creates capacity for higher-value activity: quality checking of AI outputs, relationship building with local partners, deeper engagement with customer feedback, strategic thinking that was previously crowded out by administrative repetition. The panellists were candid that this reallocation does not happen automatically – it has to be managed.
This is a point that the publishing industry – and indeed many industries – has evidently not internalised. Multiple organisations have introduced AI tools and then expressed disappointment that the return on investment is not materialising.
In the majority of cases, the reason is not that the AI tool is inadequate; it is that the human capacity freed up by the tool has not been redirected. Staff who previously spent half their time on tasks now automated by AI are not automatically spending that newly freed time on the activities that generate the most value. They need guidance, training, and – critically – permission to work differently.
GIGO. Generic In, Generic Out
At risk of unleashing the inner maths teacher, AI quality output is a function of prompt quality input. Publishers who invest in training their people to work well with AI will see returns. Those who deploy AI and then wonder why returns are disappointing have not finished the job.
GIGO. No, not Garbage In, Garbage Out. Showing your age there! GIGO in the era of AI. Generic In. Generic Out.
There is a specific skill at stake here: prompt engineering. AI tools produce outputs that are only as good as the instructions they receive. An audio production team that understands how to direct an AI narration system – specifying tone, pacing, character differentiation, emotional emphasis – will produce substantially better results than one that treats the tool as a black box. This is not a trivial technical skill; it is the new editorial craft of the AI audio age. Publishers who invest in developing it will differentiate their AI-produced audio from competitors’ in ways that the technology alone cannot explain.
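What “directing an AI narration system” means in practice is often encoding those instructions as markup. A minimal sketch in Python, assuming a TTS engine that accepts W3C SSML (the `prosody`, `emphasis` and `break` tags are from the SSML standard; which of them a given engine honours is engine-specific, and the helper function here is my own illustration, not any vendor’s API):

```python
# Sketch: encoding narration direction – tone, pacing, emphasis, pauses –
# as W3C SSML, the markup many TTS engines accept in some form.
# The helper below is illustrative; only the SSML tags themselves
# come from the W3C SSML 1.1 specification.

def direct_line(text, rate="medium", pitch="medium",
                emphasis=None, pause_ms=0):
    """Wrap one line of narration in SSML prosody directives."""
    if emphasis:
        text = f'<emphasis level="{emphasis}">{text}</emphasis>'
    ssml = f'<prosody rate="{rate}" pitch="{pitch}">{text}</prosody>'
    if pause_ms:
        ssml += f'<break time="{pause_ms}ms"/>'
    return ssml

# A directed passage: slow, low delivery for tension, then a beat of silence.
script = "<speak>" + direct_line(
    "The door was already open.",
    rate="slow", pitch="low", emphasis="moderate", pause_ms=600,
) + "</speak>"
print(script)
```

The point of the sketch is the editorial decision embedded in each parameter: someone has to decide that this line is slow, low, and followed by a beat of silence. That judgement is the craft; the markup is merely how it reaches the machine.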
On Biesemeier’s Disappointment
Ralf Biesemeier’s reported shift in tone between the Frankfurt Book Fair in autumn 2025 and Parix in February 2026 – from hopeful to increasingly disappointed – is worth examining carefully (bearing in mind Erin did not tell us a lot on this point), because it is representative of a pattern appearing across sectors.
AI tools have been marketed, sometimes irresponsibly, with capabilities that outrun their current performance. This creates a cycle: initial enthusiasm leads to deployment, deployment reveals limitations, and the resulting disappointment is attributed to the technology rather than to the mismatch between expectation and reality.
The AI was never going to solve everything; it was always a tool, not a solution. The organisations that avoided this cycle are those that deployed AI with calibrated expectations, specific use cases, and genuine investment in the human skills required to get the best from the tools.
Biesemeier’s suggestion that publishers should still familiarise themselves with the technology, even if it does not solve everything, is the right conclusion to draw. Caution and scepticism are appropriate; withdrawal is not.
The technology is genuinely improving at a pace that makes today’s limitations a poor guide to tomorrow’s capabilities. The publishers who will be best placed to exploit that improvement are those who have been learning, experimenting, and building AI literacy across their organisations throughout this period – not those who waited for the tools to become perfect before engaging.
Discovery, AI Recommendations, and the 80% Question
Virginia Huerta of Archetype presented what may have been (and certainly was for me) the most commercially startling statistic of the day: 80% of consumers use AI to inform what they purchase – books, products, and more. She urged publishers to focus on how discoverable their titles are to the AI systems increasingly mediating consumer choice.
There is an important ambiguity to unravel here. When Huerta says 80% of consumers “ask AI” for purchase recommendations, she does not necessarily mean that 80% deliberately open a chatbot and type “what should I read next?” She may mean – and this is arguably the more significant interpretation – that 80% of consumers interact with AI-mediated recommendation systems in the course of their purchase behaviour, often perhaps without knowing they are doing so.
Amazon’s recommendations engine, Spotify’s discovery algorithms, Google’s search-and-shop integration, and the personalisation systems behind every major retail platform are all, in various degrees, AI. If consumers are experiencing these as simply “the internet,” the 80% figure is a description of how mediated all commerce already is, not a measure of conscious AI adoption.
The practical implication for publishers is the same in either case: discoverability increasingly means being findable and well-represented in the training data and retrieval systems that these recommendation mechanisms use. A title that has no podcast presence, no audio sample, no structured metadata, no AI-readable summary, and no engagement on platforms that AI systems crawl is, in an important sense, invisible to a growing proportion of the purchase-decision process.
Huerta’s recommendation that publishers create podcasts to improve discoverability is a specific and actionable version of a more general principle: create audio-native, crawlable, AI-readable content that places your titles in the information environment where recommendations are generated. An author talking about their book in a podcast creates a different kind of digital presence from a press release; an audio sample from the audiobook itself is different again. These are not marketing add-ons. They are increasingly infrastructure.
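One concrete form that infrastructure takes is structured metadata. A minimal sketch, using the real schema.org `Audiobook` vocabulary (`readBy`, `inLanguage`, `duration`) but with an invented title and narrator, of the JSON-LD a title’s product page might embed so that crawling systems can parse it:

```python
import json

# Sketch: making an audiobook "AI-readable" by embedding schema.org
# Audiobook metadata as JSON-LD on the title's product page.
# The title, author and narrator below are invented for illustration;
# the vocabulary itself (@type "Audiobook", readBy, inLanguage,
# duration) is from schema.org.

def audiobook_jsonld(title, author, narrator, language, duration_iso8601):
    return {
        "@context": "https://schema.org",
        "@type": "Audiobook",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "readBy": {"@type": "Person", "name": narrator},
        "inLanguage": language,
        "duration": duration_iso8601,  # ISO 8601, e.g. PT9H30M
    }

record = audiobook_jsonld(
    "An Example Title", "A. Author", "A Synthetic Voice",
    "en-GB", "PT9H30M",
)
# This string would sit inside a <script type="application/ld+json"> tag.
print(json.dumps(record, indent=2))
```

A page carrying metadata like this is legible to recommendation and retrieval systems in a way that a bare cover image and blurb are not – which is precisely the discoverability gap described above.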
97% Cannot Tell the Difference – and What That Actually Means
Julio Rojas, screenwriter and audio fiction writer, offered the day’s most striking statistic from a cultural perspective: 97% of audiences, he reported, cannot distinguish AI-generated content from human-generated content.
This figure has an echo in a separate Deezer-commissioned survey from late 2025 which found that 97% of respondents failed to correctly identify AI-generated music – and that 52% of those respondents were made uncomfortable by this finding, while 71% said they were shocked.
Two things are worth noting here. First, the 97% figure is consistent across multiple independent studies in audio domains – music, narration, advertising voice-over – which lends it greater credibility than a single data point would warrant.
Second, the discomfort and shock that accompany the finding are informative: people believe, or want to believe, that they can tell the difference. The finding that they cannot undermines a commonly held conviction about the detectability of AI, and that is uncomfortable even for people who hold no strong prior view against AI.
For the publishing industry, the significance of this is multifaceted. It means that the “AI-slop” objection – the widely held assumption that AI-generated audio will be obviously inferior and will self-evidently undermine the perceived quality of a title – is empirically unsupported. Audiences are not detecting AI narration at significant scale. The premise of the objection is flawed.
It also means that quality differentiation in AI narration is real but not guaranteed: there is a range from poor to excellent, and the gap is not always legible to the listener in real time. This places the editorial burden squarely on the publisher and producer. The quality standard must be maintained in production, not left to listener detection.
The Transparency Dilemma – and Why ‘Label First’ Is Not Without Risk
The conference’s call for transparency – publishers should disclose when audio is AI-narrated – is a reasonable default position, and there are both ethical and regulatory reasons why it may become mandatory in many markets. Consumers have expressed strong preferences for labelling: the Deezer survey found 80% of respondents want AI-made music clearly labelled; similar findings have emerged in advertising research.
But the transparency imperative sits in an uncomfortable relationship with what research consistently shows about how disclosure affects perception. A peer-reviewed study published in 2023 in the Journal of Experimental Psychology found that listeners rated the same piece of music significantly lower when told it was AI-composed than when told it was human-composed – even when the audio was identical. A 2025 study using biometric as well as self-report measures found that unfavourable responses to AI-generated music involve automatic, embodied threat reactions, not merely cognitive evaluation. The bias, in other words, is not simply intellectual; it is felt.
The Coca-Cola example from advertising is instructive here. The company’s 2024 AI recreation of its classic 1995 “Holidays Are Coming” advertisement was, when audiences did not know AI was involved, rated similarly to the original. Disclosure shifted those ratings. Crucially, only 1 in 4 respondents even noticed the disclosure, but among those who did, perception changed. The label does not just inform; it reframes the experience in advance, activating a set of prior beliefs about AI creativity that the work itself would not have triggered.
When listeners do not know they are hearing AI-narrated content, 97% cannot detect it. When they know in advance, ratings fall. The label is not neutral. It activates a bias that the product itself does not warrant.
This creates a genuine dilemma for publishers. They have a responsibility to be transparent; they also have an interest in the work being evaluated on its merits. These are not entirely reconcilable – in the current cultural moment. One path forward – explored by some platforms but not yet standardised – is post-experience disclosure: allowing the listener to hear the content first, then providing a clear label at the point of completion or in accompanying metadata. This respects the consumer’s right to know while not activating pre-emptive bias that the content itself does not deserve to generate.
This is not a counsel for deception. It is a recognition that disclosure framing matters – that “AI Narrated” in large text at the top of a product page is functionally different from a clear note in the credits, even if the informational content is identical. The industry needs a more nuanced conversation about how to implement transparency in ways that serve both consumer rights and the long-term development of a market that AI narration can help grow.
Narrators, Changing Roles, and the Labour Question
Any honest engagement with AI in audio publishing must address the question of human narrators. It is the question that receives the most emotive framing in public discussion, and it deserves the most careful treatment.
The panellists at Parix were consistent on one point: human oversight remains essential. This is not, as it is sometimes presented, a consolation prize – the claim that narrators will always be needed to check AI’s work. It is a substantive statement about the current division of labour that AI-assisted audio production requires.
Scripts must be assessed for AI-problematic passages (complex phonetics, tonal ambiguity, non-standard punctuation). Output must be reviewed for unnatural phrasing, incorrect stress, and emotional miscalibration. Direction must be given. Quality must be maintained.
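That script-assessment step lends itself to partial automation. A hedged sketch – the heuristics below are my illustrative assumptions, not an established industry checklist – of a pre-flight pass that flags passages for human review before they reach the narration engine:

```python
import re

# Sketch: an automated pre-flight pass over a script, flagging passages
# a human should review before AI narration. The heuristics are
# illustrative assumptions: non-standard punctuation runs, all-caps
# words (tonally ambiguous for TTS), and very long sentences that
# strain synthetic pacing.

def preflight(script, max_sentence_words=40):
    flags = []
    sentences = re.split(r"(?<=[.!?])\s+", script)
    for i, sentence in enumerate(sentences):
        if re.search(r"[–—]{2,}|\.{4,}|[~^|]", sentence):
            flags.append((i, "non-standard punctuation", sentence))
        if re.search(r"\b[A-Z]{3,}\b", sentence):
            flags.append((i, "all-caps emphasis: check intended tone", sentence))
        if len(sentence.split()) > max_sentence_words:
            flags.append((i, "long sentence: check pacing", sentence))
    return flags

issues = preflight("He said STOP. Then there was a strange~noise.")
for idx, reason, text in issues:
    print(f"sentence {idx}: {reason}")
```

The output is a review queue, not a verdict: every flag still lands on a human desk, which is exactly the division of labour the panellists described.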
At the same time, intellectual honesty requires acknowledging that the role of a narrator in an AI-assisted production is different from the role of a narrator in a traditional production – and that the number of narrator hours required per title, in an AI-first model, is substantially lower. This will affect employment in the narration sector. Some of the more optimistic claims that “AI won’t take narrator jobs, just change them” understate this impact.
What is more defensible is a different framing: AI narration will shrink the premium narrator market while dramatically expanding the total audio market. The net effect on narrator employment is genuinely uncertain, and will depend significantly on how fast total audio production grows. If AI enables the publication of 10 times as many audiobooks – and there is no reason in principle why it should not – the absolute number of hours requiring human narration or oversight may not fall at all. But the distribution of that work, and the economics of it, will change.
One last thought on this, that I’ve invoked many times and make no apology for doing so again: These narration jobs only exist because digital audio replaced analogue audio. They only exist because of streaming giants like Amazon, Apple and Spotify – and Next Generation Giants like Storytel and BookBeat – that actually created this market where no such market existed. No-one promised anyone a job for life.
No-one reading this on screen today is losing sleep over the typewriter manufacturers and offset lithograph printers, chimney sweeps, pony-express postmen and ships’ sailmakers who had to retrain for new jobs or upgrade their skill-sets to keep their existing jobs when their old careers were overtaken by technology and change.
One development worth watching in this respect is the voice licensing market. Publishers including Audible and several independents have begun exploring agreements with human narrators to license their voice characteristics for AI training and synthesis – enabling AI systems to produce narration that carries recognisable vocal qualities associated with a particular performer.
Done well, with fair remuneration, this could represent a new revenue stream for narrators rather than a displacement mechanism. Done poorly – rushed, undercompensated, and without appropriate contractual protections – it becomes extraction. The industry’s handling of this emerging market will say a great deal about its values.
Licensing to LLMs: The Control Paradox
John Ruhrmann’s section of the Parix panel addressed the increasingly urgent question of how publishers should approach licensing their content to large language model developers. The context is a wave of litigation in multiple jurisdictions: authors and publishers suing AI companies for training on copyrighted material without licence or compensation.
An audience question raised a specific concern: how can an author license their work to an LLM developer while preventing it from being used to generate hate speech or other harmful content? Ruhrmann’s answer was both practically accurate and philosophically interesting.
Contractual language can address permitted and prohibited uses; but once content is in a training dataset, the control that any contract provides is limited. The model will generate outputs reflecting patterns learned across an enormous corpus, not any individual title. The analogy to free speech was apt: even without AI, authors cannot fully control what readers do with their work, how it is interpreted, or what uses it is put to downstream.
This does not mean contracts are worthless in this context – they can address egregious cases and provide grounds for legal remedy. But it does mean that publishers approaching LLM licensing need to be realistic about what contractual language can achieve. The more important protection is reputational and relational: working with AI developers who are demonstrably committed to responsible use, who have governance structures in place, and whose commercial incentives align with treating publisher content as a quality input rather than raw material.
Publishers who engage proactively with AI developers have leverage.
There is a strategic dimension here that the industry really needs to think about:
Publishers who engage proactively with AI developers – on fair terms, with proper compensation, and with the kind of curated, quality-controlled content that AI systems need to improve – have leverage. They are providing something genuinely valuable.
Publishers who wait to be approached, or who litigate without offering any path to legitimate licensing, may win cases while losing the longer-term negotiation. The AI developers will train on something. The question is whether that something includes quality trade content on publisher-favourable terms, or data scraped from uncontrolled sources and copyrighted material legitimately purchased outright – which in the US is currently legal under two court rulings handed down last year.
Legal, but cumbersome: look at how one AI company bought gazillions of printed books and tore the covers off to scan the pages. Legal, but not efficient.
A generation of new start-ups are now positioning themselves as middlemen in this process, able to deliver curated publisher catalogues to AI companies in the format they need, at a price. A price most AI companies will willingly pay, given the chance.
The Road Ahead: A Call for Strategic Engagement
The Parix Audio Day that Erin Cox’s report revealed (I’m so loving the new-look Publishing Perspectives!) offered something relatively rare in professional publishing conferences: a genuinely calibrated view of AI – neither the breathless enthusiasm that has characterised some early tech-optimist discourse, nor the entrenched scepticism that has led some in the industry to treat AI as a threat to be managed rather than a capability to be developed.
The message from Madrid was essentially: this is complex, it is real, it is growing, and it requires active engagement rather than watchful waiting.
What does active engagement look like in practice?
- It looks like publishers investing in AI audio literacy across their teams, not just their technology departments.
- It looks like genuine experimentation with AI narration on appropriate titles, with honest assessment of quality and market response.
- It looks like proactive engagement with AI developers on training data licensing – not just as a defensive measure against having content used without permission, but as an affirmative commercial strategy.
- It looks like treating audio not as a rights exploitation exercise but as a market development opportunity, and using AI’s cost reduction to publish audio far more widely than current economics permit.
It also looks like honesty about what AI cannot yet do well – and a clear-eyed understanding of why. The gaps in AI’s performance in non-English languages, in emotionally complex fiction, in long-form immersive narrative, are not mysterious. They reflect the contours of the training data available to AI developers. Publishers who understand this can contribute to closing those gaps in ways that benefit them commercially and culturally.
The audiobook market is growing at double-digit rates. The audience for audio is expanding. The barriers to audio production are falling. The discovery channels through which readers find new titles are increasingly AI-mediated.
The publishers who will shape what this market looks like in five years are not the ones waiting for the landscape to clarify. They are the ones, like the speakers gathered in Madrid, already experimenting, already failing informatively, and already building the understanding that strategic advantage in this market will require.
This post first appeared in the TNPS LinkedIn newsletter.