February 28, 2026

How I Built a Fully Autonomous AI Podcast

Tags: AI · Automation · Python

The Concept

Luke at the Roost is a call-in radio show where every caller is AI-generated. There’s no host reading scripts. No manual editing. No human in the loop at all. The entire pipeline — from script generation to final published episode — runs autonomously.

The idea started as an experiment: could I build a system that produces a genuinely entertaining podcast without any manual intervention? Twenty-eight episodes later, the answer is yes.

The LLM Pipeline

Each episode starts with a batch of caller prompts. Every caller gets a unique personality profile — age, background, speaking style, emotional state, and a topic they’re calling about. Some callers are regulars. Some are first-timers. Some are unhinged.
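A caller profile can be sketched as a small dataclass. The field names and the batch-sampling helper below are illustrative, not the actual schema:

```python
from dataclasses import dataclass
import random

# Hypothetical caller profile; field names are illustrative.
@dataclass
class CallerProfile:
    name: str
    age: int
    background: str
    speaking_style: str
    emotional_state: str
    topic: str
    regular: bool = False  # regulars recur across episodes

def sample_batch(pool: list[CallerProfile], n: int) -> list[CallerProfile]:
    """Pick n callers for an episode, mixing regulars and first-timers."""
    regulars = [c for c in pool if c.regular]
    newcomers = [c for c in pool if not c.regular]
    picks = random.sample(regulars, min(1, len(regulars)))
    picks += random.sample(newcomers, min(n - len(picks), len(newcomers)))
    return picks
```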

The system routes requests through multiple LLM providers via OpenRouter, balancing cost, latency, and model capability. Complex character interactions use more capable models; simple segues use faster, cheaper ones. Each caller’s dialogue is generated independently, then the host’s responses are generated in context, creating natural back-and-forth that sounds like a real conversation.
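The routing decision reduces to a lookup from task type to model tier. A minimal sketch, with OpenRouter-style model IDs chosen as examples rather than the actual models used:

```python
# Route each generation task to a model tier by complexity.
# Model IDs are OpenRouter-style examples, not the author's actual choices.
def pick_model(task: str) -> str:
    tiers = {
        "character_dialogue": "anthropic/claude-3.5-sonnet",   # complex interplay
        "host_response": "anthropic/claude-3.5-sonnet",        # needs full context
        "segue": "meta-llama/llama-3.1-8b-instruct",           # cheap and fast
    }
    return tiers.get(task, tiers["segue"])  # default to the cheap tier
```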

Prompt engineering is the hardest part. Getting LLMs to produce dialogue that sounds natural when spoken (not written) took dozens of iterations. Written English and spoken English are fundamentally different, and most LLM output sounds like a blog post, not a person talking.
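To make that concrete, here is an illustrative system prompt (not the actual prompt used) that nudges a model toward spoken rather than written English:

```python
# Illustrative system prompt, not the author's actual prompt.
SPOKEN_STYLE_PROMPT = """\
You are writing dialogue to be READ ALOUD, not read on a page.
- Use contractions, false starts, and filler words ("uh", "I mean").
- Keep sentences short; people don't speak in subordinate clauses.
- No lists, no headings, no "In conclusion".
- Stay in character: {personality}."""

def build_prompt(personality: str) -> str:
    """Fill the caller's personality into the spoken-style template."""
    return SPOKEN_STYLE_PROMPT.format(personality=personality)
```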

Voice Synthesis

Every character gets a distinct ElevenLabs voice. The host has a consistent voice across all episodes. Callers rotate through a pool of voice profiles matched to their personality — age, gender, accent, energy level. The system maps character attributes to voice parameters automatically.
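The attribute-to-voice mapping can be sketched as a filtered lookup over the voice pool. Voice IDs and attributes below are placeholders, not real ElevenLabs voice IDs:

```python
# Placeholder voice pool; IDs and ranges are illustrative, not real
# ElevenLabs voice IDs.
VOICE_POOL = [
    {"id": "voice_a", "gender": "f", "age_range": (18, 35), "energy": "high"},
    {"id": "voice_b", "gender": "m", "age_range": (30, 60), "energy": "low"},
    {"id": "voice_c", "gender": "f", "age_range": (45, 75), "energy": "low"},
]

def match_voice(gender: str, age: int, energy: str) -> str:
    """Return the first voice whose attributes fit the character."""
    for v in VOICE_POOL:
        lo, hi = v["age_range"]
        if v["gender"] == gender and lo <= age <= hi and v["energy"] == energy:
            return v["id"]
    return VOICE_POOL[0]["id"]  # fall back rather than failing the episode
```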

The TTS pipeline handles pacing, emphasis, and emotional tone through SSML-like controls. A frustrated caller sounds frustrated. A nervous first-time caller sounds nervous. This is where the podcast stops feeling like a tech demo and starts feeling like actual content.
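One way to wire emotion into synthesis is a preset table mapping emotional state to TTS knobs. ElevenLabs exposes settings like stability and style; the specific values here are illustrative guesses tuned by ear, not taken from the actual pipeline:

```python
# Translate a caller's emotional state into TTS settings. The values are
# illustrative; real presets would be tuned by listening, not computed.
def tts_settings(emotion: str) -> dict:
    presets = {
        "frustrated": {"stability": 0.3, "style": 0.8},  # more vocal variation
        "nervous":    {"stability": 0.4, "style": 0.6},
        "calm":       {"stability": 0.8, "style": 0.2},  # steady delivery
    }
    return presets.get(emotion, {"stability": 0.6, "style": 0.4})
```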

Automated Audio Production

This is where it gets interesting. Raw TTS output sounds terrible as a podcast. It needs post-production — and that post-production is fully automated inside the REAPER DAW.

Custom Lua scripts handle the entire production chain: silence removal and trimming, loudness normalization to podcast standards (-16 LUFS), music bed insertion and volume ducking, ad and station ident placement at natural break points, crossfades between segments, and final export.
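Two of those steps are just arithmetic. The real versions run as Lua inside REAPER; the Python sketch below only shows the math behind loudness normalization and music-bed ducking:

```python
# Hedged sketch of two production steps; the real scripts are Lua in REAPER.
TARGET_LUFS = -16.0  # common podcast loudness target

def normalize_gain_db(measured_lufs: float) -> float:
    """Gain (dB) needed to bring the episode to the loudness target."""
    return TARGET_LUFS - measured_lufs

def duck(bed_level_db: float, speech_active: bool, duck_by_db: float = 12.0) -> float:
    """Drop the music bed while a voice is talking; restore it otherwise."""
    return bed_level_db - duck_by_db if speech_active else bed_level_db
```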

The scripts analyze audio content to find natural pause points for ad insertion rather than cutting at fixed intervals. The result is a produced episode that sounds like someone sat in a DAW for hours. Nobody did.
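The pause-finding idea reduces to scanning an amplitude envelope for stretches that are both quiet enough and long enough to be a natural break. Again, the actual implementation is Lua inside REAPER; this is a sketch of the logic with assumed thresholds:

```python
# Sketch of natural-pause detection over an amplitude envelope.
# Threshold and minimum length are assumed values, not the real ones.
def find_pauses(envelope, threshold=0.02, min_len=10):
    """Return (start, end) index pairs of silent stretches long enough
    to host an ad or station ident."""
    pauses, start = [], None
    for i, amp in enumerate(envelope):
        if amp < threshold:
            if start is None:
                start = i  # silence begins
        else:
            if start is not None and i - start >= min_len:
                pauses.append((start, i))
            start = None
    if start is not None and len(envelope) - start >= min_len:
        pauses.append((start, len(envelope)))  # silence runs to the end
    return pauses
```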

Multi-Platform Publishing

Once the episode is rendered, automated distribution takes over. The episode publishes to Castopod (self-hosted), which handles the RSS feed and distribution to Spotify and Apple Podcasts. Simultaneously, Postiz (also self-hosted) pushes promotional clips and episode announcements to 10+ social networks.

Both Castopod and Postiz run on my QNAP NAS alongside 27 other services. The entire publishing pipeline is a single API call chain triggered when the final audio file lands in the output directory.
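The trigger itself can be sketched in a few lines: notice a newly rendered file, then run the publish steps in order. Function names here are placeholders; the real steps would be HTTP calls to Castopod and Postiz:

```python
from pathlib import Path

# Placeholder sketch of the publish trigger; the real steps are API calls
# to Castopod and Postiz.
def on_new_file(path: Path, seen: set) -> bool:
    """True if this is a newly rendered episode we haven't published yet."""
    if path.suffix == ".mp3" and path.name not in seen:
        seen.add(path.name)
        return True
    return False

def publish_chain(episode: Path, steps) -> list[str]:
    """Run each publish step in order; each step is a callable."""
    return [step(episode) for step in steps]
```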

Monitoring

Every service in the pipeline is tracked by Uptime Kuma. If ElevenLabs is down, the system queues and retries. If Castopod fails to publish, I get an alert. A custom stats dashboard tracks per-episode metrics: generation time, cost per episode, listener counts, and platform-specific analytics.
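The queue-and-retry behavior can be sketched as exponential backoff that eventually surfaces the failure to monitoring. Attempt counts and delays are assumptions, not the real configuration:

```python
import time

# Sketch of queue-and-retry for a flaky provider (e.g. ElevenLabs down).
# Attempt count and backoff schedule are assumed values.
def with_retries(fn, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn, retrying on ConnectionError with exponential backoff."""
    for i in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if i == attempts - 1:
                raise  # surfaces to monitoring (the Uptime Kuma alert)
            sleep(base_delay * 2 ** i)  # 1s, 2s, 4s, ...
```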

The Result

Twenty-eight episodes and counting. Each episode takes under 10 minutes of compute time from first LLM call to published episode. Total human intervention: zero. Cost per episode is a fraction of what traditional podcast production runs.

The system isn’t perfect — occasionally an LLM generates something that doesn’t flow well, or a voice synthesis clip has odd pacing. But the hit rate is high enough that the podcast has genuine listeners who don’t know (or care) that it’s fully autonomous.

The real takeaway isn’t that AI can make podcasts. It’s that with enough automation engineering, you can build production pipelines that handle creative work end-to-end. The LLM is just one component. The orchestration is everything.

Check it out at lukeattheroost.com.