How to produce a podcast end-to-end in Descript

Most podcasters bounce between 4 tools: Riverside for recording, Descript for editing, Audacity for cleanup, Canva for show art. Descript can do 80% of it in one app. Here's the full episode workflow.

~5-7 hr · Intermediate · Updated May 26, 2026

Who this is for

Podcasters publishing weekly to monthly: solo hosts, interview shows, or co-hosted formats. If you're spending 8-12 hours per episode in post, this workflow cuts it to 4-6.

What you'll need

Descript Pro account configured
Recording setup: USB or XLR mic, quiet room, headphones (no speakers)
For remote interviews: Riverside, SquadCast, Zencastr, or Zoom with local recording
Show art, intro/outro music (royalty-free or licensed)
About 4-6 hours per episode end-to-end after you learn the workflow

Step 1

Pre-production: prep, recording setup, and templates

Build an episode template in Descript under Library → Templates. Include intro/outro placeholders, music beds, and lower-thirds, then clone it per episode.

Library → 05 — Templates folder → + New Project → "Podcast Episode Template."

Add at the start: intro music (15-20 sec), a placeholder text "Welcome to [show name] — today we're talking about [topic]" for the host's intro voiceover, then a placeholder for guest intro.

Add at the end: 'Thanks for listening — links in show notes' outro voiceover placeholder, outro music (15-20 sec).

Add brand lower-third for video version: name + episode number, host name, guest name. Position bottom-left, appears for first 5 sec of each speaker change.

Set color palette: workspace-defined brand colors. Lower-thirds, title cards, transitions all pull from this.

Save. Every new episode = right-click template → Duplicate. Rename the copy and it's ready for audio import.

Step 2

Recording setup for interviews + co-hosts

Solo: record directly in Descript. Interviews: use Riverside/SquadCast/Zoom local-only recording. Always get each speaker on their own track.

Solo episodes: Descript → New Project → Record → Audio Only. Mic check (peak levels around -6 to -3 dB). Record.
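
To see what "peak levels around -6 to -3 dB" means in numbers, here is a minimal Python sketch. It assumes you have a short test take as float samples in the range -1.0 to 1.0; `peak_dbfs` and `mic_check` are illustrative names, not part of Descript.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS for float samples in the range -1.0..1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(peak)

def mic_check(samples, low_db=-6.0, high_db=-3.0):
    """Classify a test take against the -6 to -3 dB peak window."""
    level = peak_dbfs(samples)
    if level > high_db:
        return "too hot"    # back off the input gain
    if level < low_db:
        return "too quiet"  # raise the gain or move closer to the mic
    return "ok"
```

A loudest sample of 0.6 is about -4.4 dBFS and lands inside the window; 0.9 is under -1 dBFS and leaves too little headroom.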

Remote interviews: use a tool with separate local tracks (Riverside, SquadCast, Zencastr). Avoid relying on Zoom's default recording: its single combined track is unfixable in post when one speaker has bad audio.

After remote recording, download each speaker's separate track. Import all tracks into Descript as a multi-track project.

In Descript: View → Tracks → assign each speaker. Descript can also auto-detect speakers if recorded as one file, but accuracy is 60-75% — manual track-per-speaker is reliable.

In-person multi-mic recording: each mic into its own channel. Same multi-track import workflow.

Always record + save backup audio locally on each speaker's machine. Network drops happen. Re-recording due to lost audio is the worst.

Step 3

Auto-transcribe + initial cleanup pass

Let Descript auto-transcribe (3-5 min for 60-min episode). Run Filler Word Removal first. Quick read-through to catch obvious issues.

After import, transcript generates automatically. Each speaker shows in a different color in the script panel.

Tools → Filler Word Removal → preview detections. Selectively approve (keep some 'um' for natural cadence, remove the egregious ones).

Quick read-through: scan the transcript while playing back at 1.5× speed. Flag major issues: long tangents, dead air, technical glitches, anything that needs an editorial cut.

Don't try to edit on this pass — just mark with tags or strikethrough. Editorial decisions happen in pass 2.

Time check: cleanup pass on a 60-min raw interview should take 15-25 minutes. If it's taking longer, the source recording has bigger issues that need addressing before continuing.

Step 4

Editorial edit: cuts, rearrangements, structure

Read transcript end-to-end. Delete tangents, off-topic sections, and weak quotes. Rearrange order for narrative flow. Target 20-30% time reduction.

Read the transcript completely. Mark editorial decisions:

Cut: long tangents that don't serve the episode's promise. 4-minute side-stories about something irrelevant get deleted.

Tighten: when a guest takes 2 minutes to make a point that could land in 30 seconds, cut the buildup.

Rearrange: drag transcript sections to reorder for narrative flow. Sometimes the best quote in minute 47 should open the episode.

Strengthen: if a key topic was discussed briefly, leave space for the host to add a short voiceover bridge during post-production (record separately).

Cold open: pick the most compelling 30-90 seconds and move it to the start, before the intro music. Hooks listeners in the first 10 seconds.

Target reduction: 60-min raw interview → 40-50 min final episode. 90-min raw → 55-70 min final. Anything longer than 60 min final = niche audiences only; mass audiences drop off.
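
The 20-30% target is simple arithmetic. The sketch below (`target_runtime` is a hypothetical helper) shows what it means for a given raw length. Applied literally, it gives slightly different windows than the rounded ranges above, so treat both as guides rather than hard limits.

```python
def target_runtime(raw_minutes, min_cut=0.20, max_cut=0.30):
    """Return (shortest, longest) final runtime in minutes for a 20-30% cut."""
    shortest = round(raw_minutes * (1 - max_cut), 1)  # after the deepest cut
    longest = round(raw_minutes * (1 - min_cut), 1)   # after the lightest cut
    return shortest, longest

print(target_runtime(60))  # (42.0, 48.0)
print(target_runtime(90))  # (63.0, 72.0)
```

A 90-minute raw recording needs a cut deeper than 30% to finish under the 60-minute mark.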

Step 5

Mix audio: Studio Sound, EQ, music ducking, leveling

Apply Studio Sound only to noisy voice tracks. Set music ducking (-12 to -18 dB under voice). Normalize peaks to -3 dB. Master loudness to -16 LUFS.

Audio mixing is where amateur and professional podcasts diverge.

Studio Sound: apply to any voice track that sounds noisy/echoey. Compare before/after; don't apply if the source is already clean.

EQ: most voice tracks benefit from a high-pass filter at 80 Hz (removes rumble) and a slight cut at 200-300 Hz (removes 'boxy' resonance). Descript's effects panel handles both.
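
To make the 80 Hz high-pass concrete, here is a first-order RC high-pass in plain Python. It only illustrates the idea: it is not how Descript implements its EQ (real EQs typically use steeper filters), and `high_pass` is a hypothetical name.

```python
import math

def high_pass(samples, sample_rate, cutoff_hz=80.0):
    """First-order (6 dB/octave) RC high-pass filter over a list of floats."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # RC time constant for the cutoff
    dt = 1.0 / sample_rate                   # time between samples
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Pass changes between samples, let constant offsets decay away.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

Fed a constant offset (the extreme case of rumble), the output decays toward zero, while a 1 kHz tone in the voice range passes almost unchanged.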

Compression: light compression evens out volume between quiet and loud moments. Descript applies this automatically in Studio Sound; manual compression on top is usually unnecessary.

Music ducking: when intro/outro music plays under voice, the music should sit 12 to 18 dB below the voice (a -12 to -18 dB offset). Descript auto-ducks; verify per track.
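
Decibel offsets are easier to judge once converted to linear gain. This sketch uses the standard amplitude conversion; `db_to_gain` and `duck` are illustrative names, and it assumes the music and voice start at roughly the same level.

```python
def db_to_gain(db):
    """Convert a decibel offset to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def duck(music_samples, duck_db=-15.0):
    """Scale music samples down by duck_db so they sit under the voice."""
    gain = db_to_gain(duck_db)
    return [s * gain for s in music_samples]

print(round(db_to_gain(-12), 3))  # 0.251
print(round(db_to_gain(-18), 3))  # 0.126
```

So the recommended range leaves the music at roughly a quarter to an eighth of its original amplitude.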

Loudness normalization: target -16 LUFS, the loudness level Apple Podcasts recommends. Descript shows LUFS in the export panel; adjust master gain to hit -16.

Peak normalization: peaks should not exceed -3 dB. Descript flags clipping; fix with gain reduction on the offending track.
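
The two targets interact: a static master gain moves integrated loudness and peaks by the same number of dB. A small sketch of that check, assuming you read the measured LUFS and peak values off a meter (`check_master` is a hypothetical helper):

```python
def check_master(measured_lufs, peak_db, target_lufs=-16.0, peak_ceiling_db=-3.0):
    """Return (gain_db, new_peak_db, peaks_ok) for a static master gain change."""
    gain_db = target_lufs - measured_lufs  # gain needed to land on the target
    new_peak_db = peak_db + gain_db        # peaks shift by the same amount
    return gain_db, new_peak_db, new_peak_db <= peak_ceiling_db

print(check_master(-20.0, -8.0))  # (4.0, -4.0, True)
print(check_master(-22.0, -6.0))  # (6.0, 0.0, False)
```

In the second case gain alone would push peaks past the -3 dB ceiling, so the loud moments need taming (compression or per-track gain reduction) before raising the master.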

Step 6

Visuals + chapters for video version

For YouTube/video versions: lower-thirds per speaker, brand color overlays, chapter markers, b-roll on key moments, ad reads if applicable.

Many podcasts now also publish a video version (talking heads + lower-thirds) to YouTube. Descript handles this in the same project.

Lower-thirds: pre-built in your template. Verify each speaker change triggers a fresh lower-third with name + role.

Chapters: Tools → Chapter Markers. Add at major topic shifts (typically 5-8 per episode). YouTube uses these for chapters in the player.
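
If you ever paste chapters into a YouTube description by hand, they are written as timestamp lines. The sketch below formats (start, title) pairs that way. The checks reflect YouTube's commonly documented requirements (first chapter at 0:00, at least three chapters, each at least 10 seconds long); confirm them against current YouTube Help. `chapter_lines` is a hypothetical helper, not a Descript feature.

```python
def format_timestamp(seconds):
    """Format seconds as M:SS, or H:MM:SS once past an hour."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"

def chapter_lines(chapters):
    """chapters: list of (start_seconds, title). Returns description lines."""
    ordered = sorted(chapters)
    if len(ordered) < 3:
        raise ValueError("need at least three chapters")
    if ordered[0][0] != 0:
        raise ValueError("first chapter must start at 0:00")
    for (start, _), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - start < 10:
            raise ValueError("each chapter must run at least 10 seconds")
    return [f"{format_timestamp(start)} {title}" for start, title in ordered]
```

For example, chapters starting at 0, 95, and 3725 seconds come out as `0:00`, `1:35`, and `1:02:05` lines.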

B-roll: add image/screen-recording overlays at the moments where a speaker references something visual. Reduces fatigue of pure talking-head viewing.

Brand stinger/intro card: 3-5 sec animated brand intro at the very start. Built once in the template, reused per episode.

Ad reads: if you do mid-roll ads, insert at natural break points. Use a music transition (1-2 sec stinger) into and out of ad reads.

Step 7

Export, distribute, and extract socials

Export MP3 for podcast feed, MP4 for YouTube, separate clip files for socials. Publish via Descript or upload manually. Schedule socials.

Export → Audio (MP3, 128 kbps stereo) for podcast hosts (Buzzsprout, Anchor, Libsyn, Transistor).
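
Bitrate also determines file size, which matters if your host caps uploads or storage. A rough estimate, assuming constant-bitrate encoding and ignoring embedded artwork and metadata (`mp3_size_mb` is an illustrative name):

```python
def mp3_size_mb(duration_minutes, bitrate_kbps=128):
    """Approximate MP3 size in megabytes for a constant-bitrate export."""
    bits = bitrate_kbps * 1000 * duration_minutes * 60  # total bits of audio
    return bits / 8 / 1_000_000                          # bits -> bytes -> MB

print(round(mp3_size_mb(45), 1))  # 43.2
print(round(mp3_size_mb(60), 1))  # 57.6
```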

Export → Video (MP4, 1080p H.264) for YouTube.

Extract socials: jump to your tagged #clip sections from the edit pass. For each, select the 30-90 sec section → File → Export Selection → MP4 with auto-subtitles (Descript can generate subtitles in the export).

Clips for vertical formats (Reels, TikTok, Shorts): Descript can crop 16:9 to 9:16 with auto speaker-tracking. Tools → Aspect Ratio → 9:16. Each clip becomes a vertical short.
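
The geometry behind a 16:9 to 9:16 crop is worth knowing, because it shows how little of the frame survives. This sketch computes a full-height crop window; `center_x` stands in for wherever speaker-tracking places the subject, and the helper is hypothetical rather than Descript's own logic.

```python
def vertical_crop(src_width, src_height, center_x=None):
    """Return (left, top, width, height) of a full-height 9:16 crop window."""
    crop_w = (src_height * 9) // 16
    crop_w -= crop_w % 2  # keep the width even, which video encoders prefer
    if crop_w > src_width:
        raise ValueError("source is already narrower than 9:16")
    if center_x is None:
        center_x = src_width // 2  # default to a centered crop
    left = center_x - crop_w // 2
    left = max(0, min(left, src_width - crop_w))  # keep the window in frame
    return left, 0, crop_w, src_height

print(vertical_crop(1920, 1080))  # (657, 0, 606, 1080)
```

Only about a third of a 1920-pixel-wide frame is kept, so frame each speaker with room around them if you plan to cut vertical clips.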

Show notes: open the transcript → File → Export Transcript → Markdown. Edit for show notes (links, key timestamps, guest bio).

Publish: Descript can publish directly to Buzzsprout, YouTube, Spotify via integrations. Saves the export-and-upload step.

Schedule socials in Buffer, Later, or Hootsuite. Stagger clip releases over the week the episode drops.
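
Staggering is easy to plan ahead of time. A minimal sketch that spreads clips evenly across the release week; the function name and the 7-day window are assumptions, and the dates would still be entered into your scheduler by hand.

```python
from datetime import date, timedelta

def stagger_clips(release_day, clip_count, window_days=7):
    """Return one post date per clip, spread across the release window."""
    if clip_count <= 0:
        return []
    step = window_days / clip_count  # days between consecutive clips
    return [release_day + timedelta(days=int(i * step)) for i in range(clip_count)]

for day in stagger_clips(date(2026, 6, 1), 4):
    print(day.isoformat())  # 2026-06-01, 2026-06-02, 2026-06-04, 2026-06-06
```

The first clip always lands on release day, which is when the episode link is freshest.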

Common mistakes

What goes wrong (and how to avoid it)

Recording remote interviews on Zoom only
What goes wrong: By default, Zoom records a single combined track. One bad mic ruins both speakers, and post-production can't separate them. You either ship low-quality audio or re-record entirely.
How to avoid: Use Riverside, SquadCast, or Zencastr for remote interviews. Each speaker gets their own local high-quality track. Worst case, even Zoom's option to record a separate audio file for each participant is better than nothing.

Not building episode templates
What goes wrong: Each episode starts from scratch. Intro music, brand colors, lower-thirds, and music ducking all get rebuilt every time. 1-2 hours wasted per episode.
How to avoid: Build the episode template once in the 05 — Templates folder. Clone per episode. Saves 1-2 hours every time.

Skipping the editorial pass
What goes wrong: 60-min episodes drag through 4-min tangents. Listener drop-off at minute 12. Reviews complain about pacing. Subscribers stop growing.
How to avoid: Make the editorial pass mandatory: cut the episode to 20-30% shorter than raw. Strongest opens, tightest middles, shortest endings. Listen at 1.5× to verify pace.

No loudness normalization
What goes wrong: Episodes get auto-adjusted by distribution platforms. Sometimes too quiet, sometimes too loud. Listener fatigue. Next to professional podcasts, you sound amateur.
How to avoid: Normalize to -16 LUFS before every export. Descript shows the meter; adjust master gain to hit target. Industry standard, non-negotiable.

Releasing without clips
What goes wrong: Podcast episode drops to crickets on social. Discoverability suffers. New listener acquisition stalls at organic-search-only.
How to avoid: Extract 3-5 clips per episode (30-90 sec). Vertical format for Reels/TikTok/Shorts. Schedule the week of release. Drives 30-50% of new listener growth.

Not exporting show notes from transcript
What goes wrong: Show notes get written from scratch in Google Docs. Takes 30-45 min per episode. Inaccurate quotes. Timestamps off.
How to avoid: File → Export Transcript → Markdown. Edit transcript into show notes in 10-15 min. Timestamps are accurate. Quotes are verbatim.

Recap

What to take away

Pre-build episode templates. Clone per episode, save 1-2 hr each time.
Multi-track recording (Riverside/SquadCast for remote) is non-negotiable.
Auto-transcribe, then Filler Word Removal, then editorial pass for cuts.
Mix audio: Studio Sound on noisy tracks only. Normalize to -16 LUFS.
Add lower-thirds, chapters, b-roll for the video version.
Export socials and show notes from the same project.

Done — what's next

How to set up a Descript account for podcast + video editing

Hand it off

Producing weekly podcasts solo takes 8-12 hours/episode in DIY mode, 4-6 hours with the Descript workflow. Hiring a vetted podcast video editor gets it down to 1-2 hours of host time (recording + light review) at $14-16/hr — typically $1,000-2,000/mo for a weekly podcast with video + clips + show notes.

Frequently Asked Questions

How long should I budget per episode in Descript?

With the workflow above: 4-6 hours per 60-min episode (excluding recording). Breakdown: 30 min cleanup, 60 min editorial edit, 60 min mix, 60 min visuals + chapters, 30 min export + socials + show notes. Faster with templates + practice. Slower the first 5-10 episodes.

Can I record the podcast directly in Descript or do I need Riverside?

For solo episodes: Descript's built-in recorder is fine — same audio quality as any other tool when paired with a good mic. For remote interviews with 2+ speakers: use Riverside, SquadCast, or Zencastr. They record each speaker locally on their own machine then upload separate tracks — Descript can't do that for remote guests.

How do I make my podcast sound like NPR / professional production?

Four things: (1) good mic + treated room (matters more than tools), (2) multi-track recording — never one combined track, (3) loudness normalization to -16 LUFS, (4) editorial discipline — cut 20-30% of the raw recording. Pro production isn't 'better tools,' it's relentless craft.

How do I extract short clips for social media efficiently?

Tag #clip during your editorial pass on highlight quotes. After publish, jump to each tag → Export Selection → MP4 with subtitles. For vertical formats (Reels/TikTok/Shorts), use Tools → Aspect Ratio → 9:16 with auto speaker-tracking. 3-5 clips per episode in 30-45 min.

Does Descript publish directly to podcast hosts?

Yes — integrations exist for Buzzsprout, Anchor (Spotify for Podcasters), Transistor, and Captivate. Connect once in Settings → Publishing. Then in your project, click Publish → choose host → uploads MP3 + metadata. Saves the export-then-upload step. For non-integrated hosts (Libsyn, Podbean), export and manually upload.

Related tutorials

How to set up a Descript account for podcast + video editing

How to use Descript's transcript-based editing workflow

How to set up Descript publishing and team collaboration

When to hire a podcast / video editor for Descript workflows

How to set up Loom team libraries that scale
