How to use Synthesia voice cloning and personal avatar to scale your real face

Personal Avatar + voice clone lets you appear in 50 videos a week without recording any of them. Done well, it is your real face and voice at 20x scale. Done poorly, it is uncanny.

60 min | Advanced | Updated May 26, 2026

Who this is for

Founders who are the face of the brand. Executives doing high-volume internal comms. Course creators who want to scale appearances without burnout. Anyone whose personal brand is a meaningful business asset.

What you'll need

Synthesia Enterprise plan (Personal Avatar requires it)
Studio-quality recording space for the initial training session
About 30-45 minutes for the training shoot (the recorded session itself runs 15-20 minutes), plus 1-2 weeks for Synthesia to process
Comfort with the ethical implications of an AI version of yourself

Step 1

Decide if Personal Avatar is right for you

Personal Avatar is high-leverage if your face/voice is a brand asset. It is overkill if you mostly want stock-avatar marketing video.

Yes: founder-led brands where your face is on the website, your voice is in podcasts, your appearances drive trust.

Yes: high-volume async comms where teammates want to see you (CEO weekly updates, training, sales).

Probably not: marketing teams where any believable presenter would work fine.

Probably not: teams without enterprise-plan budget — the cost is significant.

Step 2

Prepare for the training recording session

Synthesia processes a 15-20 minute recorded session. Quality of the session = quality of the avatar for the next 6-12 months.

Studio space: well-lit, neutral background, no echo. Synthesia provides a guide.

Wardrobe: solid color top, mid-tone, no busy patterns. Same wardrobe you would wear in marketing videos.

Hair and makeup: how you want to appear for the next year. Avatar locks in this look.

Mic: USB-C or XLR studio mic. Not laptop built-in. Voice quality compounds.

Script: Synthesia provides a 15-20 min training script. Read at natural pace, natural emotion.

Step 3

Train your voice clone separately

Voice Lab requires a separate 10-15 minute audio recording. Same quality bar as the avatar shoot.

Synthesia → Voice Lab → Create voice.

Recording requirements: 10-15 minutes of varied speech. Read a provided script that covers different tones and word patterns.

Audio quality: studio mic, quiet room, no background noise. Single take preferred.

Synthesia processes the voice in 48-72 hours. Test with 3-5 sample scripts after.

Voice clone is separate from Personal Avatar — you can pair them, or pair your voice with a stock avatar, or vice versa.
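
Before uploading, it can help to confirm that the recording actually falls inside the 10-15 minute window. The following is a minimal sketch using Python's standard `wave` module, assuming the recording was exported as an uncompressed WAV file. The function names are hypothetical and this check is not part of Synthesia's tooling.

```python
import wave

MIN_SECONDS = 10 * 60  # lower bound of the 10-15 minute requirement
MAX_SECONDS = 15 * 60  # upper bound of the 10-15 minute requirement


def wav_duration_seconds(path: str) -> float:
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_recording_length(path: str) -> str:
    """Describe whether the recording fits the 10-15 minute window."""
    seconds = wav_duration_seconds(path)
    minutes = seconds / 60
    if seconds < MIN_SECONDS:
        return f"too short: {minutes:.1f} min (need at least 10)"
    if seconds > MAX_SECONDS:
        return f"too long: {minutes:.1f} min (aim for at most 15)"
    return f"ok: {minutes:.1f} min"
```

Calling `check_recording_length` on your exported file returns a one-line verdict. It only measures length; background noise and echo still need a listen-through.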

Step 4

Pair voice and avatar in the editor

In the video editor, select Personal Avatar + cloned voice for a full "you" video. Test thoroughly before production use.

Editor → Avatar dropdown → Personal Avatar.

Voice dropdown → your cloned voice.

Generate a 30-second test script. Watch closely.

Check: mouth shapes match your real ones. Voice rhythm matches your real speech. Gestures (if your avatar uses them) feel natural.

Iterate on script phrasing — some sentence structures expose uncanny moments. Avoid those.

Step 5

Build the "what scripts work" library

Some script patterns work great; others trigger uncanny moments. Document what works for your specific avatar.

Production workflow: for every video, note any uncanny moments by timestamp.

Patterns that often work: short sentences, common words, conversational rhythm.

Patterns that often fail: long technical sentences, very emphatic delivery, words with unusual phoneme combinations.

Build a "preferred phrasing" guide for your team. Over months, the library lets you write scripts that play perfectly.
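
One way to make the preferred-phrasing guide usable day to day is a small pre-check that flags sentences likely to need a second look before you generate the video. This is a minimal sketch, not a Synthesia feature: the 20-word threshold and the `AVOID_PHRASES` entry are placeholder assumptions to replace with whatever your own uncanny-moment log shows, and the function names are hypothetical.

```python
import re

MAX_WORDS_PER_SENTENCE = 20  # assumed threshold; tune it from your own log
AVOID_PHRASES = ["anthropomorphization"]  # hypothetical entry; fill from your log


def split_sentences(script: str) -> list[str]:
    """Split after ., ! or ? followed by whitespace; rough but fine for short scripts."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [part for part in parts if part]


def lint_script(script: str) -> list[str]:
    """Return human-readable warnings for sentences worth re-phrasing."""
    warnings = []
    for number, sentence in enumerate(split_sentences(script), start=1):
        word_count = len(sentence.split())
        if word_count > MAX_WORDS_PER_SENTENCE:
            warnings.append(f"sentence {number}: {word_count} words (long)")
        lowered = sentence.lower()
        for phrase in AVOID_PHRASES:
            if phrase in lowered:
                warnings.append(f"sentence {number}: contains '{phrase}'")
    return warnings
```

An empty list means nothing matched the rules you have recorded so far. It does not guarantee a clean render, so the 30-second test pass still applies.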

Step 6

Disclose appropriately

Ethics matter. Be clear when the video is AI-generated versus live-recorded.

Internal use (training, async comms): a footer disclosure is enough. "This video uses my AI avatar."

External marketing: include disclosure prominently. Modern audiences can detect AI avatars; transparency builds trust.

Sales outreach: optional disclosure. Most prospects do not mind; some appreciate the transparency.

Document your disclosure policy as part of brand guidelines. Consistency matters more than the specific choice.
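
If the policy lives in a shared file rather than in people's heads, it is easier to apply consistently. The sketch below transcribes the three cases above into a lookup table. The channel keys, the footer placement for sales outreach, and the choice to fall back to the strictest rule for unknown channels are all assumptions made for illustration.

```python
# Disclosure rules transcribed from the guidance above.
# Channel names are hypothetical labels; rename them to match your own workflow.
DISCLOSURE_POLICY = {
    "internal": {"required": True, "placement": "footer"},
    "external_marketing": {"required": True, "placement": "prominent"},
    # Optional per the guidance; the footer placement here is an assumption.
    "sales_outreach": {"required": False, "placement": "footer"},
}

DISCLOSURE_TEXT = "This video uses my AI avatar."


def disclosure_for(channel: str) -> dict:
    """Look up the rule for a channel; unknown channels get the strictest rule."""
    rule = DISCLOSURE_POLICY.get(channel, DISCLOSURE_POLICY["external_marketing"])
    return {**rule, "text": DISCLOSURE_TEXT}
```

Defaulting unknown channels to prominent disclosure is a deliberately cautious design choice: a new channel added without a rule errs toward transparency.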

Common mistakes

What goes wrong (and how to avoid it)

Wardrobe drift between training and production

What goes wrong: The avatar wears a blue shirt, but you now dress differently in real life. When real-you and avatar-you appear in the same campaign, the disconnect is obvious.

How to avoid: In live-recorded video, wear the same wardrobe style your avatar wears. Or maintain visual consistency in some other way (background, framing).

Skipping the voice clone test phase

What goes wrong: The first production video has 3 uncanny moments. It ships anyway because of a deadline. Viewers notice and trust drops.

How to avoid: Run 3-5 test scripts on the cloned voice before production use. Identify uncanny patterns and avoid them in scripts.

Using Personal Avatar where stock avatar would work

What goes wrong: Personal Avatar burns Enterprise plan budget on use cases where a stock avatar would have been fine. ROI looks bad and you re-evaluate the whole tool.

How to avoid: Reserve Personal Avatar for high-leverage use: founder-led marketing, executive comms, courses. Use stock avatars for routine marketing.

No ethical disclosure policy

What goes wrong: Your audience or team discovers your AI avatar mid-campaign. Trust erodes. The damage takes weeks to repair.

How to avoid: Document and follow a disclosure policy. Transparency now beats discovery later.

Never re-training the avatar

What goes wrong: 12 months later, your real look has changed (haircut, age, glasses). The avatar still looks like you did 12 months ago. The disconnect compounds.

How to avoid: Re-train annually. Synthesia processes the new session and replaces the old avatar. Re-train the voice clone too if it has drifted.

Recap

What to take away

Personal Avatar makes sense when your face is a brand asset.
Training session quality = avatar quality for 6-12 months.
Voice clone is separate; train both for a full "you" video.
Document which script patterns work and which trigger uncanny moments.
Disclose AI avatar use. Re-train annually.

Hand it off

Personal Avatar production is high-stakes and specialty work. EverestX video specialists familiar with Synthesia Personal Avatar can produce on your face/voice in 60-90 min per video, typically $200-400 per piece for ongoing engagements.

Frequently Asked Questions

How much does Personal Avatar cost?

The Enterprise plan starts at roughly $1,000/mo and includes Personal Avatar setup as part of the package. Producing the one-time training session adds cost on top. Most enterprise contracts land at $1,500-5,000/mo all-in.
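
To put the monthly figure in per-video terms, here is a rough calculation using only numbers quoted in this article: the $1,500-5,000/mo all-in range and the 50 videos a week from the introduction. The 52/12 weeks-per-month conversion is an assumption, the function name is hypothetical, and real contracts may be priced differently.

```python
WEEKS_PER_MONTH = 52 / 12  # average month length in weeks; an assumption


def cost_per_video(monthly_cost: float, videos_per_week: float) -> float:
    """Spread a monthly plan cost across the videos made in an average month."""
    videos_per_month = videos_per_week * WEEKS_PER_MONTH
    return monthly_cost / videos_per_month


for monthly in (1500, 5000):
    per_video = cost_per_video(monthly, 50)
    print(f"${monthly}/mo at 50 videos/week: ${per_video:.2f} per video")
```

At 50 videos a week the plan cost works out to roughly $7 to $23 per video. At 5 videos a week the same contracts come to roughly $69 to $231 per video, which is why volume matters so much to the ROI.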

Can my Personal Avatar speak languages I do not speak?

Yes — Synthesia can have your avatar speak any of 140+ languages while keeping your voice characteristics (with some loss of pronunciation accuracy). Useful for global founders.

Is the AI version legally my likeness?

Yes — you retain ownership of your Personal Avatar. Synthesia's ToS protects you. Document the policy with your legal team for high-stakes uses.

What happens to my avatar if I leave the company?

Depends on contract. If your name and likeness are personal IP, you keep them. If the contract assigns likeness to the company, that needs careful negotiation up front. Get legal review before initial training.

Related tutorials

How to create your first Synthesia video with an AI avatar

How to use Synthesia for sales outreach videos that get replies

When to hire an AI video specialist who knows Synthesia
