For LMFT, LCSW, LPC, addiction counselors, psychologists

The therapist's quiet upgrade: read less, listen more

A pragmatic playbook for clinicians who want to cut documentation time, build accessible client handouts, and listen to research without setting another evening on fire. Three workflows mapped to the right voice and the right HIPAA boundary. No IT approval, no enterprise sales call. Free to try right now. The built-in PHI check runs locally in your browser, so you can strip identifiers before any text is sent for narration.

Open the studio (free) | See PRO and Creator pricing

PRO: $19/mo. Creator: $39/mo (commercial license). Lifetime: $199/$349 once.

Hear it for yourself: same body-scan line, three reads

The sound of a worksheet matters. Below: the same opening grounding line, narrated three different ways. Press play on each to hear the difference.

Tier	Read	Voice and style
Free	Standard read	Aria standard voice, default cadence
PRO	Empathetic style	Aria with the empathetic expressive style
PRO	Real whispering	Aria with the whispering expressive style

Same Aria voice on all three. The free tier delivers her standard cadence. PRO ($19/mo) unlocks 95 expressive emotional styles on the same voices, including whispering, empathetic, gentle, calm, cheerful, and newscast.


Burnout is the default. This is one fix.

Why therapists are converting clinical materials to audio

Therapists use text-to-speech for three primary tasks: proofreading de-identified clinical notes, listening to journal articles during commutes, and producing accessible audio handouts for clients.

Roughly 61% of psychologists report at least one symptom of burnout, with administrative tasks among the leading contributors (American Psychological Association, 2022 Practitioner Pulse). The Mayo Clinic Proceedings burnout work shows similar numbers across the broader clinical workforce. Most of that admin burden is not direct clinical care. It is documentation, client material prep, and trying to keep up with a literature that produces more than a million new biomedical citations a year through PubMed alone.

Audio is the part of the workday clinicians chronically underuse. Commute time. Treadmill time. Walking the dog. Folding laundry. None of those windows work for reading, but all of them work for listening. If you can shift even 30 minutes of weekly journal reading and 30 minutes of note proofreading into audio that runs while you do something else, that is roughly an hour a week back. Over a year, that is about 52 hours, more than a full working week.
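As a quick check on that estimate, here is a minimal sketch of the arithmetic. The two 30-minute figures come from the paragraph above; the 52-week year with no weeks off is an assumption, so treat the result as an upper bound.

```python
# Back-of-envelope check of the time estimate above.
WEEKS_PER_YEAR = 52                 # assumption: no weeks off
JOURNAL_MINUTES_PER_WEEK = 30       # figure from the text
PROOFREAD_MINUTES_PER_WEEK = 30     # figure from the text


def hours_reclaimed_per_year(minutes_per_week: float, weeks: int = WEEKS_PER_YEAR) -> float:
    """Convert minutes shifted to audio each week into hours per year."""
    return minutes_per_week * weeks / 60


total = hours_reclaimed_per_year(JOURNAL_MINUTES_PER_WEEK + PROOFREAD_MINUTES_PER_WEEK)
print(f"{total:.0f} hours per year")
```

Adjust the weeks figure for vacation and the number drops, but it stays above a standard 40-hour week until you take more than 12 weeks off.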

And then there is the client side. Adults with low literacy benefit measurably from audio health information per the HHS Healthy People 2030 framework. Clinicians report meaningfully higher between-session homework completion when therapy worksheets come in audio format, especially among clients with ADHD, dyslexia, or a simple auditory-learning preference. The honest version: clients do not always read the PDF you handed them on Tuesday. They are far more likely to listen to a 4-minute audio clip on the bus.

The line everyone needs to know

The HIPAA boundary: what is safe to convert and what is not

Text-to-speech is HIPAA-neutral; HIPAA risk depends entirely on whether the input text contains protected health information.

Here is the key point. FreeTTS, like every consumer TTS tool, is not a HIPAA covered entity or business associate by default. We do not sign Business Associate Agreements at the consumer tier. That is fine for most therapy uses, because most useful conversions are not PHI. The honest framing is: the tool is HIPAA-neutral, and the risk lives entirely in what you choose to paste in.

Generally safe to convert: psychoeducation handouts, generic CBT and DBT worksheets, mindfulness and grounding scripts, exposure-hierarchy templates, journal articles and book chapters you have a license to consume, your own conference notes, regulatory updates, ethics CEU material, your own dictated narration of a workshop you are building.

Not safe to convert (or only after stripping identifiers): session notes that name a client, intake forms with DOB or address, treatment plans that reference specific people, anything with MRNs or insurance information, voicemails or messages from clients, anything from your EHR that you have not personally cleaned of identifiers first.

The pattern works because most of what therapists actually want to convert is generic content: psychoeducation, scripts, articles, materials you would happily hand a stranger in a workshop. Use the checker below to spot the obvious patterns. The final eyeball pass on names and locations is still on you. Tools help. They do not replace clinical judgment.

HIPAA-safe pre-flight check

Paste a worksheet, handout, or note. Runs locally in your browser, never sent anywhere. Catches common PHI patterns so you can strip them before pasting into any TTS tool.
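For readers curious what a pattern-based pre-flight check looks like, here is a minimal sketch. The patterns, labels, and function names are illustrative assumptions, not the rule set the FreeTTS checker actually uses. Like any regex pass, it only catches formatted identifiers, so names and locations still need a human read.

```python
import re

# Illustrative patterns only (assumed, not the actual FreeTTS rule set).
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "mrn": re.compile(r"\bMRN[:#\s]*\d{4,}\b", re.IGNORECASE),
}


def find_phi(text):
    """Return (label, matched_text) pairs for every suspicious pattern found."""
    hits = []
    for label, pattern in PHI_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


def redact(text):
    """Replace each match with a bracketed label such as [PHONE]."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text


sample = "Client DOB 04/12/1987, MRN: 00123456, call 555-867-5309 or jane@example.com."
for label, value in find_phi(sample):
    print(label, "->", value)
print(redact(sample))
```

An empty result means none of these patterns matched. It does not mean the text is free of PHI.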

Three workflows that actually save time

Where text-to-speech fits in a therapist's week

In our conversations with clinicians, the same three patterns come up over and over. Different specialties, different settings, similar workflows.

📝

Proofread de-identified notes

Strip names and dates. Paste into FreeTTS. Listen at 1.2x while you eat. Catches awkward phrasing and missing sections that your eyes skim past on the third reread.

🎧

Listen to journal articles

PubMed, your specialty's flagship journal, the chapter your supervisor sent. Convert to MP3, listen during commute or run. Saves the evening reading slot you never had energy for anyway.

🌿

Build accessible client handouts

Convert a generic CBT thought record, a body-scan script, or an exposure-hierarchy walkthrough into client-facing audio. Drop into the secure portal. Higher homework completion, especially with neurodiverse clients.
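Journal articles and long scripts usually exceed the per-generation character limits listed in the pricing section (1,000 on Free, 10,000 on PRO, 25,000 on Creator), so long text has to be split before pasting. The sketch below shows one way to split at sentence boundaries. The function name and the simple sentence-splitting rule are assumptions for illustration, not part of FreeTTS.

```python
import re


def split_for_generation(text, max_chars=10_000):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


# Tiny limit used here only to make the behavior visible.
print(split_for_generation("One. Two. Three.", max_chars=9))
```

Splitting at sentence ends keeps each generated file from cutting off mid-thought, which matters more for listening than for reading.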

Match voice to content type

Voice selection for therapeutic content

For guided relaxation scripts, therapists choose warm female neural voices at 0.85x speed paired with SSML pause tags.

The voice you pick is half the experience. For relaxation and grounding work, you want warm and slow. For psychoeducation and journal listening, you want clear and neutral. For dictation playback, you want fast and accurate. The settings table below is a starting point. Most clinicians end up with two or three saved presets they use across all content.

Content type	Voice	Speed (multiplier)	Notes
Guided relaxation, body scan	Ava or Jenny	0.85 to 0.9	SSML break tags between cues for silence-driven regulation
Psychoeducation handout	Andrew or Jenny	0.95 to 1.0	Calm, neutral, slightly serious cadence works best
CBT thought record narration	Ava or Andrew	1.0	Clients pause and reflect between prompts naturally
Journal article listening	Andrew	1.2 to 1.5	Faster speeds work for content you already have context on
Note proofread	Any clear voice	1.1 to 1.3	Voice quality matters less, speed matters more
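If you keep saved presets in a script or notes file, the table above maps naturally onto a small lookup structure. This is a sketch only: the `Preset` class, the key names, and `clamp_speed` are hypothetical names, and only the voices and speed ranges come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    voices: tuple       # suggested voices from the table; empty means any clear voice
    speed_min: float    # lower end of the suggested speed range
    speed_max: float    # upper end of the suggested speed range


# Values copied from the settings table; key names are hypothetical.
PRESETS = {
    "guided_relaxation": Preset(("Ava", "Jenny"), 0.85, 0.9),
    "psychoeducation": Preset(("Andrew", "Jenny"), 0.95, 1.0),
    "cbt_thought_record": Preset(("Ava", "Andrew"), 1.0, 1.0),
    "journal_article": Preset(("Andrew",), 1.2, 1.5),
    "note_proofread": Preset((), 1.1, 1.3),
}


def clamp_speed(content_type, requested):
    """Pull a requested speed back into the suggested range for that content type."""
    preset = PRESETS[content_type]
    return min(max(requested, preset.speed_min), preset.speed_max)


print(clamp_speed("guided_relaxation", 1.0))
print(clamp_speed("journal_article", 1.3))
```

The ranges are starting points, so treat the clamp as a nudge and override it when a particular client or article needs something different.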

Step by step

CBT worksheet to client audio in three minutes

The actual workflow. Tested with real clinicians. The first time it takes maybe five minutes because you are getting comfortable with the steps. After that it is genuinely a two to three minute task.

1. Strip identifiers
Open the worksheet or note. Replace any client name with 'the client.' Delete the date. Remove DOB, MRN, phone, address, and anything else that ties content to a specific person. Run the HIPAA-safe checker above if you want a second pass.
2. Paste into the studio
Open freetts.org/text-to-speech in another tab. Paste the de-identified text. Pick a voice (Ava or Jenny work well for warm narration). Drop the speed to 0.85x for relaxation content or keep it at 1.0x for psychoeducation.
3. Add SSML pauses if needed
For grounding scripts and guided imagery, insert pause tags between cues so silence does the regulation work. Example: '... breathe in. <break time="800ms"/> breathe out.' SSML break and phoneme tags pass through to the engine on PRO and Creator tiers.
4. Generate and download MP3
Click Generate. Wait a few seconds. Click Download. The audio file is yours, no watermark on PRO, ready to host on your client portal or send via secure share. The source text is deleted server-side after generation.
5. Distribute through a HIPAA-aware channel
Drop the MP3 into your client portal or secure messaging service (SimplePractice, TheraNest, Hushmail) or a password-protected practice site. Do not email audio directly to a client unless you are using a HIPAA-compliant email service. Generic therapeutic audio is low risk when delivered properly.
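The pause-insertion step can be scripted when a grounding script has many cues. The sketch below joins cues with the same `<break time="800ms"/>` tag shown in the example above. The function name is hypothetical, and whether the studio also expects a surrounding `<speak>` element is not stated in this guide, so the output is left unwrapped.

```python
from xml.sax.saxutils import escape


def build_ssml(cues, pause_ms=800):
    """Join spoken cues with SSML break tags so silence separates them."""
    if pause_ms < 0:
        raise ValueError("pause_ms must be non-negative")
    pause = f' <break time="{pause_ms}ms"/> '
    # escape() protects characters such as & and < that would break the markup.
    cleaned = [escape(cue.strip()) for cue in cues if cue.strip()]
    return pause.join(cleaned)


script = build_ssml(["Breathe in.", "Breathe out."])
print(script)
```

Test a short passage first, since pause lengths that look right on paper can feel too long or too short once narrated at 0.85x.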

Meet clients where they are

Serving neurodiverse clients with audio handouts

Audio handouts increase between-session homework completion among clients with ADHD, dyslexia, and auditory-learning preferences.

Roughly 15 to 20 percent of the U.S. population has a language-based learning disability such as dyslexia (National Center for Learning Disabilities, 2020). ADHD prevalence in adults sits around 4 to 5 percent. Add anxiety disorders that make sustained reading cognitively expensive, and the share of your caseload who would do better with audio over text is substantial. Often higher than people guess.

Three things change when you offer audio versions of worksheets:

Completion goes up. Research on therapy homework consistently finds substantial non-completion rates. The Cognitive and Behavioral Practice work suggests roughly 30 to 50 percent of assignments go undone, with methodology varying study to study. Clinicians offering audio versions report meaningful improvements, especially among clients with auditory-learning preferences.

The therapeutic alliance gets reinforced. Voice familiarity can strengthen the therapeutic alliance, which is itself a known mediator of treatment outcomes (Norcross and Lambert, 2018). Clients hearing a consistent voice between sessions stay more anchored. If you are using voice cloning (included on PRO, with higher limits on Creator) with your own voice, the effect is more pronounced.

Cultural-competency surface area widens. Offering content in multiple formats demonstrates that you are meeting clients where they are. For clients whose first language is not English, switching the voice to a multilingual neural voice reading translated text is a small adjustment that lands big.

Different tools, different jobs

FreeTTS compared to Otter, Speechify, Descript

These four tools are often confused as alternatives. They are not. Each does a different job. Picking the right one for your week saves real money.

Tool	What it does	Therapist use case	Price (verify before buying)
FreeTTS	Text into audio. 400+ voices, SSML support on PRO.	Audio handouts, journal listening, note proofreading.	Free, $19/mo PRO, $39/mo Creator
Otter.ai	Transcription. Audio into text.	Recording sessions to transcribe. HIPAA on Enterprise only.	$10 to $20/user/mo, Enterprise custom
Speechify	Text into audio. Consumer-focused.	Personal article listening. Heavier UI for casual use.	~$139/yr (verify, prices shift)
Descript	Audio editing with voice cloning.	Recording and editing podcast-style content. Overkill for handouts.	$15 to $30/mo

The short version. If you want to dictate notes by speaking, you need transcription (Otter, or your EHR's built-in dictation if it offers one under HIPAA coverage). If you want audio from text, you need TTS (FreeTTS, Speechify, NaturalReader). If you are producing recorded audio content with editing, you need a podcast tool (Descript). Many clinicians use a combination: Otter for sessions, FreeTTS for handouts and journals.

Pick the tier that matches your practice

Pricing for solo practice vs group practice

The pricing decision is genuinely simple. Solo clinician using audio for your own caseload: PRO. Group practice or selling content as a course or workshop: Creator (commercial license matters). On the fence: try free, decide later.

Free (per month, no card)
5,000 chars/month
1,000 chars per generation
400+ voices
Watermark on output

PRO, $19/mo (solo practice)
1M chars/month
10,000 chars per generation
HD voices, no watermark
SSML phoneme + break tags
Voice cloning included

Creator, $39/mo (group, workshops, courses)
5M chars/month
25,000 chars per generation
Commercial license
Higher cloning limits
Best for distribution

For perspective on the math: PRO at $19 a month costs well under the fee for a single typical session, and PRO covers about 14 hours of audio output per month. If you produce one audio worksheet a week and listen to two journal articles, you are nowhere near the cap. Lifetime ($199 for PRO, $349 for Creator) exists too. Some clinicians prefer the one-time payment so it never appears as a recurring expense in their practice accounting.
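To see where a figure like 14 hours comes from, here is a rough sketch of the conversion from characters to audio time. The narration rate of 1,200 characters per minute is an assumption chosen because it is consistent with the 14-hour figure at normal speed. The worksheet and article sizes are hypothetical round numbers, not measurements, and the example assumes two articles per week.

```python
PRO_MONTHLY_CHARS = 1_000_000   # PRO cap from the pricing section
CHARS_PER_MINUTE = 1_200        # assumed narration rate at 1.0x speed


def audio_hours(chars, chars_per_minute=CHARS_PER_MINUTE):
    """Estimate hours of narrated audio produced from a character count."""
    return chars / chars_per_minute / 60


def monthly_chars(worksheet_chars, article_chars, articles_per_week=2, weeks=4):
    """Characters used by one worksheet plus some articles each week."""
    return weeks * (worksheet_chars + articles_per_week * article_chars)


# Hypothetical sizes: a 3,000-character worksheet and 40,000-character articles.
used = monthly_chars(worksheet_chars=3_000, article_chars=40_000)
print(f"cap covers about {audio_hours(PRO_MONTHLY_CHARS):.1f} hours")
print(f"example usage: {used:,} of {PRO_MONTHLY_CHARS:,} chars")
```

Slower speeds stretch the same characters over more minutes, so relaxation content read at 0.85x yields more audio per character than this estimate suggests.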

Quick answers

FAQ for therapists

Is text-to-speech HIPAA compliant for therapists?

Text-to-speech itself is HIPAA-neutral. The risk is what you put in. Convert only de-identified, generic materials such as psychoeducation, coping skills, and mindfulness scripts. Never paste session notes, client names, MRNs, or anything tied to a specific person. FreeTTS does not retain text after generation; audio files are deleted after 30 days on paid plans and immediately for anonymous sessions. The HIPAA-safe checker above runs locally in your browser and catches common PHI patterns before you copy anything anywhere.

Can I read my session notes back to proofread?

Yes, but only after stripping identifying details. A common workflow: copy the note, replace the client name with 'the client,' delete the date, remove any IDs or contact info, then paste into FreeTTS and listen at 1.2x. You catch awkward phrasing and missed sections far faster than re-reading silently. Many therapists report cutting note-revision time roughly in half. A handful report catching errors they had been missing for years.

What is the best voice for guided relaxation scripts?

A warm female voice at 0.85 to 0.9 speed. Ava (American English, multilingual) and Jenny (American English) are the most reliable picks for relaxation content. Pair with SSML break tags between cues so the silence carries the regulation work. Note: SSML break and phoneme tags pass through on the PRO and Creator endpoints; the free tier builds SSML internally and does not accept user markup. Test the first 30 seconds before generating an 8-minute script.

Will my clients know the audio is AI?

Most will not, on neural voices. The only reliable tells are unusual stress patterns on rare words and the absence of breath sounds. For long-form material (15 minutes plus) or anything emotionally heavy, voice cloning is available on PRO and Creator. Many clinicians tell clients up front that they use AI-narrated audio for between-session content. Most clients respond positively to the transparency, especially when they realize it means more consistent material.

Is FreeTTS or Otter.ai the right tool for therapists?

Different categories. Otter transcribes spoken audio into text, with HIPAA available only on Enterprise plans (custom pricing, sales call required). FreeTTS converts text into audio, free to try, $19 a month for PRO with 1M characters a month. If you want to dictate notes by speaking, you need transcription. If you want to listen to written content or build audio handouts, you need TTS. Many therapists use both, on different parts of their week.

Can I bill audio handouts as part of my service?

The Creator plan ($39/mo) includes a commercial license, which covers workshop-scale distribution as well as audio materials you give your own clients as part of paid therapy. PRO ($19/mo) covers personal use plus client-facing handouts in the course of treatment. If you are running a group practice or selling a self-help course, Creator is the right tier. If you are a solo clinician using audio for your own caseload, PRO is fine.

Try it on a real worksheet

Open the studio. Paste a generic CBT thought record. Hear it. The whole demo is 90 seconds. You will know if it fits your practice immediately.

Open the studio | Worksheet workflow guide

Last reviewed April 2026. Sources cited: APA 2022 Practitioner Pulse, NIMH 2021, NASW workforce data 2021, NCLD/IDA 2020, Mayer Cognitive Theory of Multimedia Learning (2014), Norcross and Lambert (2018), HHS Healthy People 2030. Related guides: Therapy worksheet narration, CEU coursework, Voice cloning for clinicians.
