Convert any CBT, DBT, or mindfulness worksheet into audio with a warm AI voice for $19 a month. Audio worksheets get completed at meaningfully higher rates than PDF handouts, especially among clients with ADHD, dyslexia, or auditory-learning preferences. PRO at $19/mo handles 99% of the workflow with the studio's built-in warm voices. Established practices that want their own voice on every handout can add cloning later. No IT approval, no enterprise sales call. Free to try right now.
PRO: $19/mo. Creator: $39/mo (commercial license, voice cloning, distribution rights).
30-second preview
A 5-4-3-2-1 grounding script narrated three different ways. Aria is the default warm female voice; Guy is the calm male alternative; the whisper-soft read shows the expressive style that PRO unlocks for relaxation work.
Aria, warm female
Aria standard voice, default cadence
Guy, calm male
Guy standard voice, default cadence
Real whispering
Aria with the whispering expressive style
Free tier covers all 400+ standard voices at default cadence. The whispering sample uses Aria with the real expressive whispering style, unlocked on PRO ($19/mo) along with 94 other emotional styles and Multilingual HD voices.
The completion problem
Audio versions of CBT and DBT worksheets see meaningfully higher between-session homework completion than PDF versions.
Therapy homework completion is the dirty secret of evidence-based practice. Research consistently finds substantial non-completion: figures around 30 to 50 percent are widely cited in the CBT literature, with methodology varying study to study (Cognitive and Behavioral Practice work has tracked this for years). Whatever the exact number, the honest version is: a meaningful chunk of clients never open the PDF you handed them Tuesday afternoon.
The reasons cluster around friction. The PDF is buried in their email. They forgot to print it. The seven thought-record columns look intimidating on a small screen. Reading after work is exhausting. They have ADHD and the wall of text shut them down before they parsed the first prompt. Pick your favorite. They are all real.
Audio dissolves most of those frictions in one move. A 4-minute MP3 plays on the bus, during dishes, on a walk, in the car between work and pickup. Clients with ADHD listen while pacing. Clients with dyslexia listen instead of decoding. Clients with auditory preference (a much bigger group than you would expect) just absorb it better as audio. You did not change the content. You changed the format. Completion goes up.
And here is the thing. Most therapists assume audio is a niche accommodation for a handful of clients. In practice, it is closer to a default-on improvement. Once families and clients have it as an option, they stop asking for the PDF.
Format mismatch
Therapist Aid is great. Beck Institute templates are great. Whatever PDFs your training program shipped with are probably great. The gap is not the content. The gap is the format.
Roughly 15 to 20 percent of the U.S. population has a language-based learning disability such as dyslexia (NCLD/IDA, 2020). Adult ADHD prevalence sits in the 4 to 5 percent range. Anxiety disorders affect upwards of 30 percent of U.S. adults at some point. These overlap, and they are over-represented in mental-health caseloads, because they are often the reason people end up in your office. The math says a meaningful chunk of your clients would prefer audio if it were available, even if they have not specifically requested it.
Multimodal learning is well documented. Mayer's Cognitive Theory of Multimedia Learning (2014), a foundational reference in instructional design, reports that combining audio and visual channels tends to outperform either modality alone for retention. Translated to practice: handing a client a printed worksheet AND a 4-minute audio narration of the same content does not just help neurodiverse clients. It helps everyone, a little.
The biggest wins
Three groups see the biggest improvements when you offer audio versions:
Clients with ADHD. The ability to listen while moving, fidgeting, or doing something with their hands turns 4 minutes of unbearable focus into 4 minutes of easy absorption. Many ADHD clients report this is the single most useful change a therapist has ever made to their homework experience.
Clients with dyslexia or low literacy. Reading is cognitively expensive. Listening is cheap. For these clients the math is not subtle. Audio is the difference between doing the homework and not.
Clients with high anxiety. Sustained reading when anxious is hard. The eyes skip lines, the brain rereads, focus collapses. A calm voice walking through the same content removes the cognitive overhead. Many clinicians notice their anxious clients are the ones most enthusiastic about audio versions.
And then there is the cross-cutting group nobody really tracks: clients who simply have an auditory-learning preference. Estimates vary wildly because "learning style" research has been a mess for years, but the practical observation is real. Some people just absorb audio better. Offer both formats. Let them pick.
The workflow most therapists actually use
The fastest way to convert a therapy worksheet to audio is FreeTTS PRO at $19/mo, which renders a 1,000-word handout in roughly 30 seconds with a warm AI voice.
Here is the part marketing brochures usually skip. You do not need voice cloning. You do not need the Creator tier. You do not need a $99 per month enterprise plan. PRO at $19 a month covers the realistic scope of a solo or small-group therapy practice producing audio handouts.
What PRO gets you, in plain English: 1 million characters of generation per month (roughly 14 hours of audio output), up to 10,000 characters per single generation (enough for a 12-minute body scan in one go), HD neural voices including warm female and male options, no watermark on output, SSML support for the pause tags that make relaxation scripts work, and voice cloning if you want it later. All for $19. No setup fee. No annual lock-in. Cancel from a settings page.
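The 10,000-character limit per generation matters once a script runs long. Below is a minimal sketch of splitting a script at sentence boundaries so each piece fits in one generation. It assumes the limit counts plain characters including spaces, which is not confirmed here, and the function name `split_script` is hypothetical.

```python
import re

MAX_CHARS = 10_000  # per-generation limit quoted above for PRO


def split_script(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, breaking after sentences."""
    # Split after ., !, or ? followed by whitespace; punctuation stays attached.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        if len(sentence) > max_chars:
            raise ValueError("A single sentence exceeds the per-generation limit")
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Current chunk is full: store it and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can be generated separately and the resulting MP3 files shared as parts one, two, and so on. Most worksheets in the 200 to 800 word range will come back as a single chunk.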
In practice: a typical practice produces 3 to 5 new audio worksheets a month after the initial library is built. Each is 200 to 800 words. At roughly 6 characters per word, that is about 3,600 to 24,000 characters of new content per month, against a 1 million character cap. You will never hit the limit on PRO unless you are actively running an online course, in which case Creator is the right tier.
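As a rough check on that budget, here is a small sketch that estimates monthly character usage from worksheet count and length. The figure of 6 characters per word (an average English word plus one space) is an assumption, not a number from the pricing page.

```python
CHARS_PER_WORD = 6        # assumption: average English word plus one space
MONTHLY_CAP = 1_000_000   # PRO character allowance quoted above


def monthly_chars(worksheets: int, words_each: int) -> int:
    """Estimate characters generated in a month."""
    return worksheets * words_each * CHARS_PER_WORD


low = monthly_chars(3, 200)    # 3 short worksheets
high = monthly_chars(5, 800)   # 5 long worksheets
print(f"{low:,} to {high:,} characters per month")
print(f"High estimate uses {high / MONTHLY_CAP:.1%} of the cap")
# prints "3,600 to 24,000 characters per month"
# prints "High estimate uses 2.4% of the cap"
```

Even the high estimate leaves most of the monthly allowance unused, which is why the cap is rarely a concern for ordinary handout production.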
The conversion list
Some worksheets convert beautifully to audio. Others do not. Here is the list that does:
Sequential body relaxation with timed pauses. Audio is the natural format.
Slow guided attention through body regions. Pacing is the entire experience.
Audio narration of the prompts plus printed worksheet for the writing portion.
Reading the steps aloud reduces avoidance better than scanning a list.
TIPP, ACCEPTS, IMPROVE walkthroughs. Audio works in the moment of distress.
5-4-3-2-1, ice grounding, sensory anchoring. Same logic as DBT.
Spoken in a warm voice, repeated. Audio is the obvious format.
Bedtime regulation, slow voice at 0.85x, longer pauses.
What does not convert well: anything that is mostly a grid or table (Socratic questioning matrices, multi-column behavior logs), anything where the visual layout carries the meaning, and anything that requires the client to circle multiple-choice answers. For those, hybrid (printed worksheet plus an optional audio narration of the instructions) is the move.
Step by step
Visit freetts.org/text-to-speech. No login required to test. Have your worksheet ready in another tab or document so you can copy from it.
Pick a thought record template, a body scan script, a distress-tolerance walkthrough, or any CBT/DBT/mindfulness handout you would hand a stranger in a workshop. No client names, no dates, no identifying details.
Ava or Jenny work well for relaxation work. Drop the speed to 0.85x or 0.9x. Listen to the first 30 seconds before generating the whole thing. If the cadence is off, swap voices and try again.
For body scans, breathing exercises, or guided imagery, insert pause tags between cues. Example: 'breathe in. <break time="800ms"/> breathe out.' SSML break and phoneme tags pass through on PRO and Creator. The pauses are where the regulation work happens.
Click Generate, click Download, drop the MP3 into your client portal (SimplePractice, TheraNest, Hushmail) or password-protected practice site. Source text is deleted server-side. Audio files are retained for 30 days on paid plans.
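When a handout has many cues, the pause tags from the steps above can be inserted with a short script instead of by hand. This is a minimal sketch that assumes the studio accepts standard SSML `<break>` tags. Wrapping the result in a `<speak>` root is an assumption about what the input box expects, and the helper name `cues_to_ssml` is hypothetical.

```python
from xml.sax.saxutils import escape


def cues_to_ssml(cues: list[str], pause_ms: int = 800) -> str:
    """Join spoken cues with SSML break tags of pause_ms milliseconds."""
    if pause_ms < 0:
        raise ValueError("pause_ms must be non-negative")
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, <, > so cue text cannot break the SSML markup.
    body = f" {pause} ".join(escape(cue.strip()) for cue in cues if cue.strip())
    return f"<speak>{body}</speak>"


print(cues_to_ssml(["Breathe in.", "Breathe out."]))
# prints <speak>Breathe in. <break time="800ms"/> Breathe out.</speak>
```

Longer values such as `pause_ms=2000` may suit body scans, where the silence between cues carries more of the exercise. Listen to a short sample before generating the whole script.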
Build it once, use it for years
Most therapists who do this end up with a rotating set of about 12 to 15 audio handouts that cover the bulk of weekly homework. Build the starter library in one focused 2 to 3 hour block. After that, you add a new one only when a specific client need pops up. The same files keep getting reused for years. Highest-ROI admin time of the year.
A practical starter list: PMR (long version), PMR (short version), body scan, square-breathing exercise, 5-4-3-2-1 grounding, urge-surfing script, self-compassion break, RAIN (recognize/allow/investigate/nurture), thought-record narration, behavioral activation menu, sleep wind-down, and one general-purpose affirmation set. That covers roughly 80 percent of weekly homework needs across most therapy modalities.
Advanced upgrade
Voice cloning is an advanced upgrade for therapy practices, not a requirement. Most worksheets work just as well with standard neural voices.
For established practices with strong therapeutic alliances, group practices that want consistent voicing across clinicians, or anyone running a paid online course where voice consistency is part of the brand, voice cloning makes sense. The Creator plan at $39 a month includes higher monthly limits, multi-clone support, and most importantly the commercial license that covers distributing audio handouts as part of paid therapy and at workshop scale. Voice cloning is also available on PRO for personal use, with lower limits.
Honest framing: cloning is not a requirement. It is a nicety. Most therapists who try the standard voices on PRO never bother upgrading. The bar is not "does it sound like me." The bar is "does it sound calm and warm enough that clients actually listen." Standard PRO voices clear that bar fine.
Distribution that does not get you in trouble
Generic therapeutic audio is low risk when distributed properly. The keyword is "generic": no client names in the audio, no identifying details, content you would happily hand a stranger in a workshop. Once you keep the content clean, distribution is straightforward.
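One way to support the "generic" rule is a quick automated scan before a script is pasted into any tool. This is a rough sketch that assumes simple regular expressions for numeric dates, email addresses, and US-style phone numbers. It cannot detect names and is no substitute for reading the script yourself. The function name `find_identifiers` is hypothetical.

```python
import re

# Illustrative patterns only; they will miss many identifier formats.
PATTERNS = {
    "date": re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_identifiers(text: str) -> dict[str, list[str]]:
    """Return matches per pattern; an empty dict means nothing was flagged."""
    hits = {name: rx.findall(text) for name, rx in PATTERNS.items()}
    return {name: found for name, found in hits.items() if found}


print(find_identifiers("Call 555-123-4567 after 3/14/2024."))
# prints {'date': ['3/14/2024'], 'phone': ['555-123-4567']}
```

An empty result only means none of these patterns matched. A script that mentions a client by first name would still pass, so treat the scan as a second pair of eyes, not a gate.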
Channels that work cleanly: SimplePractice client portal, TheraNest portal, Hushmail encrypted email, Box for Healthcare, password-protected page on your practice site, private link with a 7-day expiry. Channels to avoid for any audio that could connect to a specific client: regular email, Dropbox public links, anything posted on social media, unencrypted text message attachments.
The Creator plan at $39 a month includes a commercial license that explicitly covers distributing audio handouts as part of paid therapy and at workshop scale. PRO covers personal use plus client-facing handouts in the course of treatment. If you are running a group practice or selling content as a course, choose Creator. Otherwise PRO is fine.
Real comparison
Therapists choose FreeTTS PRO over Therapist Aid because PRO converts any worksheet to audio, while Therapist Aid only offers pre-made PDF templates.
| Tool | What it gives you | Best for | Price (approximate; verify before buying) |
|---|---|---|---|
| FreeTTS PRO | Convert any text into audio. Use any worksheet you want. | Audio versions of YOUR existing worksheets, in YOUR voice if you clone. | $19/mo |
| Therapist Aid (Plus) | Premium pre-made PDF worksheets. No audio conversion. | Therapists who want curated PDF templates and do not need audio. | ~$79/yr (verify) |
| Sanvello | Mental-health app for clients with built-in coping content. | Clients who want a self-help app between sessions, not therapist tooling. | ~$9/mo (verify, varies) |
| Recording yourself | Real you, real voice, real time. | Practices that have already invested in mic + editor + a willing clinician. | Mic ~$100, editing time = your hourly rate |
These are complementary, not competing. Many therapists use Therapist Aid PDFs as templates, then run them through FreeTTS to make audio versions. Used together, they cost less than trying to get good results from either one alone.
Open the studio. Paste a generic CBT thought record or a body scan script. Hear it. The whole demo takes about 90 seconds. You will know if it fits your practice immediately.
Last reviewed April 2026. Sources cited: NCLD/IDA 2020, Mayer Cognitive Theory of Multimedia Learning (2014), Cognitive and Behavioral Practice (homework completion literature), Norcross and Lambert (2018).