Text-to-Speech (TTS) is a technology that converts written text into spoken audio using synthetic voices.
What Does TTS Do? 🗣️
TTS reads digital text aloud. For example:
You type "Hello, how are you?"
The TTS system uses a computer-generated voice to say it out loud.
How Does TTS Work? ❓
1. Text Input
You enter or upload the text you want to convert to speech.
2. Linguistic Analysis
The system analyzes pronunciation, punctuation, and intonation to prepare the text for natural-sounding speech.
3. Speech Synthesis
The system generates the audio, either by assembling pre-recorded voice samples or by using an AI-generated voice.
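To make the linguistic analysis step more concrete, here is a minimal Python sketch of the kind of text preparation a TTS system might do before synthesis. It is an illustration only, not CAMB.ai's actual pipeline: the abbreviation table, the digit-by-digit number reading, and the function names `normalize` and `split_sentences` are all assumptions made for this example.

```python
import re

# Hypothetical abbreviation table; a real system would use a much larger lexicon.
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "etc.": "et cetera"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]


def spell_digits(match: re.Match) -> str:
    """Read a number digit by digit, e.g. "42" becomes "four two"."""
    return " ".join(ONES[int(d)] for d in match.group())


def normalize(text: str) -> str:
    """Expand abbreviations and digits so every token is a speakable word."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    return re.sub(r"\d+", spell_digits, text)


def split_sentences(text: str) -> list:
    """Split after sentence-ending punctuation; these boundaries guide pauses and intonation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


print(split_sentences("Hello, how are you? I am fine."))
print(normalize("Dr. Lee lives at 42 Main St."))
```

Production systems go much further (dates, currencies, context-dependent abbreviations, phonetic lookup), but the idea is the same: turn raw text into units the synthesizer knows how to speak.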
What Settings Can I Adjust in CAMB.ai's TTS? ⚙️
Select Voice: Choose a default voice from the portal, or create and upload your own. See the voices page for more details.
Language: Select the language of the input text.
How Do I Generate and Download Audio? ▶️
Steps:
1. Click "Generate Speech" to create the audio from your text input.
2. Once the audio is generated, listen to it and click "Download" to save the output.
Can I Access My Previous TTS Outputs? (TTS History) 🕒
Yes, you can view your TTS history to access previously generated audio files.
You can try TTS by clicking "Try with sample text" and then clicking "Generate Speech" to hear the sample audio.
Customize Speech Using SSML Tags
SSML (Speech Synthesis Markup Language) allows you to control how text is spoken by a text-to-speech system. By adding SSML tags to your script, you can introduce pauses, fine-tune pronunciation, and improve the overall natural flow of speech.
Below are some commonly used SSML tags that help enhance voice output:
<break>
Use this tag to insert pauses between words or phrases, giving you better control over speech pacing and emphasis.
Example:
I can help you join your <break time="1.5s"/> tomato fast.
This adds a 1.5-second pause before the word “tomato”.
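If you prepare scripts programmatically, a small helper can insert the pause for you. The Python sketch below is only an illustration: the function names `break_tag` and `pause_before` are made up for this example and are not part of any CAMB.ai tool.

```python
import re


def break_tag(seconds: float) -> str:
    """Return an SSML <break> tag for a pause of the given length in seconds."""
    if seconds <= 0:
        raise ValueError("pause length must be positive")
    # The "g" format drops a trailing ".0", so 2.0 becomes "2s" and 1.5 stays "1.5s".
    return f'<break time="{seconds:g}s"/>'


def pause_before(text: str, word: str, seconds: float) -> str:
    """Insert a pause before the first whole-word occurrence of `word`."""
    pattern = rf"\b{re.escape(word)}\b"
    return re.sub(pattern, lambda m: f"{break_tag(seconds)} {m.group()}", text, count=1)


print(pause_before("I can help you join your tomato fast.", "tomato", 1.5))
# prints: I can help you join your <break time="1.5s"/> tomato fast.
```

The match is case-sensitive and only the first occurrence is changed, which keeps the helper predictable for short scripts.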
<phoneme>
This tag lets you customize how a word is pronounced using phonetic notation (such as IPA).
Example:
I can help you join your <phoneme alphabet="ipa" ph="təˈmɑː.tuː">tomato</phoneme> fast.
This ensures the word “tomato” is pronounced exactly as specified using IPA.
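For scripts with several custom pronunciations, the tags can be generated from a small word-to-IPA table. This Python sketch is an illustration under stated assumptions: `phoneme_tag` and `apply_pronunciations` are hypothetical helper names, and the lexicon simply reuses the IPA string from the example above.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def phoneme_tag(word: str, ipa: str) -> str:
    """Wrap `word` in an SSML <phoneme> tag carrying its IPA transcription."""
    # quoteattr adds the surrounding quotes and escapes characters that are unsafe in attributes.
    return f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>{escape(word)}</phoneme>"


def apply_pronunciations(text: str, lexicon: dict) -> str:
    """Tag every whole-word, case-sensitive match of a lexicon entry in one pass."""
    if not lexicon:
        return text
    # A single alternation pattern avoids re-matching words inside tags already inserted.
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in lexicon) + r")\b")
    return pattern.sub(lambda m: phoneme_tag(m.group(), lexicon[m.group()]), text)


lexicon = {"tomato": "təˈmɑː.tuː"}  # hypothetical lexicon for this example
print(apply_pronunciations("I can help you join your tomato fast.", lexicon))
# prints: I can help you join your <phoneme alphabet="ipa" ph="təˈmɑː.tuː">tomato</phoneme> fast.
```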
💡 Tip: Using SSML thoughtfully can make your voiceovers sound more natural, expressive, and listener-friendly.
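A mistyped tag is an easy way to lose that benefit, so it can help to check a script before generating speech. The Python sketch below is an assumption-laden illustration, not a CAMB.ai feature: `check_ssml` is a hypothetical helper, the `<speak>` wrapper is added only so the XML parser has a single root element, and the allowed-tag list covers just the two tags described in this article.

```python
import xml.etree.ElementTree as ET

# Only the tags covered in this article; extend the set if your script uses others.
ALLOWED_TAGS = {"break", "phoneme"}


def check_ssml(script: str) -> list:
    """Return a list of problems found; an empty list means no problems were detected."""
    try:
        # Wrap the script in a root element so the parser accepts mixed text and tags.
        root = ET.fromstring(f"<speak>{script}</speak>")
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    return [
        f"unexpected tag <{el.tag}>"
        for el in root.iter()
        if el is not root and el.tag not in ALLOWED_TAGS
    ]


print(check_ssml('I can help you join your <break time="1.5s"/> tomato fast.'))
# prints: []
```

Because the check uses a strict XML parser, a bare `&` in the script is also reported as not well-formed; write it as `&amp;` if you use this helper.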





