Text-to-Speech (TTS) is a technology that converts written text into spoken audio using synthetic voices.
Click here to access this TTS tool.
What Does TTS Do? 🗣️
TTS reads digital text aloud. For example:
You type "Hello, how are you?"
The TTS system uses a computer-generated voice to say it out loud.
How Does TTS Work? ❓
1. Text Input
You enter or upload the text you want to convert to speech.
2. Linguistic Analysis
The system analyzes pronunciation, punctuation, and intonation to prepare the text for natural-sounding speech.
3. Speech Synthesis
It generates audio using either pre-recorded voice samples or AI-generated voices.
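The three steps above can be sketched as a toy pipeline. This is only an illustration of the idea and not how CAMB.ai's TTS is implemented: the abbreviation table, the pause lengths, and the function names analyze and synthesize are all assumptions made for the example, and the synthesis step is a placeholder that prints a description instead of producing audio.

```python
import re

# Hypothetical abbreviation table; a real system would use a far larger lexicon.
ABBREVIATIONS = {"Dr.": "Doctor", "Mr.": "Mister", "e.g.": "for example"}

# Hypothetical pause lengths (milliseconds) per closing punctuation mark.
PAUSE_MS = {".": 400, "?": 450, "!": 450}


def analyze(text: str) -> list[dict]:
    """Step 2 (toy version): expand abbreviations, split the text into
    sentences, and attach an intonation hint and a pause to each one."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    plan = []
    for sentence in sentences:
        last = sentence[-1]
        plan.append({
            "text": sentence,
            "intonation": "rising" if last == "?" else "falling",
            "pause_ms": PAUSE_MS.get(last, 200),
        })
    return plan


def synthesize(plan: list[dict]) -> str:
    """Step 3 placeholder: describes what would be spoken instead of
    generating audio."""
    return " | ".join(
        f"{p['text']} <{p['intonation']}, pause {p['pause_ms']}ms>" for p in plan
    )


text = "Hello, how are you? Dr. Lee is here."  # Step 1: text input
plan = analyze(text)
print(synthesize(plan))
```

Expanding abbreviations before splitting matters here: otherwise the period in "Dr." would be treated as the end of a sentence.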
What Settings Can I Adjust in CAMB.ai's TTS? ⚙️
Select Voice: Choose a default voice from the portal, or create and upload your own. Check here for more about voices.
Language: Select the language of the input text.
Enter Your Text
Start by typing or pasting the text you want to convert into speech in the Input text box.
Select Your Intent
Choose what you’re trying to build from the Select your intent dropdown. This helps optimize the speech output for your use case.
Expressive Dubbing
Best for emotional and performance-driven voiceovers, such as movies or dramatic scenes.
Audiobooks
Designed for long-form narration with clear pronunciation and consistent pacing.
Digital Media
Ideal for ads, social media videos, explainer videos, and online marketing content.
Real-time Voice Agents
Optimized for interactive systems that respond instantly, such as virtual assistants.
Call Centers
Suitable for IVR systems, customer support, and automated call handling.
Live Conversational AI
Built for natural, back-and-forth conversations with human-like flow.
Film & TV Dubbing
Focuses on professional dubbing with natural timing and cinematic delivery.
Precise Prosody Control
Enables fine control over tone, pitch, pauses, and emphasis for advanced voice tuning.
Creative Editing Workflows
Designed for post-production, sound design, and creative audio editing workflows.
Choose Model
Select the voice generation model. By default, MARS8-Pro is used for high-quality results.
Language and Voice
Source Language – Select the language of your input text.
Voice – Choose a voice style that fits your content (for example, Sports Commentary).
Output Type – Select the audio format for the generated file (e.g., FLAC).
Note: The model works best when the selected voice matches the source language.
Generate Speech
Once everything is set, click Start Generating Speech to create your audio output.
You can also explore automation by clicking Try it as an API.
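If you plan to automate these steps, it can help to collect the same choices in one place before sending them anywhere. The sketch below does only that: it gathers the settings described above and checks them locally. Every field name here is hypothetical and is not CAMB.ai's actual API schema, and no request is sent; refer to the API documentation behind Try it as an API for the real parameters.

```python
import json
from dataclasses import dataclass, asdict

# Intent names copied from the "Select your intent" list in this guide.
INTENTS = {
    "Expressive Dubbing", "Audiobooks", "Digital Media",
    "Real-time Voice Agents", "Call Centers", "Live Conversational AI",
    "Film & TV Dubbing", "Precise Prosody Control", "Creative Editing Workflows",
}


@dataclass
class TTSRequest:
    # All field names are hypothetical, not CAMB.ai's real API schema.
    text: str
    intent: str
    source_language: str
    voice: str
    voice_language: str
    model: str = "MARS8-Pro"    # default model named in this guide
    output_type: str = "FLAC"   # example output format from this guide

    def validate(self) -> list[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if not self.text.strip():
            problems.append("text is empty")
        if self.intent not in INTENTS:
            problems.append(f"unknown intent: {self.intent}")
        # The guide notes the model works best when these two match.
        if self.voice_language != self.source_language:
            problems.append("voice language differs from source language")
        return problems


request = TTSRequest(
    text="Hello, how are you?",
    intent="Audiobooks",
    source_language="English",
    voice="Sports Commentary",
    voice_language="English",
)
problems = request.validate()
payload = json.dumps(asdict(request), indent=2)
print(problems or payload)
```

Checking the voice language against the source language up front mirrors the note above, so a mismatch is caught before any audio is generated.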
How to Generate and Download Audio? ▶️
Steps:
Click on "Generate Speech" to create the audio from your text input.
Once the audio is generated, listen to it and click "Download" to save the output.
To try TTS without entering your own text, click "Try with sample text", then click "Generate Speech" to hear the sample audio.
Voice Settings (Advanced)
The Voice Settings section allows you to fine-tune how the generated voice sounds. These options help improve clarity, accuracy, and overall voice quality. All settings are optional and can be enabled or disabled as needed.
Clean Reference
This option cleans the reference audio used for voice generation.
Enable it to reduce minor inconsistencies and improve overall voice clarity.
Enhance Reference Audio Quality
Improves the quality of the reference audio by enhancing sharpness and reducing distortions.
Recommended when the uploaded reference audio is not studio-quality.
Maintain Source Accent
Preserves the original accent of the source voice.
Turn this on if you want the generated output to stay true to the speaker’s natural accent.
Enhance Named Entities
Improves the pronunciation of proper nouns such as names, places, brands, and technical terms.
Useful for professional content, narration, and informational audio.
Output Configuration
The output configuration section allows you to control how the final audio is generated and exported.
These settings help align the voice output with your project requirements.
Applying Settings
Each setting can be toggled on or off individually.
Once you’ve selected the required options, proceed with generating the audio to apply the changes.
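Because each voice setting is an independent on/off switch, the four toggles described above can be pictured as a small set of boolean flags. The sketch below is only a way to reason about them: the flag names are hypothetical, and the defaults of off are an assumption, since this guide does not state the default state of each toggle.

```python
from dataclasses import dataclass, asdict


@dataclass
class VoiceSettings:
    # Hypothetical flag names mirroring the four toggles in this guide.
    # Defaults are an assumption; the guide does not state them.
    clean_reference: bool = False
    enhance_reference_audio_quality: bool = False
    maintain_source_accent: bool = False
    enhance_named_entities: bool = False

    def enabled(self) -> list[str]:
        """Names of the toggles that are currently switched on."""
        return [name for name, on in asdict(self).items() if on]


# Example: clean the reference audio and improve proper-noun pronunciation.
settings = VoiceSettings(clean_reference=True, enhance_named_entities=True)
print(settings.enabled())  # → ['clean_reference', 'enhance_named_entities']
```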





