What Is Text-to-Speech (TTS)?

Quick guide: Understand what Text-to-Speech is, how it functions, and how to manage its settings and downloads.

Updated this week

Text-to-Speech (TTS) is a technology that converts written text into spoken audio using synthetic voices.


What Does TTS Do? 🗣️

  • TTS reads digital text aloud. For example:

    • You type "Hello, how are you?"

    • The TTS system uses a computer-generated voice to say it out loud.

How Does TTS Work? ❓

1. Text Input

You enter or upload the text you want to convert to speech.

2. Linguistic Analysis

The system analyzes pronunciation, punctuation, and intonation to prepare the text for natural-sounding speech.
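
To make this step more concrete, here is a toy Python sketch of the kind of preprocessing a TTS system might do before synthesis. It is not CAMB.ai's implementation: the abbreviation table, the intonation labels, and both function names are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"Dr.": "Doctor", "Mr.": "Mister"}

# Rough intonation hint for each sentence-ending punctuation mark (assumed labels).
INTONATION = {".": "falling", "?": "rising", "!": "emphatic"}


def normalize(text: str) -> str:
    """Expand a few abbreviations and collapse repeated whitespace."""
    for short, spoken in ABBREVIATIONS.items():
        text = text.replace(short, spoken)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> list[tuple[str, str]]:
    """Split text into sentences and tag each with an intonation hint."""
    sentences = re.findall(r"[^.?!]+[.?!]", text)
    return [(s.strip(), INTONATION[s[-1]]) for s in sentences]


print(split_sentences(normalize("Hello,  how are you?  Dr. Lee is in.")))
# [('Hello, how are you?', 'rising'), ('Doctor Lee is in.', 'falling')]
```

Real systems handle far more (numbers, dates, homographs, stress), but the idea is the same: turn written text into a form that tells the voice what to say and how to say it.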

3. Speech Synthesis

The system generates the audio using either pre-recorded voice samples or AI-generated voices.

What Settings Can I Adjust in CAMB.ai's TTS? ⚙️

  • Select Voice: Choose a default voice from the portal, or create and upload your own. For details, see the help article on voices.

  • Gender: Choose a Male, Female, or Neutral voice.

  • Language: Select the language of the input text.

  • Output Format: Choose from:

    • .flac – Free Lossless Audio Codec

    • .wav – Waveform Audio File Format

    • .adts – Audio Data Transport Stream

    • .pcm_s16le – 16-bit signed little-endian PCM

    • .pcm_s32le – 32-bit signed little-endian PCM
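
The two PCM formats are normally raw sample data with no file header, so many players will not open them directly. As a minimal sketch, assuming the download is headerless mono PCM, the Python standard library can wrap it in a WAV container. The function name and the 24000 Hz default sample rate are assumptions for this example; use the actual sample rate and channel count of your output.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 24000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap headerless little-endian PCM in a WAV container.

    sample_width is bytes per sample: 2 for pcm_s16le, 4 for pcm_s32le.
    The sample_rate default is an assumption, not a documented value.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# Stand-in data: one second of 16-bit silence at the assumed sample rate.
silence = b"\x00\x00" * 24000
wav_data = pcm_to_wav_bytes(silence)
```

To convert a real download, read the .pcm_s16le file in binary mode, pass its bytes to the function, and write the result to a file ending in .wav. For .pcm_s32le, pass sample_width=4.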


How Do I Generate and Download Audio? ▶️

Steps:

  1. Click "Generate Speech" to create the audio from your text input.

  2. Once the audio is generated, listen to it, then click "Download" to save the output.


Can I Access My Previous TTS Outputs? (TTS History) 🕒

  • Yes, you can view your TTS history to access previously generated audio files.