MARS8-PRO: Best Practices & Troubleshooting Guide

Why are voiceovers sometimes cut, or why are certain words not spoken and skipped entirely?

Issue

Some words in the voiceover are skipped, cut off, or not spoken at all.

Resolution

Identify and select the dialogue where words are missing or cut off.
Increase the dialogue duration by:
- Extending it directly on the waveform, or
- Manually adjusting the dialogue timestamps.
Regenerate the dialogue.

What to Expect?

After increasing the duration and regenerating the dialogue, the full dialogue should be spoken correctly. If words are still missing, slightly increase the duration again and regenerate.
If increasing the duration is not possible, shorten the sentence manually or use the Optimize Duration option when retranslating.

Why This Happens?

MARS8-Pro is optimized for dubbing, so it generates each voiceover to fit within the assigned dialogue duration. When the duration is too short for the text, the model may skip words or cut the audio off abruptly.
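A rough pre-check can flag dialogues that are likely too short before you regenerate. The sketch below is illustrative only: the function names and the 2.5 words-per-second speaking rate are assumptions, not MARS8-Pro behavior, and real speaking rates vary by language and speaker.

```python
def estimate_min_duration(text: str, words_per_second: float = 2.5) -> float:
    """Rough lower bound, in seconds, on how long the text takes to speak.

    words_per_second is an assumed average rate, not a MARS8-Pro setting.
    """
    word_count = len(text.split())
    return word_count / words_per_second


def is_duration_too_short(text: str, start: float, end: float,
                          words_per_second: float = 2.5) -> bool:
    """Return True if the dialogue window is shorter than the estimate."""
    return (end - start) < estimate_min_duration(text, words_per_second)


line = "Thank you all for coming to the meeting this morning"
# Ten words at 2.5 words per second need about 4 seconds.
print(is_duration_too_short(line, start=12.0, end=15.0))
print(is_duration_too_short(line, start=12.0, end=17.0))
```

Dialogues flagged this way are candidates for a longer duration or a shorter sentence, as described in the resolution above.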

Why does the speaker’s voice in the target language sometimes sound inconsistent with the speaker’s voice in the source language?

Issue

The speaker’s voice in the translated output sounds different from the original—unnatural, inconsistent, or mismatched in tone or character.

Resolution

Ensure each dialogue is assigned to the correct speaker.
Verify that the speaker's gender matches the original speaker.
Confirm that dialogues are accurately aligned to the waveform, starting and ending exactly when the speaker begins and finishes speaking.
Remove any unnecessary silence at the beginning or end of dialogues.
If a dialogue contains noticeable pauses or gaps, split it into smaller segments and align each segment precisely.
Regenerate the output after making corrections.
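The splitting step can be sketched as follows. This is a minimal illustration that assumes you already have the time spans where the speaker is actually talking (for example, read off the waveform); the function name and the 0.5-second gap threshold are hypothetical and not part of the product.

```python
def split_on_gaps(speech_spans, max_gap=0.5):
    """Group speech spans into segments, starting a new segment whenever
    the silence between two spans exceeds max_gap seconds.

    speech_spans: list of (start, end) times in seconds.
    Returns a list of (start, end) segments with no leading or
    trailing silence, because each segment begins and ends on speech.
    """
    segments = []
    for start, end in sorted(speech_spans):
        if segments and start - segments[-1][1] <= max_gap:
            # Short pause: extend the current segment.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            # Long pause (or first span): start a new segment.
            segments.append((start, end))
    return segments


spans = [(1.0, 2.0), (2.2, 3.0), (4.5, 5.0)]
print(split_on_gaps(spans))  # [(1.0, 3.0), (4.5, 5.0)]
```

Each resulting segment would then become its own dialogue, aligned to the start and end times shown.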

What to Expect?

With correct speaker assignment, gender settings, and waveform alignment, the translated voice will sound much closer to the original speaker in tone, character, and consistency.

Why This Happens?

Some dialogues are too short to generate a reliable voice clone.
In these cases, the system reuses voice references from other dialogues.
Incorrect speaker assignment, gender mismatches, misaligned waveforms, or internal gaps can lead to unsuitable voice selection and audible inconsistencies.

Why does the volume of a dialogue in the translated output sometimes fluctuate between high and low levels?

Issue

The translated voiceover volume fluctuates noticeably, shifting between loud and quiet levels.

Resolution

Carefully review the audio content within each dialogue segment.
Ensure dialogues contain only clean, steady speech.
Avoid including laughter, shouting, whispering, muffled speech, or distorted audio.
If the speaker uses different emotional tones (e.g., excited, calm, serious), split these into separate dialogue segments.
Regenerate the output after cleaning and segmenting the dialogues.

What to Expect?

The translated output will have more stable volume levels and a smoother, more natural listening experience.

Why This Happens?

Voice generation is influenced by the source audio characteristics.
Mixed vocal elements or varying emotional tones within a single dialogue can cause unstable volume and pitch in the output.

Why does the speaker in the target language sometimes not sound similar to how they sound in the source language?

Issue

The translated speaker does not closely resemble the original speaker’s voice.

Resolution

Check whether the source dialogue contains enough clean, continuous audio to be used as a voice reference.
If not, create a dedicated voice reference in the voice library using a longer, high‑quality recording.
Assign the new voice reference to the speaker.
Regenerate the output.

What to Expect?

With a sufficiently long voice reference, the translated speaker will sound more natural and closely match the original voice.

Why This Happens?

Short audio samples do not provide enough data for accurate voice cloning. Longer, cleaner references allow the system to better capture vocal characteristics.

How do I add natural breaks or pauses in my dialogue?

To introduce natural pauses or breaks in a dialogue, you can insert the <|breath|> tag directly into the dialogue text. This tag tells the system to add a short, natural-sounding pause at that point in the voiceover.

How to Use?

Step 1: Edit the dialogue text where you want a pause to occur.
Step 2: Insert the <|breath|> tag at the desired position in the sentence.
- Example: Thank you for listening <|breath|> I needed this
Step 3: Regenerate the voiceover.

What to Expect?

A subtle pause will be added at the position of the <|breath|> tag. In the example above, the voiceover will pause briefly after the word “listening”, creating a more natural and expressive delivery.

Best Practices:

Use <|breath|> sparingly to avoid over-pausing.
Place the tag where a natural breath or pause would occur in normal speech.
Avoid inserting the tag mid-word or too frequently within a single sentence.
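If you prepare dialogue text in a script before pasting it in, the tag placement and the best practices above can be checked automatically. The sketch below is one possible approach: only the <|breath|> tag itself comes from this guide, while the helper names and the limit of two tags per sentence are assumptions chosen for illustration.

```python
import re

BREATH_TAG = "<|breath|>"


def insert_breath(text: str, after_word: int) -> str:
    """Insert the breath tag after the given word (1-based count)."""
    words = text.split()
    if not 1 <= after_word < len(words):
        raise ValueError("after_word must fall between two words")
    return " ".join(words[:after_word] + [BREATH_TAG] + words[after_word:])


def check_breath_usage(text: str, max_tags_per_sentence: int = 2) -> list:
    """Return warnings for tags placed mid-word or used too often."""
    warnings = []
    # A tag glued to a letter or digit on either side suggests mid-word use.
    if re.search(r"\w<\|breath\|>|<\|breath\|>\w", text):
        warnings.append("tag touches a word; surround it with spaces")
    # Split into sentences after ., ! or ? and count tags in each one.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if sentence.count(BREATH_TAG) > max_tags_per_sentence:
            warnings.append("too many tags in: " + sentence)
    return warnings


tagged = insert_breath("Thank you for listening I needed this", after_word=4)
print(tagged)  # Thank you for listening <|breath|> I needed this
print(check_breath_usage(tagged))  # []
```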

How can I get the right accent for my voiceover?

You can control how the accent sounds in your generated voiceover by adjusting two settings: Accent Boost and Stability.

Accent Boost (for the right accent):

Range: 0.0 → 1.0
Default: 0.0
Left (Lower values): Keeps the accent of the reference voice
Right (Higher values): Shifts toward a native accent in the target language

Description:

Controls how much the generated voice shifts accent-wise. Lower values preserve the original/reference accent, while higher values make the voice sound more native to the target language.

Stability (for voice control):

Range: 0.0 → 1.0
Default: 0.7
Left (Lower values): Less stable, more variation in tone and accent
Right (Higher values): More stable, more consistent tone

Description:

Controls how steady and uniform the voice sounds. Higher values reduce accent variation and make the speech smoother and more neutral.
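When these two settings are tracked outside the interface (for example, in project notes or a script), it helps to keep the documented ranges and defaults in one place. The sketch below is a hypothetical container, not an official API: the class and field names are assumptions, while the 0.0 to 1.0 range and the defaults of 0.0 and 0.7 come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical holder for the two accent-related settings."""
    accent_boost: float = 0.0  # documented default
    stability: float = 0.7     # documented default

    def __post_init__(self):
        # Both settings share the documented 0.0 to 1.0 range.
        for name in ("accent_boost", "stability"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(
                    f"{name} must be between 0.0 and 1.0, got {value}"
                )


defaults = VoiceSettings()
more_native = VoiceSettings(accent_boost=0.8)
print(defaults)
print(more_native)
```

Rejecting out-of-range values early avoids recording a setting that the sliders could never produce.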

How to Use?

Step 1: Generate once using default settings.

Step 2: If the accent sounds correct → No changes needed.

Step 3: If adjustments are required:
- Increase Accent Boost → More native target accent
- Decrease Accent Boost → More reference/source accent

Step 4: Adjust Stability if the voice sounds too varied or inconsistent.

Recommended Accent Boost Values:

You can try the following accent boost values depending on the result you want:

0.0 — Fully keeps the reference voice accent
0.1 — Strong reference accent, very little change
0.3 — Mostly reference accent with slight softening
0.6 — Balanced mix of reference + native accent
0.8 — Mostly native target accent
0.9 — Strong native target accent

Test across these levels to find what sounds most natural for your project.
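One way to organize such a test pass is to list one trial per Accent Boost level while holding Stability fixed, then generate and listen to each in turn. The sketch below only builds that list; the names are hypothetical, and the actual generation step happens in the product and is not shown.

```python
# Accent Boost levels listed above, each with a short label.
ACCENT_BOOST_LEVELS = [
    (0.0, "fully reference accent"),
    (0.1, "strong reference accent"),
    (0.3, "mostly reference accent"),
    (0.6, "balanced mix"),
    (0.8, "mostly native target accent"),
    (0.9, "strong native target accent"),
]


def build_trials(stability: float = 0.7) -> list:
    """Return one trial per Accent Boost level at a fixed Stability."""
    return [
        {"accent_boost": level, "stability": stability, "label": label}
        for level, label in ACCENT_BOOST_LEVELS
    ]


for trial in build_trials():
    print(trial)
```

Keeping Stability constant across the trials makes it easier to attribute any difference you hear to Accent Boost alone.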

What to Expect?

Lower Accent Boost → Keeps the reference voice accent
Higher Accent Boost → Sounds more native to the target language
Higher Stability → More even tone, less accent variation

Troubleshooting:

If the output sounds distorted or none of the above works:

Set Stability to 1.0
Regenerate the output

This typically stabilizes the voice and reduces distortion artifacts.
