SOTA VOX TTS Speech synthesis

Convert text and other content to speech. Use it for voice robots, phone calls, and voice menus. Create your own unique voice for any task.

Advantages
Brand Voice - We will help you create a unique voice for voicing your branded content on any channel.
Cloud | On-Premise
Use it as a cloud service or deploy the software on your own GPU servers.
Realistic voices
High-quality speech synthesis based on the Tacotron 2 and WaveNet neural network architectures.
High speed
Minimal pauses and delays in speech output for more natural dialogue.
API
REST API and gRPC support. Easy integration via HTTP/HTTPS protocol.
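
As a rough illustration of what integration over HTTP could look like, the sketch below builds a POST request with a JSON body using only the Python standard library. The endpoint path `/synthesize`, the JSON fields `text` and `voice`, the voice name `en_female`, and the host `tts.example.com` are all hypothetical placeholders chosen for this example; the actual routes and parameters are defined by the service's API documentation.

```python
import json
import urllib.request


def build_synthesis_request(base_url: str, text: str, voice: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for speech synthesis.

    Assumption: the path "/synthesize" and the "text"/"voice" fields are
    illustrative placeholders, not the documented API of the service.
    """
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/synthesize",  # hypothetical endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host and voice name, for illustration only.
    req = build_synthesis_request("https://tts.example.com", "Hello!", "en_female")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=10).read()
```

Building the request separately from sending it keeps the example runnable without network access and makes it easy to swap in the real endpoint and fields.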

Synthesize high-quality voice for any task

IVR
Telephony
Microphone
Alert systems

Technical features

Russian language:
1. Male
2. Female

English language:
1. Male
2. Female

Kazakh language:
1. Male
2. Female

Uzbek language:
1. Male
2. Female

Software requirements:
1. 64-bit Linux-based operating system (CentOS 7.x or later, Debian 10.x or later, Astra Linux 1.7 or later)
2. Docker, nvidia-docker, docker-compose
3. CUDA driver version 10.2 or later

Minimum hardware requirements:
- 2 physical CPU cores at 2.4 GHz
- NVIDIA GPU with CUDA support and at least 8 GB of video memory
- 8 GB of RAM
- 30 GB of disk space
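
Before an on-premise deployment, the minimums listed above can be checked with a small preflight script. The following is a minimal sketch, not an official installer check: the function names are made up for this example, the GPU and CUDA driver are not inspected, and `probe_host` relies on Linux-only `os.sysconf` keys.

```python
import os
import shutil

# Thresholds taken from the requirements listed above.
MIN_CPU_CORES = 2
MIN_RAM_GB = 8
MIN_DISK_GB = 30
REQUIRED_TOOLS = ("docker", "nvidia-docker", "docker-compose")


def find_problems(cpu_cores, ram_gb, disk_gb, tools_found):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"need {MIN_CPU_CORES} CPU cores, found {cpu_cores}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"need {MIN_RAM_GB} GB RAM, found {ram_gb:.1f}")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"need {MIN_DISK_GB} GB free disk, found {disk_gb:.1f}")
    for tool in REQUIRED_TOOLS:
        if tool not in tools_found:
            problems.append(f"{tool} not found on PATH")
    return problems


def probe_host(path="/"):
    """Measure the current host (Linux only). GPU checks are out of scope."""
    # Note: os.cpu_count() reports logical cores, which may exceed physical cores.
    cpu_cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    # Decimal gigabytes are used so a nominal 8 GB machine is not rejected.
    ram_gb = ram_bytes / 10**9
    disk_gb = shutil.disk_usage(path).free / 10**9
    tools_found = {t for t in REQUIRED_TOOLS if shutil.which(t)}
    return cpu_cores, ram_gb, disk_gb, tools_found


if __name__ == "__main__":
    for problem in find_problems(*probe_host()):
        print("WARNING:", problem)
```

Separating the measurement (`probe_host`) from the comparison (`find_problems`) keeps the threshold logic testable on any machine.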

Audio format:
1. Sampling rate: 22050 Hz
2. Codec: pcm_s16le (signed 16-bit little-endian PCM)
3. Number of channels: 1 (mono)
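
The audio parameters above are enough to interpret a raw sample stream. As a sketch, assuming the audio arrives as headerless PCM bytes (the page does not say whether a WAV header is included), the Python standard library `wave` module can wrap it into a playable WAV file and the duration follows from the byte count. The helper names are made up for this example.

```python
import io
import wave

SAMPLE_RATE_HZ = 22050   # sampling rate from the list above
SAMPLE_WIDTH_BYTES = 2   # pcm_s16le: signed 16-bit little-endian samples
CHANNELS = 1             # mono


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap headerless PCM bytes in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()


def duration_seconds(pcm: bytes) -> float:
    """Duration = bytes / (rate * bytes per sample * channels)."""
    return len(pcm) / (SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS)


if __name__ == "__main__":
    silence = b"\x00\x00" * SAMPLE_RATE_HZ  # one second of silence
    print(duration_seconds(silence))
    with open("out.wav", "wb") as f:
        f.write(pcm_to_wav(silence))
```

At these settings one second of audio occupies 22050 × 2 × 1 = 44100 bytes, which is useful for sizing buffers in telephony or IVR integrations.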