SOTA VOX Kit

AI platform for creating products and business solutions based on speech technologies.

Single API for access to all services

AREAS OF APPLICATION

ASR / Speech recognition
Integrate speech recognition into any app, service, or bot.
NLU / Natural language processing and understanding
Use advanced text analysis features to extract meaningful data, named entities, topics, facts, relationships, and keywords.
Voice ID / Voice identification
Improve security and speed of service with text-independent voice recognition in any language and with high accuracy.
Speech analytics systems
Solutions for automating customer communication analysis and service quality control.
Voice robots
We will teach your voice robots and assistants to communicate in natural language.
Meeting minutes
Use SOTA VOX Kit in offline meeting and online conference recording systems.
Subtitles for TV and movies
Create subtitles for TV shows, broadcasts, podcasts or videos.
Protecting your business from fraud
Recognize customers by voice in any language. Reduce customer identification time and minimize fraud risks.
Content dubbing
Voice over any content: videos, audiobooks, instructions, website interfaces.

Functional features

Speech synthesis
Using pauses
Pronouncing abbreviations
Support for audio templates and pre-recorded audio
Speech synthesis in Russian
Speech intonation in accordance with generally accepted rules
Automatic stress placement
Speech recognition
Automatic speaker separation in mono recordings
Automatic language detection
Detection of gender, age, and emotion in the operator and client channels
Transcription with over 95% accuracy
Supported languages:
- Russian,
- English,
- Kazakh and Uzbek, including mixed speech
Voice identification
Text-independent technology that works in any language
Creating voiceprints (voice "casts") from 20 seconds of natural speech
Voice identification with up to 98% accuracy
Voice identification and verification from 5 seconds of speech

SOTA VOX Kit

Voice biometrics
Text-independent voice biometrics module for identifying and searching for target voices in audio recordings
Speech recognition
Intelligent speech recognition (ASR) engine with learning capability for improved accuracy
Knowledge extraction
Text analytics engine (NLP/NLU) for understanding meaning and extracting relevant data in context
SOTA VOX API
Flexible, secure and fast API

Technical features

SOTA VOX Kit automatically inserts punctuation into transcripts and capitalizes the first word of each sentence and proper names. The resulting text is easy to work with, and its quality is comparable to that of a manually formatted transcript.

Every word in a transcript is automatically time-stamped, so you can quickly find specific fragments in the original audio or align subtitles by timestamp.
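
Word-level timestamps are enough to build subtitles. The following is a minimal sketch, not the SOTA VOX API: it assumes each recognized word arrives as a dictionary with hypothetical keys "word", "start", and "end" (times in seconds) and renders the words as SRT cues. The function names and the grouping thresholds are illustrative assumptions.

```python
from typing import Dict, List


def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: List[Dict], max_words: int = 7, max_gap: float = 0.8) -> str:
    """Group time-stamped words into subtitle cues and render them as SRT.

    A new cue starts when the current one already holds max_words words
    or when the pause before the next word exceeds max_gap seconds.
    The input layout (keys "word", "start", "end") is an assumption.
    """
    cues: List[List[Dict]] = []
    for word in words:
        if (
            cues
            and len(cues[-1]) < max_words
            and word["start"] - cues[-1][-1]["end"] <= max_gap
        ):
            cues[-1].append(word)
        else:
            cues.append([word])

    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_srt_time(cue[0]["start"])
        end = format_srt_time(cue[-1]["end"])
        text = " ".join(w["word"] for w in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Splitting on pauses as well as on word count keeps a cue from spanning a long silence, which would otherwise leave stale text on screen.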

You can add new words to the core dictionary to improve recognition accuracy for words and phrases from a specific subject area, such as product names, technical terminology, or personal names.

Streaming mode processes audio in near real time. The MRCPv2 protocol is supported.
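
On the client side, near-real-time processing usually means sending audio in small fixed-duration chunks rather than as one file. The sketch below only shows the chunking step and assumes raw 16-bit mono PCM at 16 kHz. The function name, the audio format, and the 100 ms chunk length are illustrative assumptions, and how the chunks are transported (over MRCPv2 or otherwise) is outside its scope.

```python
from typing import Iterator


def pcm_chunks(
    audio: bytes,
    sample_rate: int = 16000,  # assumed sampling rate, Hz
    sample_width: int = 2,     # assumed 16-bit samples, mono
    chunk_ms: int = 100,       # assumed chunk duration
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration chunks of raw PCM audio.

    The last chunk may be shorter than the others.
    """
    chunk_size = sample_rate * sample_width * chunk_ms // 1000
    if chunk_size <= 0:
        raise ValueError("chunk parameters must give a positive chunk size")
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]
```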

You can configure a list of words or phrases to be removed from the transcript, such as obscene language, commercial information, or personal data.
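
The effect of such a list can be illustrated with a simple post-processing filter. This is a sketch of the general idea, not how SOTA VOX Kit implements it: build_redactor and its mask parameter are hypothetical names, and matching here is plain case-insensitive whole-word matching.

```python
import re
from typing import Callable, Iterable


def build_redactor(phrases: Iterable[str], mask: str = "[removed]") -> Callable[[str], str]:
    """Return a function that replaces every listed word or phrase with mask.

    Matching is case-insensitive and limited to whole words, so "card"
    does not match inside "cardboard".
    """
    # Longest phrases first, so "card number" wins over "card".
    ordered = sorted((p for p in phrases if p.strip()), key=len, reverse=True)
    if not ordered:
        return lambda text: text
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(p) for p in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    # A function replacement keeps the mask literal, even if it contains backslashes.
    return lambda text: pattern.sub(lambda _match: mask, text)
```

For example, build_redactor(["card number", "damn"]) returns a function that can be applied to each transcript line. Passing mask="" drops the matched words instead of marking them.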

Speakers are separated automatically, for example in mono recordings where the operator and the client are recorded in one channel. Diarization significantly improves recognition quality and makes the transcript easier to work with.
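
Once each word carries a speaker label, the transcript can be folded into a readable dialogue. The sketch below assumes a hypothetical output layout in which every word is a dictionary with "speaker", "word", "start", and "end" keys. The real response format may differ.

```python
from typing import Dict, List


def words_to_turns(words: List[Dict]) -> List[Dict]:
    """Merge consecutive words from the same speaker into one turn.

    Each turn keeps the speaker label, the start time of its first word,
    the end time of its last word, and the joined text.
    """
    turns: List[Dict] = []
    for word in words:
        if turns and turns[-1]["speaker"] == word["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + word["word"]
            turns[-1]["end"] = word["end"]
        else:
            # Speaker changed (or first word): open a new turn.
            turns.append({
                "speaker": word["speaker"],
                "start": word["start"],
                "end": word["end"],
                "text": word["word"],
            })
    return turns
```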