Speech-to-Speech

Also known as: S2S, Speech-to-Speech Conversion

A class of systems that transform one speech signal directly into another. A typical example is converting atypical input (whispered, dysarthric, accented, or cross-lingual speech) into clear, intelligible output in a target voice or language. Speech-to-speech systems differ from conventional ASR-then-TTS pipelines in that they avoid an intermediate text representation. Skipping the text step can preserve prosody and paralinguistic information, but it also complicates alignment when input and output timing differ. For accessibility, speech-to-speech restoration offers a path to inclusive voice interaction for people with speech disorders, without requiring them to switch to text or specialised hardware.

Category: Speech Technology · AI · Speech Accessibility

Related: Voice Conversion · Speech Synthesis · Automatic Speech Recognition · Text-to-Speech

Sources

https://doi.org/10.1145/3772318.3791734