Concatenative Synthesis

Also known as: Unit Selection Synthesis

A text-to-speech method that generates synthetic speech by concatenating (joining together) pre-recorded segments of human speech. These segments, called units, may be phonemes, diphones, syllables, or words. For each utterance, the system selects suitable units from a large recorded database and joins them in sequence. Concatenative synthesis typically sounds more natural than formant synthesis but requires substantial storage for the speech database. VoiceOver's Alex voice and many commercial TTS systems use this approach.
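The select-and-join step can be illustrated with a minimal sketch. It is a toy model, not how any particular product works: the `Unit` fields, the pitch-based `join_cost`, and the function names are all assumptions made for illustration. It picks, for each target label, the candidate recording whose edges best match its neighbors (a dynamic-programming search over candidates), then joins the chosen samples end to end. Real systems typically also score how well each unit matches the desired prosody and smooth the joins, both of which are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A hypothetical database entry: one recorded segment of speech."""
    label: str          # unit name, e.g. a diphone such as "h-e"
    start_pitch: float  # pitch in Hz at the left edge of the recording
    end_pitch: float    # pitch in Hz at the right edge of the recording
    samples: tuple      # toy stand-in for audio samples


def join_cost(left, right):
    """Assumed cost of joining two units: the pitch jump at the boundary."""
    return abs(left.end_pitch - right.start_pitch)


def select_units(targets, database):
    """Choose one unit per target label, minimizing the total join cost."""
    if not targets:
        return []
    # All recordings in the database that match each requested label.
    candidates = [[u for u in database if u.label == t] for t in targets]
    for label, found in zip(targets, candidates):
        if not found:
            raise KeyError(f"no unit recorded for {label!r}")

    # best[i][j] = (lowest cost of any path ending at candidates[i][j],
    #               index of the previous candidate on that path)
    best = [[(0.0, None) for _ in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for cur in candidates[i]:
            row.append(min(
                (best[i - 1][k][0] + join_cost(prev, cur), k)
                for k, prev in enumerate(candidates[i - 1])
            ))
        best.append(row)

    # Start from the cheapest final candidate and follow the backpointers.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path


def concatenate(units):
    """Join the selected units' samples end to end (no smoothing)."""
    out = []
    for unit in units:
        out.extend(unit.samples)
    return out


# Toy database: two recordings of each diphone, with different edge pitches.
database = [
    Unit("h-e", 120.0, 130.0, (1, 2)),
    Unit("h-e", 120.0, 180.0, (3, 4)),
    Unit("e-l", 128.0, 125.0, (5, 6)),
    Unit("e-l", 200.0, 190.0, (7, 8)),
]
chosen = select_units(["h-e", "e-l"], database)
audio = concatenate(chosen)
```

In this toy database the first "h-e" recording ends at 130 Hz and the first "e-l" recording starts at 128 Hz, so that pair gives the smoothest boundary and is the one selected. The need to store several recordings of every unit is also why the speech database grows large.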

Category: Speech Technology · Assistive Technology

Related: Speech Synthesis · Text-to-Speech · Formant Synthesis · Synthetic Speech

Sources

https://doi.org/10.1145/3461700
https://en.wikipedia.org/wiki/Speech_synthesis#Concatenative_synthesis