Performance-driven control for sample-based singing voice synthesis


We address the expressive control of singing voice synthesis. Singing voice synthesizers (SVS) traditionally require two types of input: a musical score and lyrics. Musical expression is then typically either generated automatically, by applying a model of a certain type of expression to a high-level musical score, or achieved by manually editing low-level synthesizer parameters. We propose an alternative method in which expression control is derived from a singing performance. First, an analysis module extracts expressive information from the input voice signal; this information is then mapped to the internal synthesizer controls. The presented implementation works off-line: it processes the user's input voice signal and the lyrics by means of a phonetic segmentation module. Our approach offers a direct way of controlling the expression of SVS. The last section of this paper addresses a possible strategy for real-time operation.
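
As a rough illustration of the analysis-then-mapping idea described above, the following minimal Python sketch extracts two simple features from one frame of a voice signal (energy and a crude pitch estimate) and maps them to control values. It is not the paper's implementation: the feature set, the autocorrelation pitch estimator, the control names `pitch_midi` and `dynamics`, and the 0.5 RMS reference level are all assumptions made for the example, and the phonetic segmentation step is omitted.

```python
import math

def frame_rms(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def frame_f0(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Crude autocorrelation pitch estimate in Hz (0.0 for a silent frame)."""
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def map_to_controls(f0_hz, rms):
    """Map analysis features to hypothetical synthesizer controls."""
    # Pitch as a fractional MIDI note number; None when no pitch was found.
    pitch = 69.0 + 12.0 * math.log2(f0_hz / 440.0) if f0_hz > 0 else None
    # Dynamics in [0, 1], relative to an assumed reference level of 0.5 RMS.
    dynamics = min(1.0, rms / 0.5)
    return {"pitch_midi": pitch, "dynamics": dynamics}
```

In use, each frame of the recorded performance would be passed through `frame_f0` and `frame_rms`, and the resulting control values would be aligned with the phonemes of the lyrics before driving the synthesizer.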


Sound examples:

performance 1 synthesis 1
performance 2 synthesis 2
performance 3 synthesis 3
performance 4 synthesis 4

All sound examples use the same singer database.


Contact: Jordi Janer, jjaner at