We address the expressive control of singing voice synthesis. Singing Voice Synthesizers (SVS) traditionally require two inputs: a musical score and lyrics. Musical expression is then typically either generated automatically, by applying a model of a certain type of expression to the high-level musical score, or achieved by manually editing low-level synthesizer parameters. We propose an alternative method in which the expression control is derived from a singing performance. First, an analysis module extracts expressive information from the input voice signal; this information is then mapped to the internal synthesizer controls. The presented implementation works off-line: it processes the user's input voice signal together with the lyrics, using a phonetic segmentation module. Our approach offers a direct way of controlling the expression of SVS. The last section of this paper addresses a possible strategy for real-time operation.
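The analysis-then-mapping idea can be illustrated with a minimal sketch. It is not the implementation presented in the paper: the autocorrelation pitch estimator, the frame-wise RMS energy, the MIDI-style pitch scale, and all names (`analyze_frames`, `map_to_controls`, `rms_ref`) are assumptions chosen for illustration, and phonetic segmentation is left out entirely.

```python
import numpy as np

def analyze_frames(signal, sr, frame_len=1024, hop=256, fmin=80.0, fmax=800.0):
    """Hypothetical analysis step: per-frame f0 (Hz) and RMS energy.

    f0 is taken from the autocorrelation peak within [fmin, fmax];
    silent frames are reported with f0 = 0.
    """
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    f0s, rms = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rms.append(float(np.sqrt(np.mean(frame ** 2))))
        centered = frame - np.mean(frame)
        # Keep non-negative lags only.
        ac = np.correlate(centered, centered, mode="full")[frame_len - 1:]
        if ac[0] <= 0:
            f0s.append(0.0)  # no energy in this frame
            continue
        lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
        f0s.append(sr / lag)
    return np.array(f0s), np.array(rms)

def map_to_controls(f0_hz, rms, rms_ref=1.0):
    """Hypothetical mapping step: f0 to a MIDI-style pitch, RMS to dynamics in [0, 1]."""
    voiced = f0_hz > 0
    pitch = np.zeros_like(f0_hz)  # 0 marks frames without a pitch estimate
    pitch[voiced] = 69.0 + 12.0 * np.log2(f0_hz[voiced] / 440.0)
    dynamics = np.clip(rms / rms_ref, 0.0, 1.0)
    return pitch, dynamics
```

In this sketch the two control curves are sampled once per analysis frame; a real system would also need to align them with the phonetic segmentation of the lyrics before driving the synthesizer.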
Sound examples:
| Performance | Synthesis |
|---|---|
| performance 1 | synthesis 1 |
| performance 2 | synthesis 2 |
| performance 3 | synthesis 3 |
| performance 4 | synthesis 4 |
All sound examples use the same singer database.
Contact: Jordi Janer, jjaner at iua.upf.edu