PhD website

singing-driven interfaces for sound synthesizers


Together with the sound synthesis engine, the user interface, or controller, is a basic component of any digital music synthesizer and the primary focus of this dissertation. Under the title of singing-driven interfaces, we study the design of systems that, taking the singing voice as input, can control the synthesis of musical sounds.
From a number of preliminary experiments and studies, we identify the principal issues involved in voice-driven synthesis. We propose one approach for controlling a singing voice synthesizer and another for controlling the synthesis of other musical instruments. In the former, input and output signals are of the same nature, and control-to-signal mappings can be direct. In the latter, mappings become more complex, depending on the phonetics of the input voice and the characteristics of the synthesized instrument sound. For this latter case, we present a study on vocal imitation of instruments showing that such voice signals consist of syllables carrying musical meaning. We also suggest linking the characteristics of voice signals to instrumental gestures, describing these signals as vocal gestures.
Within the wide scope of the voice-driven synthesis topic, this dissertation studies the relationship between the human voice and the sound of musical instruments by addressing the automatic description of the voice and the mapping strategies for a meaningful control of the synthesized sounds. The contributions of the thesis include several voice analysis methods for using the voice as a control input: a) a phonetic alignment algorithm based on dynamic programming; b) a segmentation algorithm to isolate vocal gestures; c) a formant tracking algorithm; and d) a breathiness characterization algorithm. We also propose a general framework for defining the mappings from vocal gestures to the synthesizer parameters, which are configured according to the instrumental sound being synthesized.
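To illustrate the kind of dynamic-programming alignment mentioned in (a), the sketch below shows a generic dynamic time warping (DTW) routine that aligns an input feature sequence against a reference sequence. This is a minimal, illustrative example, not the thesis implementation: the feature values and the distance function are placeholders, and a real phonetic aligner would operate on spectral features and phoneme models.

```python
# Illustrative DTW sketch (not the thesis algorithm): dynamic-programming
# alignment between an input sequence and a reference sequence.

def dtw(seq_a, seq_b, dist):
    """Align seq_a to seq_b; return (total_cost, warping_path)."""
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # Accumulated-cost matrix.
    acc = [[INF] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(seq_a[i], seq_b[j])
            if i == 0 and j == 0:
                acc[i][j] = d
            else:
                acc[i][j] = d + min(
                    acc[i - 1][j] if i > 0 else INF,                 # step in a
                    acc[i][j - 1] if j > 0 else INF,                 # step in b
                    acc[i - 1][j - 1] if i > 0 and j > 0 else INF,   # diagonal
                )
    # Backtrack the optimal warping path from the end to the origin.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i > 0 and j > 0:
            steps.append((acc[i - 1][j - 1], (i - 1, j - 1)))
        if i > 0:
            steps.append((acc[i - 1][j], (i - 1, j)))
        if j > 0:
            steps.append((acc[i][j - 1], (i, j - 1)))
        _, (i, j) = min(steps)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path
```

For example, aligning the sequence [1, 2, 3] against [1, 2, 2, 3] with an absolute-difference distance yields a zero-cost path in which the repeated reference frame is absorbed by the warping.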
As a way to demonstrate the results obtained, two real-time prototypes are implemented. The first prototype controls the synthesis of a singing voice and the second one is a generic controller for other instrumental sounds.


Thesis. PhD dissertation (3.4 MB)
Presentation. Slides used in the public defense on 14/3/2008.


Voice-driven instrumental sound synthesis. Offline synthesis. A female voice controls the violin sound synthesis.
Voice-driven singing voice synthesis. Offline synthesis. A female input voice controls a synthesized male output.
More audio examples here


Prototype of voice-driven instrumental synthesis (1). Real-time VST plugin in a performance situation.
Prototype of voice-driven instrumental synthesis (2). Real-time VST plugin as an interactive user installation.
Prototype of voice-driven singing voice synthesis. Real-time VST plugin for expressive synthesis control.


Annotated database of syllabling recordings. Voice signal segmentation includes note onsets and phonetics.
Results of a web survey on vocal imitation of musical instruments.

Contact: Jordi Janer, jjaner at
Music Technology Group, Universitat Pompeu Fabra, Barcelona (2008)