Singing-driven Interfaces for Sound Synthesizers
ABSTRACT
Together with the sound synthesis engine, the user interface, or controller, is a basic component of
any digital music synthesizer and the primary focus of this dissertation. Under the title of singing-driven
interfaces, we study the design of systems that take the singing voice as input and use it to
control the synthesis of musical sounds.
From a number of preliminary experiments and studies, we identify the principal issues involved
in voice-driven synthesis. We propose one approach for controlling a singing voice synthesizer
and another one for controlling the synthesis of other musical instruments. In the former, input and
output signals are of the same nature, and control-to-signal mappings can be direct. In the latter,
mappings become more complex, depending on the phonetics of the input voice and the characteristics
of the synthesized instrument sound. For this latter case, we present a study on vocal imitation
of instruments showing that these voice signals consist of syllables with musical meaning. Also, we
suggest linking the characteristics of voice signals to instrumental gestures, describing these signals
as vocal gestures.
Within the wide scope of the voice-driven synthesis topic, this dissertation studies the relationship
between the human voice and the sound of musical instruments by addressing the automatic
description of the voice and the mapping strategies for a meaningful control of the synthesized
sounds. The contributions of the thesis include several voice analysis methods for using the voice
as a control input: a) a phonetic alignment algorithm based on dynamic programming; b) a segmentation
algorithm to isolate vocal gestures; c) a formant tracking algorithm; and d) a breathiness
characterization algorithm. We also propose a general framework for defining the mappings from
vocal gestures to the synthesizer parameters, which are configured according to the instrumental
sound being synthesized.
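
The abstract names a phonetic alignment algorithm based on dynamic programming but gives no details here. As a rough illustration of the general technique only, and not the dissertation's actual algorithm, the sketch below aligns a sequence of analysis frames to a known phoneme sequence. The function name, the cost-matrix input, and the rule that each frame either stays on the current phoneme or advances to the next one are all assumptions made for this example.

```python
def align_frames_to_phonemes(cost):
    """Monotonic alignment by dynamic programming (illustrative sketch).

    cost[i][j] is a hypothetical cost of assigning frame i to phoneme j.
    Returns one phoneme index per frame. The path starts at phoneme 0,
    ends at the last phoneme, and never moves backwards or skips a phoneme.
    """
    n_frames, n_phones = len(cost), len(cost[0])
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phoneme")

    inf = float("inf")
    total = [[inf] * n_phones for _ in range(n_frames)]
    came_from = [[0] * n_phones for _ in range(n_frames)]
    total[0][0] = cost[0][0]

    for i in range(1, n_frames):
        for j in range(n_phones):
            stay = total[i - 1][j]                           # same phoneme
            advance = total[i - 1][j - 1] if j > 0 else inf  # next phoneme
            if advance < stay:
                total[i][j] = advance + cost[i][j]
                came_from[i][j] = j - 1
            else:
                total[i][j] = stay + cost[i][j]
                came_from[i][j] = j

    # Backtrack from the last frame, which must sit on the last phoneme.
    path = [n_phones - 1]
    for i in range(n_frames - 1, 0, -1):
        path.append(came_from[i][path[-1]])
    path.reverse()
    return path
```

For example, with four frames and two phonemes, the cost matrix [[0, 1], [0, 1], [1, 0], [1, 0]] yields the alignment [0, 0, 1, 1]. In practice the costs would come from some acoustic distance between each frame and a phoneme model, which this sketch leaves unspecified.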
To demonstrate the results, we implemented two real-time prototypes. The
first prototype controls the synthesis of a singing voice, and the second is a generic controller for
other instrumental sounds.
DISSERTATION
Thesis. PhD dissertation (3.4 MB)
Presentation. Slides used in the public defense on 14/3/2008.
AUDIO DEMOS
Voice-driven instrumental sound synthesis. Offline synthesis. Female voice controls the violin sound synthesis.
Voice-driven singing voice synthesis. Offline synthesis. A female input voice controls a synthesized male output voice.
More audio examples here
VIDEO DEMOS
Prototype of voice-driven instrumental synthesis (1). Real-time VST plugin in a performance situation.
Prototype of voice-driven instrumental synthesis (2). Real-time VST plugin as an interactive user installation.
Prototype of voice-driven singing voice synthesis. Real-time VST plugin for expressive synthesis control.
ADDITIONAL MATERIAL
Annotated database of syllabling recordings. Voice signal segmentation includes note onset and phonetics.
Results of a web survey on vocal imitation of musical instruments.
Contact: Jordi Janer, jjaner at iua.upf.edu
Music Technology Group, Universitat Pompeu Fabra, Barcelona (2008)