Quantitative and graphic acoustic analysis of phonatory modulations: The modulogram


A method is presented for analyzing phonatory instabilities that occur as modulations of fundamental frequency (f0) and sound pressure level (SPL) on the order of 0.2 to 20 cycles per second. Such long-term phonatory instabilities, including but not limited to traditional notions of tremor, are distinct from cycle-to-cycle perturbation such as jitter or shimmer. For each of the 2 parameters (f0, in Hz, and SPL, in dB), 3 frequency domains are proposed: (a) flutter (10-20 Hz), (b) tremor (2-10 Hz), and (c) wow (0.2-2.0 Hz), yielding 6 types of instability. Analyses were implemented using fast Fourier transforms (FFTs) with domain-specific analysis parameters. Outputs include a graphic display in the form of a set of low-frequency spectrograms (the "modulogram") and quantitative measures of the frequencies, magnitudes, durations, and sinusoidal form of the instabilities. An index of a given instability is developed by combining its duration and average modulation magnitude into a single quantity. Performance of the algorithms was assessed by analyzing test signals with known degrees of modulation, and a range of applications was reviewed to provide a rationale for use of modulograms in phonatory assessment.
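As a rough illustration of the idea described above, the following Python sketch builds a test contour with a known modulation, finds the dominant modulation frequency with an FFT, and assigns it to one of the three proposed domains. It is not the authors' implementation. The function names, the Hann window, the 100 Hz contour frame rate, and the choice to assign the shared band edges (2 and 10 Hz) to the higher band are all assumptions made for the example. The abstract does not specify the domain-specific analysis parameters or the formula for the instability index, so neither is reproduced here.

```python
import numpy as np

# Band edges in Hz, taken from the abstract. Assigning the shared edges
# (2 Hz and 10 Hz) to the higher band is an assumption of this sketch.
BANDS = (("wow", 0.2, 2.0), ("tremor", 2.0, 10.0), ("flutter", 10.0, 20.0))


def classify_modulation(freq_hz):
    """Return the domain name for a modulation frequency, or None if outside 0.2-20 Hz."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    if freq_hz == 20.0:  # include the upper limit of the flutter band
        return "flutter"
    return None


def dominant_modulation(contour, frame_rate_hz):
    """Estimate the strongest modulation frequency (Hz) of an f0 or SPL contour.

    `contour` holds one value per analysis frame (Hz for f0, dB for SPL);
    `frame_rate_hz` is the number of contour frames per second.
    """
    x = np.asarray(contour, dtype=float)
    x = x - x.mean()                       # remove the mean level, keep the modulation
    window = np.hanning(len(x))            # Hann window (an assumption) to limit leakage
    spectrum = np.abs(np.fft.rfft(x * window))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / frame_rate_hz)
    in_range = (freqs >= 0.2) & (freqs <= 20.0)
    peak = np.argmax(np.where(in_range, spectrum, 0.0))
    return freqs[peak]


# Test signal with a known modulation: 200 Hz mean f0, +/-4 Hz excursion at 5 Hz,
# sampled as a contour at 100 frames per second for 10 seconds.
frame_rate = 100.0
t = np.arange(0, 10.0, 1.0 / frame_rate)
f0_contour = 200.0 + 4.0 * np.sin(2 * np.pi * 5.0 * t)

peak_hz = dominant_modulation(f0_contour, frame_rate)
domain = classify_modulation(peak_hz)
print(f"dominant modulation: {peak_hz:.2f} Hz ({domain})")
```

A modulogram as described in the abstract would repeat this kind of analysis over successive time frames to form a low-frequency spectrogram for each parameter, whereas the sketch analyzes a single frame spanning the whole contour.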

Publication Title: Journal of Speech, Language, and Hearing Research