SpeechMark Acoustic Landmark Tool: Application to Voice Pathology (2013)

Suzanne Boyce, Marisha Speights, Keiko Ishikawa, Joel MacAuslan

One area of voice research that has historically been understudied is the interaction between voice pathology and acoustic aspects of the speech signal that affect intelligibility. Landmark-based software tools are particularly suited to fast, automatic analysis of small, non-lexical differences in the acoustic signal reflecting the production of speech. We are building a tool set that provides fast, automatic summary statistics for measures of speech acoustics based on Stevens’ paradigm of landmarks, points in an utterance around which information about articulatory events can be extracted. This paper explores the use of landmark analysis for evaluation of intelligibility-based measures of vocal pathology.

Index Terms: speech analysis, landmarks, voice pathology.

SpeechMark: Landmark Detection Tool for Speech Analysis (2012)

Suzanne Boyce, Harriet Fell, Joel MacAuslan

Landmark-based software tools are particularly suited to fast, automatic analysis of small, non-lexical differences in production of the same speech material by the same speaker. We are building a suite of independent applications and plug-ins, packaged as toolkits, that make our landmark-based software system, SpeechMark, available to the wider scientific community. This will be achieved by extending existing software platforms with “plug-ins” that perform specific measures and report results to the user and by developing a MATLAB toolkit. These tools provide automatic summary statistics for measures of speech acoustics based on Stevens’ paradigm of landmarks, points in an utterance around which information about articulatory events can be extracted.

Automated Tools for Identifying Syllabic Landmark Clusters that Reflect Changes in Articulation (2011)

Suzanne Boyce, Harriet Fell, Lorin Wilde, Joel MacAuslan

We have developed a set of software tools to detect articulatory changes in the production of syllabic units based on acoustic landmark detection and classification. Results from applying this automatic analysis system in studies of Parkinson’s disease and sleep deprivation show that it can detect subtle changes. We are making these tools available as add-ons to systems such as WaveSurfer and R.

A Platform for Automated Acoustic Analysis for Assistive Technology (2010)

Suzanne Boyce, Harriet Fell, Joel MacAuslan, Lorin Wilde

The use of speech production data has been limited by a steep learning curve and the need for laborious hand measurement. We are building a tool set that provides summary statistics for measures designed by clinicians to screen, diagnose, or provide training to assistive technology users. This will be achieved by extending an existing shareware software platform with “plug-ins” that perform specific measures and report results to the user. The common underlying basis for this tool set is Stevens’ paradigm of landmarks, points in an utterance around which information about articulatory events can be extracted.

WaveSurfer and SpeechMark Configuration

WaveSurfer uses text configuration files that allow the user to specify, automatically set up, and reuse particular configurations of panes for speech analysis, recording, labeling, and so forth. WaveSurfer comes with a handful of predefined configurations, and users can easily define and use their own custom configurations.

SpeechMark family of products

The SpeechMark family of products is designed to detect acoustic landmarks in speech recordings. Landmarks are acoustic events that correlate with changes in speech articulation. The SpeechMark family comprises plug-ins that augment the capabilities of existing third-party software, as well as stand-alone libraries and command line utilities.


What are Acoustic Landmarks?

In a significant body of work spanning several decades, Stevens and colleagues suggested that the speech signal can be usefully analyzed in terms of landmarks—that is, acoustic events that correlate with changes in speech articulation [1]. Most research using the landmark approach has focused on the lexical content of speech [2][3]. In our work [4][5], we have found that tools based on landmarks can be useful for investigating non-lexical attributes of speech, such as syllabic complexity or vowel space area over time. In particular, we have found that landmark-based software tools are well suited for analysis of subtle differences in production of the same speech material by the same speaker.

How are Acoustic Landmarks Detected?

The landmark detection process begins by analyzing the signal in several broad frequency bands. Because of the different vocal-tract dimensions, the appropriate frequencies for the bands are different for adults and infants; however, the procedure itself does not vary. First, an energy waveform is constructed in each of the bands. Then the rate of rise (or fall) of the energy is computed, and peaks in the rate are detected. These peaks therefore represent times of abrupt spectral change in the bands. Simultaneous peaks in several bands identify consonantal landmarks.
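
As a rough illustration of this kind of procedure (not the SpeechMark implementation itself), the sketch below band-filters a signal, tracks the energy in each band, flags abrupt rises and falls, and reports frames where several bands change at once. The band edges, frame length, and thresholds are illustrative assumptions.

    # Minimal sketch of band-energy landmark detection; NOT the SpeechMark algorithm.
    # Band edges, frame length, and thresholds are illustrative assumptions, and the
    # bands assume a sampling rate above 10 kHz so they all lie below Nyquist.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, find_peaks

    def band_energy(x, fs, lo, hi, frame=0.010):
        """Band-pass the signal and return one energy value per 10 ms frame."""
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        y = sosfiltfilt(sos, x)
        hop = int(frame * fs)
        n_frames = len(y) // hop
        return np.array([np.sum(y[i * hop:(i + 1) * hop] ** 2) for i in range(n_frames)])

    def abrupt_changes(energy, threshold_db=3.0):
        """Frames where band energy rises or falls abruptly (peaks in its rate of change)."""
        log_e = 10.0 * np.log10(energy + 1e-12)
        rate = np.abs(np.diff(log_e))                  # rate of rise/fall, dB per frame
        peaks, _ = find_peaks(rate, height=threshold_db)
        return set(peaks)

    def consonantal_landmarks(x, fs,
                              bands=((800, 1500), (1500, 2500), (2500, 3500), (3500, 5000)),
                              min_bands=3):
        """Frames at which at least min_bands bands change abruptly in the same frame."""
        change_sets = [abrupt_changes(band_energy(x, fs, lo, hi)) for lo, hi in bands]
        counts = {}
        for frames in change_sets:
            for t in frames:
                counts[t] = counts.get(t, 0) + 1
        # Here "simultaneous" simply means the same 10 ms analysis frame.
        return sorted(t for t, c in counts.items() if c >= min_bands)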

[1] Stevens, K.N., et al. “Implementation of a Model for Lexical Access based on Features”, in International Conference on Spoken Language Processing (ICSLP) Proc., 1992.
[2] Juneja, A. and C.Y. Espy-Wilson. “Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines”, in International Joint Conference on Neural Networks Proc., 2003.
[3] Slifka, J.S., et al. “A Landmark-Based Model of Speech Perception: History and Recent Developments”, in From Sound to Sense: Fifty Years of Speech Research, 2004.

Marking Adult Vowel-Space Formant Boundaries

The usual SpeechMark® vowel-space plot for adults includes a polygon that marks the boundaries of typical formant-frequency (F1, F2) pairs for normal adult speakers. The boundary drawn depends on whether the sex of the actual speaker has been specified as male, female, or unknown. The polygon is intended solely as a “fiducial” reference (an aid to the eye) much like grid lines. Like grid lines, it does not depend on the plotted data: part of its value is that it remains constant across all plots for adults of a given sex.
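
To make the role of the fiducial polygon concrete, here is a minimal plotting sketch. The vertex coordinates are hypothetical placeholders, not the sex-specific reference boundaries that SpeechMark actually draws; the point is only that the polygon stays constant while the measured (F1, F2) points change from plot to plot.

    # Illustrative only: the polygon vertices are hypothetical placeholders, not the
    # reference boundaries SpeechMark uses for male, female, or unknown speakers.
    import matplotlib.pyplot as plt

    # Hypothetical (F1, F2) corner values in Hz for an adult vowel space.
    FIDUCIAL_VERTICES = [(300, 2300), (300, 900), (750, 1100), (850, 1800)]

    def plot_vowel_space(f1_values, f2_values, vertices=FIDUCIAL_VERTICES):
        """Scatter measured (F1, F2) pairs over a constant reference polygon."""
        poly_f2 = [v[1] for v in vertices] + [vertices[0][1]]   # close the polygon
        poly_f1 = [v[0] for v in vertices] + [vertices[0][0]]
        plt.plot(poly_f2, poly_f1, "k--", linewidth=1, label="reference polygon")
        plt.scatter(f2_values, f1_values, s=12, label="measured vowels")
        plt.gca().invert_xaxis()   # conventional vowel-chart orientation:
        plt.gca().invert_yaxis()   # high front vowels appear at the upper left
        plt.xlabel("F2 (Hz)")
        plt.ylabel("F1 (Hz)")
        plt.legend()
        plt.show()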

Syllabic Clusters

This document describes the process by which the SpeechMark syllabic cluster analysis groups previously computed landmarks. The grouping algorithms were developed with a focus on English-language infant speech, including babble—that is, speech whose intended lexical content is unknown (if it exists).

Sequences that would be transcribed as an infant attempt at a syllabic cluster were identified, and empirical rules for separating these clusters from the rest of the speech stream and from each other were developed based on landmark sequences and timing.

It is important to remember that the syllabic cluster rules so developed are sensitive only to the speech AS UTTERED. They may or may not match syllabic clusters of speech as analyzed by transcription.
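
As an illustration of grouping by timing alone (a simplification: the actual rules are empirical and also depend on the sequence of landmark types), the sketch below splits a sorted list of landmark times into clusters wherever the gap between consecutive landmarks exceeds a threshold. The 350 ms gap is an arbitrary assumption.

    # Hypothetical grouping rule for illustration only; SpeechMark's empirical rules
    # use landmark sequences as well as timing.
    def group_into_clusters(landmark_times, max_gap=0.35):
        """Split landmark times (in seconds) into clusters wherever the gap
        between consecutive landmarks exceeds max_gap."""
        clusters, current = [], []
        for t in sorted(landmark_times):
            if current and t - current[-1] > max_gap:
                clusters.append(current)
                current = []
            current.append(t)
        if current:
            clusters.append(current)
        return clusters

    # Landmarks at 0.10-0.32 s form one cluster; those at 0.90-1.05 s form another.
    print(group_into_clusters([0.10, 0.18, 0.32, 0.90, 0.95, 1.05]))
    # [[0.1, 0.18, 0.32], [0.9, 0.95, 1.05]]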
