The SpeechMark MATLAB Toolbox, Cross-Platform Edition

SpeechMark Product Release Notes

SpeechMark Product: The SpeechMark MATLAB Toolbox, Cross-Platform Edition
Operating Systems Supported: All that support MATLAB R2017b and later
Product Version: 1.4
Public Release Date: 2023-8-24

Installation Notes–PLEASE READ

1) This product is a standard MATLAB toolbox. To use it, you must have a valid installation of MATLAB (version R2017b or newer) and of the MATLAB Signal Processing Toolbox. The MATLAB Image Processing Toolbox, if present, is used for certain performance enhancements but is not required.
2) This is the first version that does not work with MATLAB versions earlier than 2017b.
3) This version includes all code and documentation for v1.4 of 2023-8-24.
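
The requirements above can be checked from the MATLAB prompt before installing. The following is a minimal sketch and is not part of the SpeechMark installer. It assumes the toolboxes appear in the output of ver under the product names 'Signal Processing Toolbox' and 'Image Processing Toolbox'; the messages are illustrative only.

```matlab
% Environment check sketch (not SpeechMark code).
% MATLAB version 9.3 corresponds to release R2017b.
if verLessThan('matlab', '9.3')
    error('SpeechMark 1.4 requires MATLAB R2017b or newer.');
end

% List the installed products; the names compared below are assumptions.
installed = ver;
names = {installed.Name};

hasSignal = any(strcmp(names, 'Signal Processing Toolbox'));
hasImage  = any(strcmp(names, 'Image Processing Toolbox'));

if ~hasSignal
    error('The Signal Processing Toolbox is required but was not found.');
end
if ~hasImage
    disp('Image Processing Toolbox not found; it is optional.');
end
```

If the script finishes without an error, the MATLAB version and the required toolbox are present. It does not check licensing.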

Resolved Problems:
1) Fixed excessive memory usage in ‘fricative_lms’.
2) Bug fixes in ‘mat_lminfo’ and ‘lm_vowelspace’.
New Features:
1) White backgrounds and reversed colors
2) Labels for point vowels in the vowel quadrilateral
3) Version number is written to .lmmat (MAT-format binary) files by the mat_* functions.
Other changes:
1) As of this version, SpeechMark no longer supports MATLAB versions 2017a and earlier.
2) Installer does not require the SpeechMark folder to be named ‘Contents’.
Known Bugs:
1) xl_* functions do not support tables.
2) MATLAB system function BUILDDOCSEARCHDB is not used for SpeechMark documentation.

Improving the Accuracy of Automatic Detection of Emotions From Speech

Reza Asadi, Harriet Fell
Computers that can recognize human emotions could react appropriately to a user’s needs and provide more human-like interactions.

Some of the applications of emotion recognition:

  • Diagnostic tool for medical purposes
  • Onboard car driving systems to keep the driver alert if stress is detected[1]
  • Similar system in aircraft cockpits
  • Online tutoring

Our contributions:

  • Use new combinations of acoustic feature sets to improve the performance of emotion recognition from speech
  • Provide a comparison of feature sets for detecting different emotions


Measurement of Child Speech Complexity Using Acoustic Landmark Detection

Keiko Ishikawa, Marepalli B. Rao, Suzanne Boyce
Dysphonia negatively affects speech intelligibility, especially in the presence of background noise; however, no clinical tool exists to measure this deficit. Landmark (LM) analysis may serve as the basis of such a tool.

The analysis identifies characteristic patterns of abrupt changes in the speech signal over time, and assigns them particular “landmarks.” Consequently, it describes speech as a sequence of LMs.


Measurement of Child Speech Complexity Using Acoustic Landmark Detection

An important measure of intelligibility in young children is the ability to articulate complex syllables [1-4]. The development of well-formed syllables in infancy has been shown to be a significant predictor of later communication skills [1-4]. Children with delayed speech acquisition do not show this same developmental trend, and deviations in syllable acquisition may serve as a diagnostic marker of future speech delay.


Deep Brain Stimulation May Contribute to Dysarthria in Patients with Parkinson’s Disease as Detected by Objective Measures

Craig Van Horne, M.D., Ph.D.; Joel MacAuslan, Ph.D.; Karen Chenausky, M.S., CF-SLP; Carla Massari
Dysarthria is found in approximately 80% of patients with Parkinson’s Disease (PD) and significantly limits communication as the severity worsens. Surgical implantation of deep brain stimulators (DBS) into the subthalamic nucleus (STN) has become more common and is an effective treatment for the motoric symptoms of PD. However, the effect of DBS on speech is equivocal.

We have developed computer algorithms that quickly and objectively analyze the speech of PD patients, allowing clinicians to assess the effect of DBS programming or other therapies on speech.


Spontaneous Vocalization Change in Infants with Severe Impairments using visiBabble

Harriet Fell, Joel MacAuslan, Cynthia J. Cress, Cara Stoll, Kara Medeiros, Jennifer Rosacker, Emily Kurz, Jenna Beckman
Children with difficulty producing speech sounds can practice sounds in play, even prelinguistically. visiBabble is a prototype computer-based program that responds with customized animations to targeted types of infant vocalizations. The program automatically recognizes acoustic-phonetic characteristics of the vocalizations and can selectively respond to utterances with varying levels of complexity (e.g., multisyllable utterances).

This poster reports syllable production changes of three children with physical and speech impairments, ages 1-4, in response to visiBabble reinforcement. Results include immediate effects of visiBabble reinforcement on infant vocalizations as well as longer-term effects of home visiBabble practice on spontaneous sound production.


Landmark-based Analysis of Sleep-Deprived Speech

Suzanne Boyce, Joel MacAuslan, Ann Bradlow, Rajka Smiljanic
There is a common perception that speech articulation becomes “slurred”, or less precisely articulated, under sleep deprivation conditions. There have been few studies of speech under sleep deprivation. Morris et al. (1960) and Harrison & Horne (1997) found that listeners heard a difference between speech recorded under rested and sleep-deprived conditions.


A Platform for Automated Acoustic Analysis for Assistive Technology

Harriet Fell, Lorin Wilde, Suzanne Boyce, Keshi Dai, Joel MacAuslan
While physical, neurological, oral/motor, and cognitive impairments can all significantly impact speech, people with disabilities may still be best able to communicate with computers through vocalization.

Aspects of vocal articulation are highly sensitive markers for many neurological conditions. As a source of data, recordings are

  • non-invasive,
  • inexpensive to collect, and
  • easily integrated into existing research and clinical protocols.


Objective Data on Clear Speech: Does it Help in Training Audiology Students?

Boyce, S. E., Balvalli, S. N., MacAuslan, J., Clark, J. C., Martin, D.
Typical speakers instinctively use a “CLEAR” speaking style when they are instructed to “speak as if your listener is hearing impaired” or “speak as if your listener is not a native speaker of your language.” CLEAR speech is more intelligible to hearing-impaired listeners by about 17% (1, 2, 3). The ability to automatically detect differences between a speaker’s ordinary speech patterns and their most intelligible speech would clearly be helpful in clinical training and telemedicine applications.

Here we describe a Landmark-based computer program (4, 5) to detect articulatory differences between “CLEAR” and “CONVERSATIONAL” styles of speech. Landmark-based speech analysis takes advantage of the fact that important articulatory events, such as voicing and frication, show characteristic patterns of abrupt change in the speech signal. These patterns are detected by an automated computer system and assigned to a particular type of Landmark.
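
To make the idea of “abrupt change in the speech signal” concrete, the following MATLAB sketch flags sudden rises and falls in short-time energy on a synthetic signal. It is an illustration only and is not the SpeechMark algorithm. The 16 kHz sampling rate, 10 ms frame length, 9 dB threshold, and the test signal itself are all assumptions chosen for the example.

```matlab
% Illustrative sketch only: NOT the SpeechMark landmark detector.
fs = 16000;                          % sampling rate in Hz (assumed)
t  = (0:fs-1)' / fs;                 % one second of sample times
x  = 1e-3 * sin(2*pi*150*t);         % quiet background tone
on = t >= 0.3 & t < 0.7;             % louder stretch from 0.3 s to 0.7 s
x(on) = 0.5 * sin(2*pi*150*t(on));

% Split the signal into non-overlapping 10 ms frames (assumed frame length).
frameLen = round(0.010 * fs);
nFrames  = floor(numel(x) / frameLen);
frames   = reshape(x(1:nFrames*frameLen), frameLen, nFrames);

% Per-frame energy in dB; eps avoids log10(0) on silent frames.
energyDb = 10 * log10(mean(frames.^2, 1) + eps);

% Flag frame boundaries where energy jumps by more than the threshold.
thresholdDb = 9;                     % abrupt-change threshold in dB (assumed)
delta   = diff(energyDb);            % frame-to-frame energy change
onsets  = find(delta >  thresholdDb) * frameLen / fs;   % abrupt rises, in s
offsets = find(delta < -thresholdDb) * frameLen / fs;   % abrupt falls, in s

fprintf('Abrupt rise(s) at %s s\n', mat2str(onsets, 3));
fprintf('Abrupt fall(s) at %s s\n', mat2str(offsets, 3));
```

On this synthetic signal the sketch should report one rise near 0.3 s and one fall near 0.7 s. A real detector would examine several frequency bands and time scales and would assign each event a Landmark type, which this example does not attempt.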


Using Landmark Detection to Measure Effective Clear Speech (2005)

Suzanne E. Boyce, Jean Krause, Sarah Hamilton, Rajka Smiljanic, Ann Bradlow, Ahmed Rivera Campos, Joel MacAuslan
A number of studies have established that normal native speakers of a language know how to improve their intelligibility to listeners under intelligibility-challenging conditions (Uchanski, 2005).

This “Clear Speech” speaking style is significantly more intelligible to listeners; the average Clear Speech benefit is 15-17% for normal-hearing listeners in noise and for hearing-impaired listeners in quiet (Uchanski, 2005).
