WaveSurfer and SpeechMark Configuration

WaveSurfer makes use of text configuration files to allow the user to specify and automatically set up and reuse specific configurations of panes for purposes of speech analysis, recording, labeling, and so forth. WaveSurfer comes with a handful of predefined configurations, and users can easily define and use their own custom configurations.

SpeechMark family of products

The SpeechMark family of products is designed to detect acoustic landmarks in speech recordings. Landmarks are acoustic events that correlate with changes in speech articulation. The SpeechMark family comprises plug-ins that augment the capabilities of existing third-party software, as well as stand-alone libraries and command line utilities.

What are Acoustic Landmarks?

In a significant body of work spanning several decades, Stevens and colleagues suggested that the speech signal can be usefully analyzed in terms of landmarks: acoustic events that correlate with changes in speech articulation [1]. Most research using the landmark approach has focused on the lexical content of speech [2][3]. In our work [4][5], we have found that tools based on landmarks can be useful for investigating non-lexical attributes of speech, such as syllabic complexity or vowel space area over time. In particular, we have found that landmark-based software tools are well suited to analyzing subtle differences in the production of the same speech material by the same speaker.

How are Acoustic Landmarks Detected?

The landmark detection process begins by analyzing the signal in several broad frequency bands. Because of the different vocal-tract dimensions, the appropriate frequencies for the bands are different for adults and infants; however, the procedure itself does not vary. First, an energy waveform is constructed in each of the bands. Then the rate of rise (or fall) of the energy is computed, and peaks in the rate are detected. These peaks therefore represent times of abrupt spectral change in the bands. Simultaneous peaks in several bands identify consonantal landmarks.
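
The steps above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the SpeechMark implementation: it assumes each input list already holds one band-limited signal (the band-pass filtering step is omitted), and the frame length, the threshold in dB, the number of agreeing bands, and all function names are hypothetical choices made for this example.

```python
import math

def band_energy_db(samples, frame_len):
    """Energy in dB of consecutive non-overlapping frames of one band-limited signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # small offset avoids log(0)
    return energies

def rate_of_rise(energy_db):
    """Frame-to-frame change of the energy waveform (positive = rise, negative = fall)."""
    return [b - a for a, b in zip(energy_db, energy_db[1:])]

def abrupt_peaks(rate, threshold_db):
    """Indices where |rate| is a local maximum and at least threshold_db (assumed value)."""
    peaks = []
    for i, r in enumerate(rate):
        mag = abs(r)
        left = abs(rate[i - 1]) if i > 0 else 0.0
        right = abs(rate[i + 1]) if i + 1 < len(rate) else 0.0
        if mag >= threshold_db and mag >= left and mag > right:
            peaks.append(i)
    return peaks

def coincident_peaks(peaks_per_band, min_bands, tolerance=1):
    """Indices where at least min_bands bands have a peak within tolerance frames."""
    candidates = sorted({p for peaks in peaks_per_band for p in peaks})
    landmarks = []
    for c in candidates:
        votes = sum(any(abs(p - c) <= tolerance for p in peaks)
                    for peaks in peaks_per_band)
        # Skip candidates that merely repeat the previously accepted landmark.
        if votes >= min_bands and (not landmarks or c - landmarks[-1] > tolerance):
            landmarks.append(c)
    return landmarks
```

Calling band_energy_db, rate_of_rise, and abrupt_peaks once per band and passing the resulting peak lists to coincident_peaks yields candidate frame indices where several bands change abruptly at about the same time.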

[1] Stevens, K.N., et al. “Implementation of a Model for Lexical Access based on Features”, in International Conference on Spoken Language Processing (ICSLP) Proc., 1992.
[2] Juneja, A. and C.Y. Espy-Wilson. “Speech Segmentation Using Probabilistic Phonetic Feature Hierarchy and Support Vector Machines”, in International Joint Conference on Neural Networks Proc., 2003.
[3] Slifka, J.S., et al. “A Landmark-Based Model of Speech Perception: History and Recent Developments”, in From Sound to Sense: Fifty Years of Speech Research, 2004.

Marking Adult Vowel-Space Formant Boundaries

The usual SpeechMark® vowel-space plot for adults includes a polygon that marks the boundaries of typical formant-frequency (F1, F2) pairs for normal adult speakers. The boundary drawn depends on whether the sex of the actual speaker has been specified as male, female, or unknown. The polygon is intended solely as a “fiducial” reference (an aid to the eye) much like grid lines. Like grid lines, it does not depend on the plotted data: part of its value is that it remains constant across all plots for adults of a given sex.

Syllabic Clusters

This document describes the process by which the SpeechMark syllabic cluster analysis operates to group previously computed landmarks. The grouping algorithms were developed to deal with English-focused infant speech including babble—that is, speech whose intended lexical content is unknown (if it exists).

Sequences that would be transcribed as an infant attempt at a speech syllabic cluster were identified, and empirical rules for separating these from the speech stream and from each other were developed based on landmark sequences and timing.

It is important to remember that the syllabic cluster rules so developed are sensitive only to the speech AS UTTERED. They may or may not match syllabic clusters of speech as analyzed by transcription.
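
As a rough illustration of grouping by timing alone, the sketch below splits a list of landmark times wherever the gap between neighbours exceeds a limit. The actual SpeechMark rules also depend on landmark sequences and are not reproduced here; the function name and the max_gap parameter are assumptions made for this example.

```python
def group_by_gap(landmark_times, max_gap):
    """Split landmark times (in seconds) into groups wherever the gap exceeds max_gap."""
    groups = []
    for t in sorted(landmark_times):
        if groups and t - groups[-1][-1] <= max_gap:
            groups[-1].append(t)   # close enough to the previous landmark: same group
        else:
            groups.append([t])     # gap too long (or first landmark): start a new group
    return groups
```

For example, group_by_gap([0.10, 0.18, 0.25, 0.90, 0.97], 0.3) separates the first three landmarks from the last two.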

The SpeechMark Command Line Utility, Mac Edition

———————————————————————————
SpeechMark Product: The SpeechMark Command Line Utility, Mac Edition
Operating Systems Supported: Mac OS X (Lion or Mountain Lion)
Product Version: 0.1.4
Public Release Date : April 7, 2013
———————————————————————————

Installation Notes — PLEASE READ
===============================

    1. During operation, this SpeechMark product calls the MATLAB Compiler Runtime (MCR), a software module developed and released by The MathWorks. The SpeechMark installer searches the user’s file system for the MCR. If it does NOT find an invocable copy, it automatically downloads an MCR install image from the SpeechMark web site and attempts to install it. After the MCR installation, the installer guides the user to exit and then start the installation of this SpeechMark product a second time. Once the MCR has been successfully installed, the SpeechMark installation should run to completion.
    2. We are currently requiring that Mac end users install the MCR in the default location, i.e. /Applications/Matlab/Matlab_Compiler_Runtime/v717/. Since Macs don’t have an analog of the Windows Path that is both readily accessible and reliable for installed software, we’re using this workaround until we come up with a more sophisticated solution. A better solution might be as simple as prompting the user for the MCR folder as part of the SpeechMark installation process.
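
Before running the installer, a user can check whether the MCR is already in the default location. The snippet below is a minimal sketch, not part of the SpeechMark installer: the function name is hypothetical, and it only tests that the folder exists and is not empty, which does not prove that the MCR inside it is invocable.

```python
import os

# Default MCR location named in the installation notes above.
DEFAULT_MCR_DIR = "/Applications/Matlab/Matlab_Compiler_Runtime/v717/"

def mcr_present(mcr_dir=DEFAULT_MCR_DIR):
    """Return True if the given MCR folder exists and contains at least one entry."""
    return os.path.isdir(mcr_dir) and bool(os.listdir(mcr_dir))
```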

 

———————————————————————————
Bug Fixes:

    1. Help option “-?” changed to “-h”
    2. Examples directory, e.g. “/Users/[user]/Documents/SpeechMark_Example_Files” given write permission.
    3. MATLAB path bug fixed for BUILD script.

———————————————————————————
Enhancements:
(None for this release)

———————————————————————————
Known Bugs:

  1. The input waveform checksum that is supposed to be calculated and output in the landmark file header is not in fact calculated or output.

 

The SpeechMark Command Line Utility, Windows Edition

SpeechMark Product Release Notes

SpeechMark Product: The SpeechMark Command Line Utility, Windows Edition
Operating Systems Supported: Windows XP, Windows 7
Product Version: 1.1.2
Public Release Date : October 23, 2019

———————————————————————————
Installation Notes–PLEASE READ
===============================

1) During operation, this SpeechMark product calls the MATLAB Compiler Runtime (MCR), a software module developed and released by The MathWorks. The SpeechMark installer searches the user’s file system for the MCR. If it does NOT find an invocable copy, it automatically downloads an MCR install image from the SpeechMark web site and attempts to install it. After the MCR installation, the installer guides the user to exit and then start the installation of this SpeechMark product a second time. Once the MCR has been successfully installed, the SpeechMark installation should run to completion.

2) Both the SpeechMark and MCR installation processes MUST be run with administrative privileges. If the user’s account does not have administrative privileges active, invoke the installer as a user who does: hold down the SHIFT key and click on the installation icon or file name, select “Run as…” from the context menu that appears, and then enter the name and password of a user who has administrative privileges.

———————————————————————————
Bug Fixes:
1) Added a workaround for MCR installation.
2) Resolved administrator privilege issues.
3) Resolved an issue that occurred when voicing continued past the end of the data.
4) Resolved an issue where very faint voicing, or finding only one landmark, caused the results to be invalid.

———————————————————————————
Enhancements:

(None for this release)

———————————————————————————
Known Bugs:
1) The input waveform checksum that is supposed to be calculated and output in the landmark file header is not in fact calculated or output.

2) When the user includes the “/g” option on the command line, there MUST be a space between the “/g” and the speaker type character (e.g., “M”). If no space is included, the option is silently ignored.
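
Because the fused form is ignored silently, a wrapper script may want to check its arguments before invoking the utility. The helper below is a hypothetical sketch, not part of SpeechMark; it assumes that no other option of the utility begins with “/g”.

```python
def fused_g_options(args):
    """Return the arguments that fuse "/g" with the speaker type, such as "/gM"."""
    return [a for a in args if a.startswith("/g") and len(a) > 2]
```

With the separated form, fused_g_options(["/g", "M"]) returns an empty list; with the fused form, fused_g_options(["/gM"]) returns ["/gM"], which the caller can report before the option is silently dropped.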

The SpeechMark R Package, Windows Edition

———————————————————————————
SpeechMark Product: The SpeechMark R Package, Windows Edition
Operating Systems Supported: Windows XP, Windows 7
Product Version: 0.1.5
Public Release Date : February 7, 2014
———————————————————————————
Installation Notes–PLEASE READ
===============================

  1. Verify that the MATLAB Compiler Runtime (“MCR”) 32 bit version 2013b (v82) has been installed on your computer. If it hasn’t, install this software first! The MCR install image can be downloaded from: http://www.mathworks.com/products/compiler/mcr/ See IMPORTANT MCR INSTALLATION NOTES below.
  2. Verify that the 32 bit version of R has been installed on your computer (the latest version is R i386 3.0.2). The R install image can be downloaded from: http://www.r-project.org/
  3. Download the SpeechMark R Package install image from the SpeechMark web site. The install image is a zip file that contains installation instructions (basically, the text you are now reading) and another zip file named “SpeechMark.zip”. Extract these files from the download image.
  4. Install the SpeechMark R Package in your R environment by launching the R GUI, and clicking on the “Packages” menu on the menu bar. Select the “Install package from local zip file…” menu option. A “Select files” window will appear. Use it to find and select the SpeechMark.zip file that you extracted in the previous step.
  5. Load the SpeechMark R Package by clicking on the “Packages” menu again on the R Menu bar. This time select the “Load package …” menu option, and select the “SpeechMark” package from the list of installed packages that appears.
  6. You can view the SpeechMark Package documentation by typing the command “help(package='SpeechMark')” at the command line prompt in the R Console Window.

IMPORTANT MCR INSTALLATION NOTES
================================
Many of our users install the MATLAB Compiler Runtime without encountering any problems. Others are not so fortunate. These notes are intended to shed some light on potential MCR Installation issues and problems. Ideally you’ll read and comprehend these notes before attempting the MCR installation the first time. But at the very least, please read them over if you experience any problems.
For many users of the SpeechMark R Package who do run into trouble, their first hint that something has gone wrong is an error message when they attempt to LOAD the SpeechMark R Package. This is because a working MCR is not needed in order to install R or install the SpeechMark R Package, but it is needed in order to load that package. If Step 5 above fails, you need to read these instructions. Usually, when Step 5 fails, R displays an error box with a message like this:

    “This application has failed to start because mclmcrrt8_2.dll was not found. Re-installing the application may fix this problem.”

If you see this message, help is on the way. Keep reading!

  1. The SpeechMark R Package uses compiled MATLAB code. So may other applications you have installed. A number of different versions of the MCR are available from the MATLAB web site. In order to run this code, you need to install on your computer the *particular* version of the MCR required by the code. In order to run this package, you must download and install the MCR version that this package needs, even if you have installed other versions of the MCR required by other applications. You can have multiple versions of the MCR installed and in use on your computer simultaneously.
  2. The SpeechMark R Package requires the MATLAB Compiler Runtime (“MCR”) 32 bit version 2013b (v82). Other versions of the MCR will not work. Follow the link provided in Step 1 above, and be careful to choose, download, and install the correct version.
  3. It is not sufficient to install the correct version of the MCR–your PATH environment variable must also be updated to include the directory in which the MCR DLL is located. Some users find that the MCR installation process updates their PATH variable automatically. Other users find that they must update their PATH variable by hand.
  4. The directory (folder) that must appear in your PATH is the folder in which the MCR DLL file “mclmcrrt8_2.dll” resides. This should be a subfolder under the folder that you installed the MCR in, and by default ends with “…\v82\runtime\win32”. If necessary, you can find this folder by using the Windows Explorer Search function. Search for the file “mclmcrrt8_2.dll”.
  5. The filespec of this folder must be added to the PATH environment variable, either by the MCR install process or by you, manually. To check to see if this folder has been added to PATH by the MCR install process, open a Windows Command Window, and type “Path” and press Enter. The current PATH environment variable will be printed out.
  6. If the MCR DLL folder has not been added to the path, you must add it manually. Press the Start button, select the Control Panel, click on the System icon, select the Advanced tab, click on the Environment Variables button, and edit the System Variable named “path”.
  7. If you do not have the full MATLAB application installed (that is, if you are not a MATLAB user), append the MCR DLL directory to the end of your PATH string, separated by a “;” character from the previous folders. If you do have the MATLAB application installed, you must insert the MCR DLL before any of the MATLAB application directories that appear in the PATH string. Separate the MCR DLL from the other folders with “;” characters.
  8. For the revised PATH string to be visible to R, a new R GUI must be launched *after* the PATH variable has been updated and saved.
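
The manual check in items 4 to 7 can also be scripted. The following is a minimal sketch rather than an official tool: the function name is hypothetical, and it simply reports which PATH entries contain the MCR DLL and at what position they appear.

```python
import os

MCR_DLL = "mclmcrrt8_2.dll"  # DLL name given in the notes above

def mcr_path_entries(path_value, sep=";"):
    """Return (index, folder) for each PATH entry that contains the MCR DLL."""
    entries = [e.strip().strip('"') for e in path_value.split(sep) if e.strip()]
    return [(i, e) for i, e in enumerate(entries)
            if os.path.isfile(os.path.join(e, MCR_DLL))]
```

Calling mcr_path_entries(os.environ.get("PATH", "")) on the affected machine returns an empty list when the MCR DLL folder is missing from PATH. When the full MATLAB application is installed, the returned index can be compared with the positions of the MATLAB application directories to confirm that the MCR DLL folder comes first.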

Please read the MCR Installer Documentation, available from the MATLAB web site by going to the URL shown in Step 1 and looking for the “MCR Installer Documentation” link.

———————————————————————————
Bug Fixes:
None

———————————————————————————
Enhancements:
Renamed the package from “starlm” to “SpeechMark”

———————————————————————————
Known Bugs:
None

The SpeechMark WaveSurfer Plug-In, Mac Edition

———————————————————————————
SpeechMark Product: The SpeechMark WaveSurfer Plug-In, Mac Edition
Product Version: 0.1.23
Public Release Date : January 14, 2013
———————————————————————————

Installation Notes — PLEASE READ
===============================

    1. During operation, this SpeechMark product calls the MATLAB Compiler Runtime (MCR), a software module developed and released by The MathWorks. The SpeechMark installer searches the user’s file system for the MCR. If it does NOT find an invocable copy, it automatically downloads an MCR install image from the SpeechMark web site and attempts to install it. After the MCR installation, the installer guides the user to exit and then start the installation of this SpeechMark product a second time. Once the MCR has been successfully installed, the SpeechMark installation should run to completion.
    2. We are currently requiring that Mac end users install the MCR in the default location, i.e. /Applications/Matlab/Matlab_Compiler_Runtime/v84/. Since Macs don’t have an analog of the Windows Path that is both readily accessible and reliable for installed software, we’re using this workaround until we come up with a more sophisticated solution. A better solution might be as simple as prompting the user for the MCR folder as part of the SpeechMark installation process.

 

———————————————————————————
Bug Fixes:

    1. We eliminated the excessive plotting time previously required for “Band Energies” plot.

 

———————————————————————————
Enhancements:

    1. We now display MATLAB and MATLAB conversion layer errors in the command window.
    2. We now provide helpful feedback when an attempt is made to find acoustic landmarks in multi-channel signals. (The SpeechMark WaveSurfer Plugin only handles single channel data.)
    3. We now provide helpful feedback when an attempt is made to find acoustic landmarks in a signal larger than 10 MB (~5 minutes at 16 bit, 16 kHz sampling rate). (This is the upper limit on the size of signal files that The SpeechMark WaveSurfer Plugin can handle.)
    4. The User can customize the display of landmark “flags” via the “Landmark Configuration” popup window.
    5. Custom landmark “flag” display settings can be saved and applied via configuration files, e.g. “Save Configuration” and “Apply Configuration”.
    6. Frequency band information is now displayed as labels on the Band Energy plots.
    7. Landmark “flags” automatically resize when the landmark pane is resized.
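
The “~5 minutes” figure in item 3 follows from simple arithmetic on uncompressed single-channel audio. The sketch below works it through; it assumes plain PCM data, ignores any file header, and takes 10 MB to mean 10,000,000 bytes (the function name is hypothetical).

```python
def duration_seconds(size_bytes, sample_rate_hz=16000, bits_per_sample=16, channels=1):
    """Playing time of uncompressed PCM audio of the given size."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return size_bytes / bytes_per_second

print(duration_seconds(10_000_000) / 60)  # about 5.2 minutes
```

At 16 kHz and 2 bytes per sample, one second of audio takes 32,000 bytes, so 10,000,000 bytes last 312.5 seconds.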

 

———————————————————————————
Known Bugs:

  1. During installation, if the MCR has just been downloaded and installed, the system path to the MCR may not be updated, and the installer will abort (cleanly). WORKAROUND: Rerun the installer.
  2. Configuration files should not be saved while Band Energy panes are open. Configurations saved with Band Energy panes open will produce blank panes upon application.

———————————————————————————