Sample workflow


This page describes how to use diarization, transcription, forced alignment, and one_script to apply measurements to any speech recording.

Putting it all together

In all of these examples, change “myfile” to the actual name of the file you are working with and “yourunityid” to your actual unity id. First paste these commands into a text file, and then edit them. Then you will have a repeatable set of commands to use in the future.
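If you keep the unedited commands in a text file, the substitution can also be done mechanically. The following is only a sketch: the function name fill_template, the template filename, and the example values F18_like_excerpt and jsmith are hypothetical, and it assumes the recording name and unity id contain no characters that are special to sed (such as / or &).

```shell
# Hypothetical helper: print a copy of a command template with the
# placeholders "myfile" and "yourunityid" replaced by real values.
# Assumes neither value contains "/" or "&", which sed would misread.
fill_template() {
    template="$1"   # text file holding the unedited commands
    name="$2"       # recording name without the .wav extension
    unityid="$3"    # your unity id
    sed -e "s/myfile/$name/g" -e "s/yourunityid/$unityid/g" "$template"
}

# Example call (file names here are placeholders):
#   fill_template workflow_template.txt F18_like_excerpt jsmith > workflow_F18.txt
```

Directory names such as mycorpus are left alone because they do not contain the string “myfile”. Read through the result before running anything from it.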

Upload a sound file to your ENG 536 directory by running this command in a terminal on your local computer:

scp myfile.wav

Now open an ssh connection to a phonetics lab computer and use that connection for the next steps:


Run these commands on the remote lab computer to diarize and transcribe the speech in the recording:

conda activate pyannote
python /phon/vosk/ --input myfile.wav

Organize the files for forced alignment. Here we are creating a directory, copying the wav file into it, and copying the textgrid file into it with a new name that matches the wav file’s name except for the .TextGrid extension. The input for forced alignment is a directory containing one or more wav files and one or more textgrid files with matching filenames.

mkdir mycorpus
cp myfile.wav mycorpus
cp myfile_dt.TextGrid mycorpus/myfile.TextGrid
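Because the aligner needs matching filenames, it can be worth checking the directory before aligning. The sketch below is not part of the lab tools. The function name check_corpus is hypothetical, and it only compares base names, assuming the extensions are spelled exactly .wav and .TextGrid.

```shell
# Hypothetical helper: list wav files in a corpus directory that have no
# TextGrid with the same base name. Returns 0 only if every wav is paired.
check_corpus() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "$dir is not a directory" >&2
        return 1
    fi
    status=0
    for wav in "$dir"/*.wav; do
        # If no wav files exist, the unexpanded pattern comes back; skip it.
        [ -e "$wav" ] || continue
        base="${wav%.wav}"
        if [ ! -e "$base.TextGrid" ]; then
            echo "missing TextGrid for $(basename "$wav")"
            status=1
        fi
    done
    return "$status"
}

# Example call:
#   check_corpus mycorpus && echo "all wav files have a TextGrid"
```

The check says nothing about whether the transcript inside each textgrid is correct. It only catches naming mismatches.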

Align the transcript to the speech recording:

conda activate aligner
mfa align --clean mycorpus ral_mfa english_us_arpa mycorpus_output

Use Praat and one_script to measure the formants in the recording (note that the wav file is in the mycorpus directory and the textgrid is in the mycorpus_output directory):

praat /phon/scripts/one_script.praat '/phon/ENG536/yourunityid/mycorpus/myfile.wav;/phon/ENG536/yourunityid/mycorpus_output/myfile.TextGrid' 'VOWEL VL' 'formants()' 'l'

Finally, download the measurements to your computer by running this command in a terminal on your local computer (changing “date_and_time” to the actual date and time stamp in your measurement file’s name):

scp ./

These are all of the steps we can automate. When using these tools in a real project, you would make manual corrections. After diarization and transcription you would download the transcript textgrid, correct it, upload the corrected version, and use that as the input to forced alignment. You may or may not manually correct the forced aligner’s segmentation before making measurements. After measurement, you will want to plot the measurements and then you may or may not decide to modify the measurement settings, manually correct some measurements, or make corrections at earlier stages and rerun subsequent stages.