Praat scripts (for acoustic analysis)
Praat scripting is a way to automate measurements in Praat that are otherwise done manually. Everything that a Praat script can do can also be done manually. Some of the benefits of learning and using Praat scripts include: automating acoustic measurements; saving a lot of time, depending on the size of the task; and consistency, accuracy, and control over measurements. There are also a few things to be aware of when using Praat scripts or any other automation technique. For example, the same formant settings may not be ideal for every speaker, or even for every utterance from a single speaker, because of differences in physiology and in the quality of the recordings. You must also check the boundaries in a TextGrid carefully, especially when relying on forced alignment. After taking automated measurements, it is a good idea to check your output data and make hand corrections as necessary.
Our basic Praat script is for making measurements of a sound file, at time points indicated in a TextGrid. As a model, we will use get_formants.praat, and point out all of the Praat scripting concepts involved, and talk about how the script might be modified. Line numbers refer to lines shown in some text editors (such as vi, gedit, and Kate), or in this numbered version and this color-coded version of the file (which won’t run as a Praat script).
get_formants.praat follows a typical structure for a Praat script:
- set parameters (such as file names and settings for tracking formants)
- open files (or just select objects that are already open) and remove them when finished
- read a TextGrid (to choose where to make measurements)
- generate an object to measure (in this case, a Formant object)
- use a for loop to go through the potential measurement points and an if statement to screen them
- make the measurements
- write the results to a text file
- COMMENTS (ignored by Praat)
Concepts for understanding get_formants.praat
Two important types of variables in Praat scripts are string variables (for text) and numeric variables (for numbers). Line 30 creates a string variable named phone$ and sets its value to the label of a TextGrid interval (which is a text string). Line 25 creates a numeric variable named intervals and sets its value to the number of intervals in tier 1 of the TextGrid (which is a number). In the Praat scripting language, variable names always start with lowercase letters, and string variable names end with the dollar sign ($). Lines 51 and 53 both give a value to a string variable, one by directly giving a value (in double quotes), and the other by using a function that reads a TextGrid label:
left$ = "#"
left$ = Get label of interval... 1 i-1
After a string variable has been defined, it can be used interchangeably with its value, i.e., these two lines are equivalent, if left$ has already been defined as #:
newstring$ = left$
newstring$ = "#"
Some functions expect strings to be entered directly. For example, if you have a Sound object called myrecording in your object list, you can select it like this:
select Sound myrecording
If you have already set the value of the string variable sound_file$ to “myrecording”, then this (line 20) is equivalent:
select Sound 'sound_file$'
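The single quotes tell Praat to replace sound_file$ with the value of that variable before running the command, so Praat looks for a Sound object named myrecording rather than one literally named sound_file$. In this way, the function of single quotes is the opposite of the function of double quotes: double quotes mark text to be taken literally, and single quotes mark a variable name to be replaced by its value.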
The Praat manual (Help… Praat Intro) contains a lot of information about things you can do with variables. To see some of them, open the manual and search for “numeric” or “string”. Lines 36-40 show some simple arithmetic with numeric variables.
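As a minimal sketch of this kind of arithmetic (not the script's actual lines 36-40), suppose we want three measurement times at 25%, 50%, and 75% of an interval. The variable names and the start and end times below are made up for illustration:

```praat
# Hypothetical start and end times of an interval, in seconds
phone_start = 0.50
phone_end = 0.80

# Simple arithmetic with numeric variables
duration = phone_end - phone_start
t1 = phone_start + duration * 0.25
t2 = phone_start + duration * 0.50
t3 = phone_start + duration * 0.75

# Show the results in the Info window, rounded to 3 decimal places
echo Duration: 'duration:3' s
print Measurement times: 't1:3', 't2:3', 't3:3''newline$'
```

The :3 after a variable name inside single quotes only affects how the number is displayed; the value stored in the variable is unchanged.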
Forms are a convenient way of giving information to the script. When the script is run, the commands on lines 10-14 will cause a window to open and prompt the user for the name of the sound file and the two formant tracking parameters:
form Measure formant values for segments in a textgrid
    sentence sound_file myname
    positive maximum_formant 5500
    positive number_of_formants 5
endform
The first line indicates that this is a form, and specifies what text is to appear in the window. The next three lines indicate three variables that we will define and give values to, specifying the type of variable (“sentence” for strings, “positive” for positive numbers), the names of the variables, and their default values. Note that the string variable name doesn’t have $ at the end, but when we refer to it later (e.g., in line 20) we will use the $. Also, we don’t use quotes for text strings, because Praat is expecting text. An alternative to a form would be to specify the values directly in the script, and change them manually when we run it, like this (note the differences involving $ and quotes):
sound_file$ = "myname"
maximum_formant = 5500
number_of_formants = 5
One of the valuable things about scripts is that we can repeat an action an arbitrarily large number of times. The device that allows us to repeat something as many times as necessary is the for loop. In the script, everything from line 28 to line 75 is inside a for loop.
for i from 2 to intervals-1
    ...
endfor
To make a for loop, we need to define a starting point and an end point, and a numeric variable that will take each value in between. In this case, we are calling the numeric variable i, and it will count up from 2 until it gets to the number of the second-to-last interval in the first tier of the TextGrid (intervals-1, where the variable intervals was defined on line 25). The lines of code inside the loop will run once for every value that i takes. Since these lines of code are for measuring formants, and since tier 1 is the “phones” tier, that means we will measure formants for every segment in the TextGrid from the second interval through the second-to-last one.
An if statement is similar to a for loop. A for loop determines how many times, and for what values of a variable, the enclosed lines of code will execute. An if statement determines whether the enclosed lines of code will run at all. In the script, everything from line 33 to line 74 is inside an if statement:
if phone$ != "" and phone$ != "sp"
    ...
endif
phone$ has been defined on line 30 as the label of the ith phone interval. The if statement on line 33 causes the lines of code that measure formants to be executed only if the label of the phone is not empty and it’s not “sp” (the aligner’s label for short pause). != means “is not equal to”, just as = means “is equal to”. Changing line 33 to the following line would have the effect of only measuring phones labeled “AA0”:
if phone$ = "AA0"
It is conventional to indent all the lines within a loop or if statement, to make the code more readable, but Praat ignores this for the purposes of loops and if statements, because it relies on commands like endfor and endif to know when they end.
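Putting the for loop and the if statement together, here is a minimal sketch that only lists the labels that would be measured. It assumes that a TextGrid named after sound_file$ is already in the object list and that its tier 1 holds the phone labels; the name myname is just the form's default:

```praat
# Assumed name of the TextGrid object; change it to match your object list
sound_file$ = "myname"

select TextGrid 'sound_file$'
intervals = Get number of intervals... 1

clearinfo
for i from 2 to intervals-1
    phone$ = Get label of interval... 1 i
    # Skip empty intervals and short pauses
    if phone$ != "" and phone$ != "sp"
        print 'i''tab$''phone$''newline$'
    endif
endfor
```

Running something like this before the real measurements is a quick way to confirm that the if statement screens out the intervals you expect.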
OPENING FILES AND MANAGING OBJECTS
When objects such as sounds and textgrids are in the object list, they need to be selected in order to be used. This is the same as clicking on the object in the list. Line 20 selects the sound so that it can be used to make a Formant object:
select Sound 'sound_file$'
When we are done with an object, we can select it and then remove it. This is especially useful in scripts that create thousands of small objects. The last two lines of the script remove the Formant object that was created by the script:
select Formant 'sound_file$'
Remove
This script assumes that the sound file and textgrid are already open. If they weren’t already open, we could use commands in the script to open them:
Read from file... /home/jeff/'sound_file$'.wav
Read from file... /home/jeff/'sound_file$'.TextGrid
The reason we write Praat scripts instead of Python scripts or R scripts is that we want to use all the features of Praat that are useful for acoustic analysis. This script makes use of Praat’s ability to generate a Formant object from a Sound object and query that formant object to find formant values at points in time. Any command that can be executed by clicking a button in the object window or an item in a dropdown menu can be executed in a script. If you wanted to make a Formant object from a Sound object manually, you would select the Sound object, and click Formants & LPC – … To Formant (burg)… and enter five values (Time step, Max. number of formants, Maximum formant, Window length, and Pre-emphasis from) in the window that opens. Line 21 does the same thing:
To Formant (burg)... 0 'number_of_formants' 'maximum_formant' 0.025 50
The five values that would have been entered in the window appear as arguments to the function in the script. In this case, we are using the default values for three arguments, and using two numeric variables that were defined using the form at the beginning of the script. The names of Praat functions are the same as the names that appear in the menus, and arguments to the functions appear in the same order as they appear in the window when you run the command manually. This makes it easy to guess how to enter a function in a script. To make it even easier, Praat has the Paste history option that is available in the Script editor window. Pasting history will insert into your script every command you ran manually since you opened Praat (or last clicked Clear history).
The script uses several Praat functions that are meant for querying TextGrids and TextGrid tiers (available under Query – whenever a TextGrid is selected in the object list):
- Get number of intervals... (line 25, for setting up your for loop)
- Get label of interval... (line 30, for identifying a segment based on its number)
- Get starting point... and Get end point... (lines 35-36, for determining where an interval starts and ends so you know where to make measurements)
- Get interval at time... (line 43, for getting the number of an interval when you know the time, in this case for getting the word once we know the time of the phone)
Commands for querying TextGrid tiers typically take two arguments. The first is the tier number, and the second is either an interval number or a time.
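A minimal sketch of these queries working together is shown below. It assumes a TextGrid named myname with phones on tier 1 and words on tier 2; the tier numbers, the interval number, and the variable names phone_start, phone_end, and word_number are illustrative, not taken from the script:

```praat
# Assumed object name and an arbitrary interval number to inspect
sound_file$ = "myname"
i = 2

select TextGrid 'sound_file$'

# First argument: tier number; second argument: interval number
phone$ = Get label of interval... 1 i
phone_start = Get starting point... 1 i
phone_end = Get end point... 1 i
midpoint = phone_start + (phone_end - phone_start) / 2

# First argument: tier number; second argument: a time in seconds
word_number = Get interval at time... 2 midpoint
word$ = Get label of interval... 2 word_number

echo 'phone$' in 'word$': 'phone_start:3' to 'phone_end:3' s
```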
Other objects are queried in the same way. To get formant values in lines 64-70, we select the Formant object and then query it using the Get value at time… command, with four arguments (which formant to measure, when to measure it, and two other parameters specific to formant measurements, which here are set to the defaults).
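As a sketch of what such a query looks like (not a copy of lines 64-70), the following measures F1 and F2 at a single time. It assumes a Formant object named myname already exists, and the time 0.65 is made up; Hertz and Linear are the default unit and interpolation settings in the Get value at time... window:

```praat
# Assumed object name and a hypothetical measurement time in seconds
sound_file$ = "myname"
midpoint = 0.65

select Formant 'sound_file$'

# Arguments: formant number, time, unit, interpolation
f1_2 = Get value at time... 1 midpoint Hertz Linear
f2_2 = Get value at time... 2 midpoint Hertz Linear

echo F1 at midpoint: 'f1_2:0' Hz
print F2 at midpoint: 'f2_2:0' Hz'newline$'
```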
When we run a script to make a large number of measurements, it is useful to output the measurements to a file. Line 73 is the line that does this:
fileappend formants_'sound_file$'.txt 'word$','left$','phone$','right$','f1_1','f2_1','f1_2','f2_2','f1_3','f2_3''newline$'
The fileappend command takes two arguments: the name of a file, and a text string to add to the end of that file. In this case, the filename will be based on our sound file’s name. The first argument will be interpreted literally as the name of the file. Notice that sound_file$ has single quotes around it, to tell Praat to use the text that the variable contains, not the name of the variable. If the quotes were missing, Praat would make a file named formants_sound_file$.txt. The second argument is one long text string, made by combining many variables (the word, the phone and its immediate neighbors, and six formant measurements, all in single quotes). These are separated by commas, which are outside the single quotes, so that they will be interpreted directly as commas. ‘newline$’ is a special string variable that starts a new line at the end of the string. The result is that every time we append text to the file, it will start a new line. Another special string variable is ‘tab$’.
Often when you run a script, you want to start a new file. The way to achieve this is to delete the file before you write anything to it. Our script does this on line 17:
If the file doesn’t exist, nothing will happen. But if you accidentally give the wrong filename and that file does exist, it will be deleted.
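The command would look something like the following sketch, which assumes the same output file name as the fileappend line (line 73) and uses filedelete, the counterpart of fileappend in this older Praat syntax:

```praat
# Assumed sound name (the form's default value)
sound_file$ = "myname"

# Delete any existing output file so that new results start from an empty file
filedelete formants_'sound_file$'.txt
```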
COMMENTS AND COMMENTING OUT
It is good practice to insert comments in your script to document what it does, so that someone else (or you several months later) can more easily figure out what it’s doing. Any line of text that starts with # will be interpreted as a comment, and ignored by Praat. This is also a good way to temporarily remove a line of code from your script, which is useful when you are trying out more than one way to do something. Putting # at the beginning of a line of code (called “commenting out”) causes Praat to ignore it, but you can easily put it back in by deleting the # (“uncommenting it”).
RUNNING PART OF A SCRIPT
To run an entire script, you use Ctrl-R/Cmd-R. You can run only the selected part of a script using Ctrl-T/Cmd-T. This is useful for debugging parts of a script. Keep in mind that if you are running part of a script that depends on variables defined earlier in the script, it won’t work unless you define the variables in the part you are highlighting.
OUTPUTTING TO THE PRAAT INFO WINDOW
A useful debugging tool is to display the contents of variables in the Praat Info window using the echo command. This command will make “hello world” appear in the Praat Info window:
echo hello world
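print does the same thing as echo, but it doesn’t clear the window first. It also doesn’t start a new line afterwards, so the two commands below write their text on the same line (use printline, or end the text with 'newline$', to get separate lines):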
print hello world
print hello again
echo and print interpret text literally, so if you want to display the value of a variable, you need single quotes:
echo 'f1_2'
echo F1 at midpoint: 'f1_2'
Changes you could make to the script
- add a header row to the text file
- format the output in a different way
- make many small formant objects instead of one large one
- measure selectively (e.g., only measure the vowel /a/)
- measure different formants or measure formants at different times
- make different kinds of measurements (other than formant values)
- define procedures (for repeated operations)
- open an Editor window to approve measurements
- batch operation (to measure several files in a row)
- run the script from the command line
Other useful concepts
Here are some other things that are useful to know how to do.
OPENING ALL THE FILES IN A DIRECTORY
You can create a Strings object containing the names of files in a location you specify, and query that object to find out how many files there are (something you will need in order to use a for loop to open them):
path$ = "C:\where_my_files_are"
Create Strings as file list... list 'path$'/*.wav
numberOfFiles = Get number of strings
Now all you need to do is loop through the Strings object (which we have named list), read each filename, and open it:
for ifile to numberOfFiles
    select Strings list
    filename$ = Get string... ifile
    # filename$ already ends in .wav, so the extension is not added again
    Read from file... 'path$'/'filename$'
    sound$ = filename$ - ".wav"
endfor
Notice that the string variable sound$ stores the name of each Sound object, but it is not needed for any of the commands here. It would be useful, though, if you ever wanted to refer to the most recently opened sound or an object derived from it. See wav2mfcc.praat for an example of this.
EXTRACTING PARTS OF A SOUND OBJECT
If you have a Sound file, and a starting and ending time for an excerpt that you want, you can make a new wav file of just that part of the sound:
select Sound 'sound$'
Extract part... start end rectangular 1 yes
Write to WAV file... 'path$'/'sound$'_'start'_'end'.wav
Remove
Remove has the effect of removing any selected objects from the list. In this case, that will be the short Sound, because it was created most recently. Notice that you will need to define four variables for this to work: sound$ (what your Sound object is named), start and end (the times in seconds of the start and end of the interval you want), and path$ (where you want to save the file). Typically you would define sound$ and path$ at the beginning of the script, and define start and end in a for loop that is reading interval boundaries from a TextGrid object. See save_word_intervals.praat for an example of this.
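A minimal sketch of such a loop is shown below. It is not save_word_intervals.praat itself: it assumes a Sound and a TextGrid that are both named myname, takes the intervals from tier 1, writes to a hypothetical folder, and names each file by interval number rather than by start and end time:

```praat
# Assumed names: a Sound and a TextGrid called myname are in the object list
sound$ = "myname"
# Hypothetical output folder; it must already exist
path$ = "/home/jeff/excerpts"

select TextGrid 'sound$'
intervals = Get number of intervals... 1

for i to intervals
    select TextGrid 'sound$'
    label$ = Get label of interval... 1 i
    # Only save intervals that have a label
    if label$ != ""
        start = Get starting point... 1 i
        end = Get end point... 1 i
        select Sound 'sound$'
        Extract part... start end rectangular 1 yes
        # The new short Sound is now selected; save it, then remove it
        Write to WAV file... 'path$'/'sound$'_'i'.wav
        Remove
    endif
endfor
```

The TextGrid is selected again at the top of each pass because the previous pass ends with a Sound selected.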
DOING SOMETHING WITH SELECTED OBJECTS
Another way to make a script more flexible is to allow the user to select the objects to be processed, instead of listing them all by name. These commands store the ID number of each currently selected Sound object in its own numeric variable (sound1, sound2, etc.):
numberOfSelectedSounds = numberOfSelected ("Sound")
for i to numberOfSelectedSounds
    sound'i' = selected ("Sound", i)
endfor
The individual sounds can then be processed using another for loop from 1 to numberOfSelectedSounds, or individually by referring to them as sound1, sound2, etc. See save_selected.praat for an example of a script for processing selected sounds and textgrids, and reading their names.
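As a sketch of the second loop (this is not save_selected.praat), the following stores the IDs and then reports the name and duration of each selected Sound. It should be run with one or more Sound objects selected; name$ and duration are illustrative variable names:

```praat
# Store the ID number of each selected Sound
numberOfSelectedSounds = numberOfSelected ("Sound")
for i to numberOfSelectedSounds
    sound'i' = selected ("Sound", i)
endfor

clearinfo
# Select each Sound in turn by its ID number and query it
for i to numberOfSelectedSounds
    select sound'i'
    name$ = selected$ ("Sound")
    duration = Get total duration
    print 'name$''tab$''duration:3''newline$'
endfor
```

The IDs are collected first because selecting a single Sound inside the second loop changes the selection that selected ("Sound", i) would otherwise read.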
ADDING A HEADER ROW TO THE OUTPUT FILE
Adding a header row to the output file gives each column in a csv or tab-delimited file a heading. The same fileappend command can be used for this: it writes a single row containing the column names (separated by commas or tabs), in the same order as the measurements that will be written below them. To avoid repeating the header row throughout the output file, this fileappend command should come before the for loop starts.
Adding a header row amounts to adding a copy of the fileappend line that writes the data, with one difference: the column names are typed literally, without single quotes, because they are plain text rather than variables to be replaced by their values. Compare the data line, which is the last line inside the for loop:
    fileappend formants_'sound_file$'.txt 'word$','left$','phone$','right$','f1_1','f2_1','f1_2','f2_2','f1_3','f2_3''newline$'
endfor
with the header line, which goes before the for loop:
fileappend formants_'sound_file$'.txt word,left,phone,right,f1_1,f2_1,f1_2,f2_2,f1_3,f2_3'newline$'
PAUSING A SCRIPT
You can pause a script at any time in order to give an instruction to the user or prompt the user to check or adjust something. For example, this command will pause the script and ask the user to fix TextGrid boundaries (possibly because the script is about to make a measurement based on a TextGrid):
pause Please fix the TextGrid boundaries
See merge_textgrids.praat for an example of a script with a single pause. It can also be very useful to put a pause inside a for loop.
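A minimal sketch of a pause inside a for loop follows. It assumes a Strings object named list, like the one made with Create Strings as file list... in the section on opening all the files in a directory; the wording of the message is made up:

```praat
# Assumes a Strings object named list is already in the object list
select Strings list
numberOfFiles = Get number of strings

for ifile to numberOfFiles
    select Strings list
    filename$ = Get string... ifile
    # The script stops here until the user clicks Continue
    pause Check the TextGrid for 'filename$' ('ifile' of 'numberOfFiles')
endfor
```

Each pause gives the user a chance to inspect or correct something before the script continues with the next file.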