Lab 5

In lab 5, you will train a Gaussian model of each of the phonemes of Chinese and English, with a model shared between the two languages whenever they share an IPA symbol. You will also train a maximum-likelihood linear regression (MLLR) normalizer, to compensate for differences between the Chinese speaker and the English speaker.
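As a rough illustration of what "one Gaussian per phoneme, shared across languages" can mean, here is a minimal sketch. It is not the lab's code: the function name fit_phoneme_gaussians, the toy feature arrays, and the choice of diagonal covariance are all assumptions made for the example. Frames from both languages are pooled, so any IPA symbol that occurs in both gets a single model.

```python
import numpy as np

def fit_phoneme_gaussians(features, labels):
    """Fit one diagonal-covariance Gaussian per phoneme label (hypothetical helper).

    features: (nframes, ndims) array; labels: one IPA symbol per frame.
    Returns a dict mapping each symbol to (mean, variance).
    """
    labels = np.asarray(labels)
    models = {}
    for symbol in np.unique(labels):
        frames = features[labels == symbol]
        # Maximum-likelihood estimates: sample mean and (biased) sample variance.
        models[str(symbol)] = (frames.mean(axis=0), frames.var(axis=0))
    return models

# Toy data, invented for illustration only.
english_feats = np.array([[0.0, 1.0], [2.0, 3.0]])
english_labels = ["a", "s"]
chinese_feats = np.array([[4.0, 5.0], [6.0, 7.0]])
chinese_labels = ["a", "ʂ"]

# Pool both languages so that the shared symbol "a" gets one shared model.
features = np.vstack([english_feats, chinese_feats])
labels = english_labels + chinese_labels
models = fit_phoneme_gaussians(features, labels)
print(sorted(models))
```

In this toy example the symbol "a" is estimated from one English frame and one Chinese frame, while "s" and "ʂ" are each estimated from a single language.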

Recommended Debugging Procedure

This lab is more compute-intensive than some previous labs, so here are some steps that might help you to get it working in a reasonable amount of time.

submitted.py
as always, this is the only file you will submit. This time, most of the code of submitted.py is provided for you. In each section, there are some TODO comments, followed by lines that have INCORRECT in the comment next to them. Those marked lines are the ones you need to rewrite.
debug.py
is the recommended way to test the first draft of your submitted.py file. Open an interactive Python window (I do this by simply typing "python" in the terminal window). Then type "import debug", and then run the following line:
  • (dataset, step, solution, error) = debug.start_debugging(testcase=0, nsteps_to_run=1)
This does the following useful things:
  1. Uses importlib.reload to reload your submitted.py, in case you have made changes to your code since the last time it ran.
  2. Creates a new submitted.Dataset() object, in case you have made changes to your code since the last time it ran. This new object is returned to you as the object "dataset," so, for example, you can get information about it by typing help(dataset), and you can see the loaded IPA transcription by typing print(dataset.tokens).
  3. Runs nsteps_to_run of the steps. If you start with nsteps_to_run=1, then it will just try to run the first step (set_sat). After that runs, you can see the result by typing dataset.sat.
  4. If a reference solution is available for your specified testcase, then it will be loaded, and given to you as the object called "solution". So, for example, you can see the difference between your solution and the reference solution by typing "np.absolute(dataset.sat-solution['sat'])".
  5. Furthermore, for the very last step that was run (the one you specified using nsteps_to_run), the difference between the reference solution and your solution will be given to you as the ndarray "error." So you can see the same thing as the previous bullet point by typing "np.absolute(error)".
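Printing a whole array of absolute differences can be hard to read, so it may help to summarize it. The sketch below uses two small made-up arrays as stand-ins for dataset.sat and solution['sat'] (their real shapes and values are not assumed here), and reports the largest error, where it occurs, and whether the two arrays agree to within floating-point tolerance.

```python
import numpy as np

# Hypothetical stand-ins for dataset.sat and solution['sat'].
my_sat = np.array([[1.0, 0.0], [0.0, 1.0]])
reference_sat = np.array([[1.0, 0.0], [0.5, 1.0]])

# Elementwise absolute difference, as in step 4 above.
abs_error = np.absolute(my_sat - reference_sat)

# Largest error, and the index at which it occurs.
max_error = abs_error.max()
worst_index = np.unravel_index(np.argmax(abs_error), abs_error.shape)

# True only if every element agrees to within numpy's default tolerances.
matches = np.allclose(my_sat, reference_sat)

print(max_error, worst_index, matches)
```

The same three summaries can be computed from the "error" array directly, for example with np.absolute(error).max().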
The recommended procedure is to run debug.start_debugging until you get to the first step that needs to be written. Then copy lines from that step in submitted.py and paste them into your interactive window, one at a time, always replacing "self" with "dataset", until you get to the line that you need to write. Now you can experiment with different numpy functions to find the one that gives you the solution that you want.
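Here is a small sketch of that procedure. The class ToyDataset and its method set_mean are invented for illustration and are not part of submitted.py; they only mimic the pattern of a provided method whose last line is marked INCORRECT.

```python
import numpy as np

class ToyDataset:
    """Hypothetical stand-in for submitted.Dataset; not the lab's class."""
    def __init__(self):
        self.features = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

    def set_mean(self):
        nframes = self.features.shape[0]
        # TODO: compute the mean feature vector
        self.mean = np.zeros(2)  # INCORRECT

dataset = ToyDataset()

# Line pasted from set_mean, with "self" replaced by "dataset":
nframes = dataset.features.shape[0]

# Now experiment with candidate numpy expressions for the INCORRECT line.
candidate_a = np.sum(dataset.features, axis=0) / nframes
candidate_b = np.mean(dataset.features, axis=0)
print(candidate_a, candidate_b)
```

Once a candidate gives the result you want, paste it back into the method, changing "dataset" back to "self".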
visualize.py
this is the script that was called make_cool_plots.py in labs 2-4. After debug makes it all the way through your code, you can run this script to make the desired plots. It can be run from your interactive Python window, for example, by typing the lines
  • import visualize
  • visualize.Plotter('vis').make_plots(iteration=0)
...which will generate, in the 'vis' directory, all of the plots for testcase 0. You can compare those to the plots provided in solutions/testcase0_*.png.
run_tests.py
finally, before you upload your code to Gradescope, you can validate it using run_tests.py as usual.

How to Submit

When you're ready to submit, go to Gradescope.

  • Submit only the one file, submitted.py. Any other files you submit will be ignored.
  • You may submit as many times as you like, until the deadline. Only your last submission will count toward your course grade.