Lab 7

In this machine problem, you will design an LSTM by hand to perform a specified task. Then you will also train it, using gradient descent, to perform the same task.

Useful Files

Task Description

You have a dataset object with input observations, x[t] (loaded as self.observations[t]), and target outputs, y[t] (loaded as self.label[t]). Your LSTM should fill in an activation matrix, self.activation, with one row per time step; the last column (self.activation[t,4]) contains the LSTM output, h[t].
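As an illustration of this layout (the sequence length here is hypothetical; the real arrays come from the provided dataset object):

	    import numpy as np

	    T = 20                           # hypothetical sequence length
	    observations = np.zeros(T)       # x[t], provided as self.observations
	    label = np.zeros(T)              # y[t], provided as self.label
	    activation = np.zeros((T, 5))    # columns 0..4 hold c, i, f, o, h
	    h = activation[:, 4]             # the LSTM output sequence, h[t]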

Your goal is to create an LSTM that performs the following task. For every time step, t,

Training Epochs

You'll be tested at epochs -1, 0, 50, and 100. If you run visualize.py, it will run epochs -1 through 140, then display an error convergence curve.

In each of the three gradient-descent cases (epochs 0, 50, and 100), you should perform one update of gradient descent training per epoch. (Hint: if you perform the task perfectly, the error should be zero, and its gradient should also be zero, so gradient descent should leave a perfect design unchanged.)
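Because the gradient of the error with respect to self.model is easy to get wrong, a finite-difference check can help you debug your analytic gradient. This is an optional debugging sketch, not part of the required API; numerical_gradient, error_fn, and learning_rate are hypothetical names:

	    import numpy as np

	    def numerical_gradient(error_fn, model, eps=1e-6):
	        # Central-difference estimate of d(error)/d(model[idx]) for every
	        # entry of the 4x3 model array.  error_fn(model) must return the
	        # scalar error defined in the next section.
	        grad = np.zeros_like(model)
	        for idx in np.ndindex(*model.shape):
	            plus, minus = model.copy(), model.copy()
	            plus[idx] += eps
	            minus[idx] -= eps
	            grad[idx] = (error_fn(plus) - error_fn(minus)) / (2 * eps)
	        return grad

	    # One epoch of gradient descent is then a single update:
	    # model -= learning_rate * numerical_gradient(error_fn, model)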

LSTM Definition

We'll use an LSTM defined exactly as in lecture (and on Wikipedia), except that (1) the cell input nonlinearity (Wikipedia's sigma_c) is the same as the gate nonlinearity (sigma_g), while no output nonlinearity (sigma_h) is applied to the cell, and (2) the error is the mean-squared error, instead of the sum-squared error. Thus:

	    i[t] = g(wi*x[t]+ui*h[t-1]+bi)
	    f[t] = g(wf*x[t]+uf*h[t-1]+bf)
	    o[t] = g(wo*x[t]+uo*h[t-1]+bo)
	    c[t] = f[t]*c[t-1] + i[t]*g(wc*x[t]+uc*h[t-1]+bc)
	    h[t] = o[t]*c[t]
	  
and
	    error = 0.5*np.mean(np.square(self.activation[:,4]-self.label))
	  
where
	    self.model = np.array([[bc,wc,uc],[bi,wi,ui],[bf,wf,uf],[bo,wo,uo]])
	    self.activation[t,:] = c[t], i[t], f[t], o[t], h[t]
	  
For epoch == -1 (knowledge-based design), use the CReLU activation function, g(x) = max(0, min(1, x)), and limit the weights to the range [-1, 1]. For epoch >= 0 (gradient descent), use the logistic activation function, g(x) = 1/(1+exp(-x)); in this case the weight values are not limited. These two activation functions are provided for you in the function self.activation(x), and their derivatives are provided in the function self.derivative().
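To make the definitions above concrete, here is a minimal sketch of the forward pass, assuming scalar inputs and weights as in the equations above. The names (g_crelu, g_logistic, lstm_forward) are illustrative, not part of the required API:

	    import numpy as np

	    def g_crelu(x):
	        # CReLU, used for epoch == -1: g(x) = max(0, min(1, x))
	        return np.clip(x, 0.0, 1.0)

	    def g_logistic(x):
	        # Logistic sigmoid, used for epoch >= 0: g(x) = 1/(1+exp(-x))
	        return 1.0 / (1.0 + np.exp(-x))

	    def lstm_forward(model, x, g):
	        # model is the 4x3 array [[bc,wc,uc],[bi,wi,ui],[bf,wf,uf],[bo,wo,uo]];
	        # x is the observation sequence; g is one of the functions above.
	        # Returns the Tx5 activation matrix with columns c, i, f, o, h.
	        (bc, wc, uc), (bi, wi, ui), (bf, wf, uf), (bo, wo, uo) = model
	        act = np.zeros((len(x), 5))
	        c_prev = h_prev = 0.0
	        for t in range(len(x)):
	            i = g(wi*x[t] + ui*h_prev + bi)
	            f = g(wf*x[t] + uf*h_prev + bf)
	            o = g(wo*x[t] + uo*h_prev + bo)
	            c = f*c_prev + i*g(wc*x[t] + uc*h_prev + bc)
	            h = o*c
	            act[t, :] = [c, i, f, o, h]
	            c_prev, h_prev = c, h
	        return act

	    # Example with random data, just to show the error computation:
	    x = np.random.rand(10)
	    label = np.random.rand(10)
	    act = lstm_forward(np.zeros((4, 3)), x, g_logistic)
	    error = 0.5 * np.mean(np.square(act[:, 4] - label))

	    # For epoch == -1, the hand-designed weights must also lie in [-1, 1],
	    # e.g. model = np.clip(model, -1.0, 1.0).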

Files included in the distribution

What to submit:

The file submitted.py, containing all of the functions that you have written.