Neural networks

In WebPPL, neural networks can be represented as simple parameterized functions. The language includes a number of helper functions that capture common patterns in the shape of these functions. These helpers typically take a name and the desired input and output dimensions of the network as arguments. For example:

var net = affine('net', {in: 3, out: 5});
var out = net(ones([3, 1])); // dims(out) == [5, 1]

Larger networks are built with ordinary function composition. The stack() helper provides a convenient way of composing multiple layers:

var mlp = stack([
  sigmoid,
  affine('layer2', {in: 5, out: 1}),
  tanh,
  affine('layer1', {in: 5, out: 5})
]);
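
Applying mlp evaluates the layers from the end of the array back to the start: layer1 first, then tanh, layer2, and finally sigmoid. For example (a minimal sketch; the input is arbitrary):

var x = ones([5, 1]);
var out = mlp(x); // layer1, tanh, layer2, sigmoid; dims(out) == [1, 1]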

It’s important to note that the parameters of these functions are created when the constructor (e.g. affine()) is called. As a consequence, models should be written so that constructors are called on every evaluation of the model. If a constructor is instead called only once, before Infer or Optimize is called, then the parameters of the network will not be optimized.

// Correct
var model = function() {
  var net = affine('net', opts);
  /* use net */
};
Infer({model: model, /* options */});

// Incorrect
var net = affine('net', opts);
var model = function() {
  /* use net */
};
Infer({model: model, /* options */});
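
As a concrete sketch, here is a complete model written in the correct style, assuming a single made-up scalar observation; the data and the number of optimization steps are illustrative:

var model = function() {
  var net = affine('net', {in: 1, out: 1});
  var mu = T.get(net(Vector([2])), 0); // network output as a scalar
  observe(Gaussian({mu: mu, sigma: 1}), 3);
  return mu;
};
Optimize({model: model, steps: 100});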

Feed forward

affine(name, {in, out[, param, init, initb]})

Returns a parameterized function of a single argument that performs an affine transform of its input. This function maps a vector of length in to a vector of length out.

By default, the weight and bias parameters are created using the param() method. An alternative method (e.g. modelParam()) can be specified using the param option.
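
For example, to create the network's parameters with modelParam rather than param (a one-line sketch):

var net = affine('net', {in: 3, out: 5, param: modelParam});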

The init option can be used to specify how the weight matrix is initialized. It accepts a function that takes the shape of the matrix as its argument and returns a matrix of that shape. When the init option is omitted, Xavier initialization is used.

The initb option specifies the value with which each element of the bias vector is initialized. The default is 0.

Example usage:

var init = function(dims) {
  return idMatrix(dims[0]);
};
var net = affine('net', {in: 10, out: 10, init: init, initb: -1});
var output = net(input);

Recurrent

These functions return a parameterized function of two arguments that maps a state vector of length hdim and an input vector of length xdim to a new state vector. Each application of this function computes a single step of a recurrent network.

rnn(name, {hdim, xdim[, param, output]})

Implements a vanilla RNN. By default, the new state vector is passed through the tanh function before it is returned. The output option can be used to specify an alternative output function.
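
A step function of this kind can be unrolled over an input sequence with explicit recursion. A minimal sketch, assuming arbitrary dimensions and inputs:

var net = rnn('net', {hdim: 3, xdim: 2});
var run = function(h, xs) {
  // Apply one step per input vector, threading the state through.
  return xs.length ? run(net(h, xs[0]), xs.slice(1)) : h;
};
var hFinal = run(zeros([3, 1]), [ones([2, 1]), ones([2, 1])]);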

gru(name, {hdim, xdim[, param]})

Implements a gated recurrent unit. This is similar to the variant described in Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.

lstm(name, {hdim, xdim[, param]})

Implements a long short-term memory (LSTM). This is similar to the variant described in Generating Sequences with Recurrent Neural Networks, with the difference that there are no peephole connections; i.e. the previous memory state is not passed as input to the forget, input, or output gates.

Nonlinear functions

Some nonlinear functions commonly used when building networks. sigmoid, tanh, relu, and softplus are applied element-wise to their argument; softmax instead normalizes its argument so that its entries are nonnegative and sum to one.

sigmoid(tensor)
tanh(tensor)
relu(tensor)
softplus(tensor)
softmax(tensor)
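
For example, softmax can turn the output of an affine layer into a probability vector suitable for sampling. A sketch; the name and dimensions are arbitrary:

var scores = affine('scores', {in: 4, out: 3});
var classify = function(x) {
  var ps = softmax(scores(x)); // [3, 1] vector whose entries sum to one
  return sample(Discrete({ps: ps}));
};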

Other

stack(fns)

Returns the composition of the array of functions fns. The composite function applies the functions in fns in reverse order. That is:

stack([g, f]) == function(x) { return g(f(x)); }