Inspecting gradients in Chainer

Chainer is my framework of choice when it comes to implementing neural networks. It makes working with, and troubleshooting, deep learning models straightforward.

Printing out the gradients during backpropagation is a useful way to check that they look as expected and are neither exploding (values growing too large) nor vanishing (values shrinking towards zero). Fortunately, this is easy to do in Chainer.

Chainer provides access to the parameters of your model through the optimizer (such as SGD or Adam): the optimizer's target is the model itself, and each parameter stores the gradient computed during the backpropagation step in its grad attribute. To print these, you can extend chainer.training.updaters.StandardUpdater with your own updater like so:

import chainer


class CustomStandardUpdater(chainer.training.updaters.StandardUpdater):
    def __init__(self, train_iter, optimizer, device):
        super(CustomStandardUpdater, self).__init__(
            train_iter, optimizer, device=device)

    def update_core(self):
        # Run the usual forward/backward/update step first.
        super(CustomStandardUpdater, self).update_core()
        # The optimizer holds the model as its target link, so we can
        # walk over every registered parameter and print its gradient.
        optimizer = self.get_optimizer('main')
        for name, param in optimizer.target.namedparams(include_uninit=False):
            print(name, param.grad)
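
If printing whole gradient arrays is too verbose, you can summarize each gradient instead. The following is a minimal sketch (the GradientNormUpdater name is just for illustration) that prints the L2 norm of each gradient, which makes exploding or vanishing values easier to spot at a glance:

class GradientNormUpdater(CustomStandardUpdater):
    def update_core(self):
        super(GradientNormUpdater, self).update_core()
        optimizer = self.get_optimizer('main')
        for name, param in optimizer.target.namedparams(include_uninit=False):
            if param.grad is not None:
                # param.xp is numpy on the CPU and cupy on the GPU,
                # so the norm can be computed in either case.
                grad_norm = float(param.xp.linalg.norm(param.grad))
                print('{}: |grad| = {:.6f}'.format(name, grad_norm))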

In the loop at the end of update_core you can see the parameters (weights) of your neural network being accessed through the optimizer's target, and the name and gradient of each parameter being printed. This updater can be attached to your training loop as follows:

from chainer import training

model = MyChainerModel()
optimizer = chainer.optimizers.Adam()
optimizer.setup(model)  # the optimizer must be created before setup() is called
train_iter = chainer.iterators.SerialIterator(train_dataset, batch_size=32, shuffle=True)
updater = CustomStandardUpdater(train_iter, optimizer, device=-1)  # -1 for CPU, or a GPU id
trainer = training.Trainer(updater, stop_trigger=(100, 'epoch'))
trainer.run()
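
The same idea also works outside the Trainer machinery. The snippet below is a self-contained sketch (the TinyMLP model and the random data are made up for illustration): it computes a loss, runs backward(), and then reads the gradient stored in each parameter's grad attribute directly from the model:

import chainer
import chainer.functions as F
import chainer.links as L
import numpy as np


class TinyMLP(chainer.Chain):
    def __init__(self):
        super(TinyMLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(4, 8)
            self.l2 = L.Linear(8, 1)

    def forward(self, x):
        return self.l2(F.relu(self.l1(x)))


model = TinyMLP()
x = np.random.rand(16, 4).astype(np.float32)   # dummy input batch
t = np.random.rand(16, 1).astype(np.float32)   # dummy targets

model.cleargrads()                              # clear any stale gradients
loss = F.mean_squared_error(model.forward(x), t)
loss.backward()                                 # fills param.grad for every parameter

for name, param in model.namedparams():
    print(name, param.grad.shape, float(np.linalg.norm(param.grad)))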
