Wednesday, November 9, 2011

Neural networks part 1: Teaching Canyonero to drive

Artificial neural networks (ANNs) are modeled after natural neural networks (brains and nervous systems). They don't work exactly alike, but both a brain and an ANN can learn arbitrarily complex tasks without being told exactly how to do them - they just need data about the task and feedback on their performance.

A generic artificial neural network.

ANNs have been applied to a lot of artificial intelligence and machine learning problems, from autonomous vehicle driving to recognizing handwritten addresses on envelopes to creating artificial intelligence for video game agents.
  
I won't go deep into the math behind ANNs here; there are great sites on the web that cover it (and it's not really difficult - there's just a lot of bookkeeping).

Instead, I'll take two posts to describe a couple of neural net projects I've worked on. First up: a mobile robot called Canyonero that learned to compensate for its own mismatched wheels.

Canyonero, with a camera in the front and a netbook running an ANN.



Canyonero, The Robot With A Limp
For a cognitive robotics class project, we set up a simple differential drive robot chassis we named Canyonero, after the world's best SUV, "the country-fried truck endorsed by a clown".

Poor Canyonero had much smaller wheels on the left than on the right, so it veered left all the time when driven manually. We wanted the neural net to learn to compensate for this drift.

We strapped on a camera, a netbook running a neural net I coded in C, and a USB gamepad for remote control. We used OpenCV to calculate optic flow from the camera and extract left/right motion and forward/backward speed measurements, which we fed into the network.
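
Roughly, the flow-reduction step can be sketched like this in plain C. This is a minimal sketch, not the original code: it assumes each tracked feature's displacement between frames has already been computed (by something like OpenCV's pyramidal Lucas-Kanade tracker), and the struct and function names are just for illustration.

    #include <stddef.h>
    #include <math.h>

    /* One tracked feature's motion between two frames (hypothetical struct). */
    typedef struct {
        float dx;   /* horizontal displacement in pixels */
        float dy;   /* vertical displacement in pixels   */
    } FlowVector;

    /* Collapse a set of flow vectors into the two numbers fed to the net:
     * a left/right measurement (mean horizontal flow) and a forward/backward
     * measurement (mean flow magnitude). Returns 0 on success, -1 if empty. */
    int summarize_flow(const FlowVector *flow, size_t count,
                       float *lateral, float *speed)
    {
        float sum_dx = 0.0f, sum_mag = 0.0f;
        size_t i;

        if (count == 0)
            return -1;

        for (i = 0; i < count; i++) {
            sum_dx  += flow[i].dx;
            sum_mag += sqrtf(flow[i].dx * flow[i].dx + flow[i].dy * flow[i].dy);
        }

        *lateral = sum_dx  / (float)count;  /* positive = scene drifting right  */
        *speed   = sum_mag / (float)count;  /* larger = faster apparent motion  */
        return 0;
    }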

Block diagram of Canyonero's ANN
 
Straighten Up And Drive Right
Simple training functions looked at the optic flow data and told the net which way it should be driving to compensate for the drift. This is an example of supervised learning - we show the net the output we need, but not how to produce it.
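
A "drive straight" training function can be sketched roughly like this (hypothetical names, target encoding, and sign conventions - not the project's actual code): if the scene is drifting one way in the image, the target output asks the opposite wheel for more power.

    /* Hypothetical sketch: build a two-element training target
     * (left motor, right motor) for "drive straight" training.
     * If the scene is drifting right in the image, the robot is
     * veering left, so the target asks for more left-wheel power. */
    void forward_drive_target(float lateral_flow, float base_speed,
                              float gain, float target[2])
    {
        target[0] = base_speed + gain * lateral_flow;  /* left motor  */
        target[1] = base_speed - gain * lateral_flow;  /* right motor */
    }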

Four buttons on the gamepad trained the network in each of the four directions. Hold down the "turn left" button and Canyonero would lurch off in a random direction (thanks to randomly initialized weights), then gradually correct itself and learn to execute a graceful left turn. Training typically took 250-300 steps to reach this point.

People usually check training results by looking at the root-mean-square (RMS) error between the correct training outputs presented to the net and the net's actual outputs. The graph below shows that error for several "forward" Canyonero training runs. (Again, random initial weights are responsible for the different starting errors.)
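
Computing it is simple: square the difference between each target output and the net's output, average, and take the square root. A minimal version (variable names are mine, not from the project):

    #include <math.h>

    /* Root-mean-square error between target outputs and net outputs. */
    float rms_error(const float *target, const float *actual, int n)
    {
        float sum = 0.0f;
        int i;
        for (i = 0; i < n; i++) {
            float diff = target[i] - actual[i];
            sum += diff * diff;
        }
        return sqrtf(sum / (float)n);
    }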



If your state space doesn't have too many dimensions, you can also look at the state-space evolution of training. The 3D plot below shows a "turn right" training run - the training targets are green and the net's outputs are red. Viewed this way, the convergence is much more dramatic.


Once a given direction was trained, another set of gamepad buttons offered "trained" driving, where the neural net would drive Canyonero in the direction you chose while compensating in real time for the drift from the wheels. In trained drive mode, the neural net acted completely on its own, with no target data.
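
In trained drive mode, each control cycle is just a forward pass through the network. Here's a bare-bones sketch of a forward pass for a single-hidden-layer net with sigmoid units - the layer sizes and input/output layout are assumptions for illustration, not necessarily what ran on Canyonero.

    #include <math.h>

    #define N_IN   3   /* e.g. lateral flow, speed, bias - layout is an assumption */
    #define N_HID  5
    #define N_OUT  2   /* e.g. left and right motor commands */

    static float sigmoid(float x) { return 1.0f / (1.0f + expf(-x)); }

    /* Forward pass: inputs -> hidden layer -> outputs. */
    void feedforward(const float in[N_IN],
                     const float w_ih[N_HID][N_IN],   /* input->hidden weights  */
                     const float w_ho[N_OUT][N_HID],  /* hidden->output weights */
                     float hidden[N_HID], float out[N_OUT])
    {
        int i, j;

        for (j = 0; j < N_HID; j++) {
            float sum = 0.0f;
            for (i = 0; i < N_IN; i++)
                sum += w_ih[j][i] * in[i];
            hidden[j] = sigmoid(sum);
        }

        for (j = 0; j < N_OUT; j++) {
            float sum = 0.0f;
            for (i = 0; i < N_HID; i++)
                sum += w_ho[j][i] * hidden[i];
            out[j] = sigmoid(sum);
        }
    }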

Experimenting With Training
Training a neural net just means adjusting the values of the weights on the connections between neurons, and there are many, many ways to do that. We used the classic backpropagation algorithm, which is just gradient descent on the network's output error.
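
Concretely, for a single-hidden-layer sigmoid net like the forward-pass sketch above (and squared error), one backprop step computes an error term for each output and hidden unit, then moves every weight a small step down the gradient. Again, this is a sketch under the same assumed layout, not the project's actual code.

    /* One backpropagation step for the single-hidden-layer net sketched above
     * (sigmoid units, squared error). Uses the same N_IN/N_HID/N_OUT constants
     * and weight arrays as the forward-pass sketch; eta is the learning rate.
     * Assumes feedforward() has just filled hidden[] and out[]. */
    void backprop_step(const float in[N_IN], const float target[N_OUT],
                       const float hidden[N_HID], const float out[N_OUT],
                       float w_ih[N_HID][N_IN], float w_ho[N_OUT][N_HID],
                       float eta)
    {
        float delta_out[N_OUT], delta_hid[N_HID];
        int i, j, k;

        /* Output-layer error terms: (target - output) * sigmoid derivative. */
        for (k = 0; k < N_OUT; k++)
            delta_out[k] = (target[k] - out[k]) * out[k] * (1.0f - out[k]);

        /* Hidden-layer error terms: propagate the output deltas back. */
        for (j = 0; j < N_HID; j++) {
            float sum = 0.0f;
            for (k = 0; k < N_OUT; k++)
                sum += delta_out[k] * w_ho[k][j];
            delta_hid[j] = hidden[j] * (1.0f - hidden[j]) * sum;
        }

        /* Gradient-descent weight updates. */
        for (k = 0; k < N_OUT; k++)
            for (j = 0; j < N_HID; j++)
                w_ho[k][j] += eta * delta_out[k] * hidden[j];

        for (j = 0; j < N_HID; j++)
            for (i = 0; i < N_IN; i++)
                w_ih[j][i] += eta * delta_hid[j] * in[i];
    }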

The most important parameter is the learning rate, which controls how much the weights change at each step. For Canyonero, keeping the learning rate low (0.05-0.25) struck a decent balance between learning speed and the correctness of the learned outputs (i.e., drift-free driving). In general, if the learning rate is too high, the weights may never converge on acceptable values.

We also added a momentum term, which folds a fraction of the previous weight change into each new adjustment - this can prevent oscillation around a convergence point. As with learning rate, higher values increased learning speed, but if momentum was set too high (>0.5), learning wouldn't converge.

Finally, we incorporated weight decay, which drives the weight values toward zero. This is essentially an "unlearning" term that can help the network avoid learning the noise in the inputs. Predictably, higher values of weight decay caused training to take much longer, or to fail to converge at all.
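
Putting the three knobs together, a single weight update is one line: the gradient term scaled by the learning rate, plus a fraction of the previous update (momentum), minus a small pull toward zero (weight decay). A minimal sketch - the parameter names and exact decay convention are assumptions, not necessarily what Canyonero used:

    /* One weight update combining learning rate (eta), momentum (alpha),
     * and weight decay (lambda). grad is dE/dw for this weight;
     * *prev_step remembers the last update applied to it. */
    float update_weight(float w, float grad, float *prev_step,
                        float eta, float alpha, float lambda)
    {
        float step = -eta * grad            /* gradient descent         */
                   + alpha * (*prev_step)   /* momentum                 */
                   - eta * lambda * w;      /* weight decay toward zero */
        *prev_step = step;
        return w + step;
    }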

Of course, you can balance the various parameters in any way you like. The graph below shows high error as a result of a very high learning rate and lots of momentum (red), contrasted with the same rate and momentum values, plus some weight decay to smooth things out (green). Not a setup I'd recommend, but it can work.



Conclusions
Canyonero was a great platform to experiment with ANNs, and it actually ended up working as planned! My neural network code also came in handy for later projects. I recommend coding your own ANN if you're interested in learning more - it's not actually that difficult, and all the math notation will become infinitely clearer.
