A simple example of learning

In this video I am going to show you an example of machine learning. It is a very simple kind of neural net, and it is going to be learning to recognize digits. You are going to be able to see how the weights evolve as we run a very simple learning algorithm.

So we are going to look at a very simple learning algorithm for training a very simple network to recognize handwritten shapes.

The network has two layers of neurons. It has input neurons, whose activities represent the intensities of pixels, and output neurons, whose activities represent the classes.

What we'd like is that when we show it a particular shape, the output neuron for that shape gets active.

If a pixel is active, what it does is vote for particular shapes, namely the shapes that contain that pixel.

Each inked pixel can vote for several shapes, and the votes can have different intensities. The shape that gets the most votes wins. So we are assuming there is a competition between the output units, and that is something I haven't explained yet and will explain in a later lecture.
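The voting described above can be sketched in a few lines. This is only an illustration: the 3x3 images, the hand-picked weights, and the function names are hypothetical, and the competition between output units is assumed here to be a plain "largest total wins" rule, since the lecture leaves its details for later.

```python
def class_votes(weights, pixels):
    """Total vote each class receives: the sum of its weights from the active pixels."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def winner(weights, pixels):
    """Index of the class with the most votes (assumed rule: ties go to the lowest index)."""
    votes = class_votes(weights, pixels)
    return max(range(len(votes)), key=votes.__getitem__)


# Hypothetical toy setup: 3x3 images flattened to 9 pixels, 1 = inked, 0 = blank.
# Class 0 is a vertical bar (middle column), class 1 a horizontal bar (middle row).
weights = [
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
]
image = [0, 1, 0, 0, 1, 0, 0, 1, 0]  # a vertical bar

votes = class_votes(weights, image)  # [3, 1]
guess = winner(weights, image)       # 0, the vertical-bar class
```

Note that the center pixel votes for both shapes, because both shapes contain it.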

So first, we need to decide how to display the weights. It seems natural to write the weights on the connection between an input unit and an output unit. But we would never be able to see what was going on if we did that. We need a display in which we can see the values of thousands of weights. So the idea is that for each output unit, we make a little map, and in that map we show the strength of the connection coming from each input pixel in the location of that input pixel. We show the strength of each connection by using black and white blobs, whose area represents the magnitude and whose color represents the sign. The initial weights that you see there are just small random weights.
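As a rough text-only stand-in for that display, the sketch below lays out one output unit's weights at the positions of their input pixels. The glyphs are an assumption made for illustration: letter case stands in for blob area and the choice of letter stands in for color, and the thresholds are arbitrary.

```python
def weight_map(unit_weights, width):
    """Arrange one output unit's weights in rows, matching the input pixel layout."""
    return [unit_weights[i:i + width] for i in range(0, len(unit_weights), width)]


def render(unit_weights, width, scale=1.0):
    """Draw the map as text: 'o'/'O' for positive, 'x'/'X' for negative, '.' for near zero."""
    def blob(w):
        if abs(w) < 0.25 * scale:      # too small to draw
            return "."
        big = abs(w) >= 0.75 * scale   # large magnitude gets the capital glyph
        if w > 0:
            return "O" if big else "o"
        return "X" if big else "x"

    rows = weight_map(unit_weights, width)
    return "\n".join(" ".join(blob(w) for w in row) for row in rows)


# Hypothetical weights for one output unit over a 2x2 image.
picture = render([1.0, -1.0, 0.0, 0.5], width=2)
print(picture)
```

One such map per output unit gives a display where thousands of weights can be read at a glance.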

Now what we are going to do is show that network some data and get it to learn weights that are better than the random weights.

The way we are going to learn is this: when we show it an image, we increment the weights from the active pixels in the image to the correct class. If we just did that, the weights could only get bigger, and eventually every class would get huge input whenever we showed it an image.

So we need some way of keeping the weights under control. What we are going to do is also decrement the weights from the active pixels to whatever class the network guesses. So we are really training it to do the right thing, rather than the thing it currently has a tendency to do.

If, of course, it does the right thing, then the increments we make in the first step of the learning rule will exactly cancel the decrements, so nothing will change, which is what we want.
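The two steps of the rule, and the cancellation when the guess is right, can be sketched as follows. The function name, the step size of 1, the all-zero starting weights, and the tie-breaking rule are assumptions made for this illustration; the lecture does not specify them.

```python
def train_step(weights, pixels, correct, step=1):
    """Apply one update of the rule described above and return the network's guess.

    weights[c][i] is the weight from pixel i to class c; it is modified in place.
    """
    votes = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
    # Assumed competition: the largest total wins, ties go to the lowest index.
    guess = max(range(len(votes)), key=votes.__getitem__)
    for i, p in enumerate(pixels):
        if p:  # only the active pixels change their weights
            weights[correct][i] += step  # increment toward the correct class
            weights[guess][i] -= step    # decrement toward the guessed class
    return guess


# Hypothetical toy run: 4 pixels, 2 classes, the image belongs to class 1.
weights = [[0, 0, 0, 0], [0, 0, 0, 0]]
image = [1, 0, 1, 0]

first_guess = train_step(weights, image, correct=1)   # guesses class 0, so weights move
after_first = [row[:] for row in weights]             # snapshot for comparison
second_guess = train_step(weights, image, correct=1)  # guesses class 1, updates cancel
```

After the first step the network already gets this image right, so the second step leaves the weights exactly where they were.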

So, these are the initial weights. Now we are going to show it a few hundred training examples and then look at the weights again. So now the weights have changed; they have started to form regular patterns. And we show it a few hundred more examples, and the weights have changed a little more. And a few hundred more examples, and a few hundred more, and a few hundred more. And now the weights are pretty much at their final values. I'll talk more in future lectures about the precise details of the learning algorithm.

But what you can see is that the weights now look like little templates for the shapes. If you look at the weights going into the one unit, for example, they look just like a little template for identifying ones. They are not quite templates, though. If you look at the weights going into the nine unit, they don't have any positive weights below the halfway line. That's because, for telling the difference between 9s and 7s, the weights below the halfway line aren't much use. You have to tell the difference by deciding whether there is a loop at the top or a horizontal bar at the top. And so those output units are focused on that discrimination.

One thing about this learning algorithm is that, because the network is so simple, it is unable to learn a very good way of discriminating shapes. What it learns is equivalent to having a little template for each shape and then deciding the winner based on which shape has the template that overlaps most with the ink. The problem is that the ways in which handwritten digits vary are much too complicated to be captured by simple template matches of whole shapes. You have to model the allowable variations of a digit by first extracting features and then looking at the arrangements of those features.
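A toy case makes the limitation of whole-shape templates concrete. Everything below is a hypothetical illustration rather than data from the lecture: a 3x3 image, a "bar" shape that may appear in the left or the right column, and overlap computed as the sum of template weight times ink.

```python
def overlap(template, pixels):
    """How much a template overlaps the ink: the sum of weight times pixel value."""
    return sum(t * p for t, p in zip(template, pixels))


# Two variants of the same hypothetical shape: a bar on the left or on the right.
bar_left = [1, 0, 0, 1, 0, 0, 1, 0, 0]
bar_right = [0, 0, 1, 0, 0, 1, 0, 0, 1]
# A different shape that happens to contain both bars.
two_bars = [1, 0, 1, 1, 0, 1, 1, 0, 1]

# A template tuned to one variant misses the other completely.
narrow = bar_left
narrow_scores = (overlap(narrow, bar_left), overlap(narrow, bar_right))  # (3, 0)

# A template widened to cover both variants fits the different shape even better.
wide = [1, 0, 1, 1, 0, 1, 1, 0, 1]
wide_scores = (
    overlap(wide, bar_left),   # 3
    overlap(wide, bar_right),  # 3
    overlap(wide, two_bars),   # 6
)
```

In this toy case, no single template both covers the two variants and rejects the other shape, which is the kind of failure that motivates extracting features first.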

So here are examples we have seen already. If you look at those 2s in the green box, you can see there is no template that will fit all of them well and will fail to fit that 3 in the red box. So the task simply can't be solved by a simple network like that. The network did the best it could, but it can't solve this problem.
