
Three types of learning

In this video, I'm going to talk about three different types of machine learning: supervised learning, reinforcement learning, and unsupervised learning.

Broadly speaking, the first half of the course will be about supervised learning. The second half of the course will be mainly about unsupervised learning, and reinforcement learning will not be covered in the course, because we can't cover everything.

Learning can be divided into three broad groups of algorithms.

  1. In supervised learning, you're trying to predict an output when given an input vector, so it's very clear what the point of supervised learning is.
  2. In reinforcement learning, you're trying to select actions or sequences of actions to maximize the rewards you get, and the rewards may only occur occasionally.
  3. In unsupervised learning you're trying to discover a good internal representation of the input and we'll come later to what that might mean.

Supervised Learning

Supervised learning itself comes in two different flavors.

  1. In regression, the target output is a real number or a whole vector of real numbers, such as the price of a stock in six months time, or the temperature at noon tomorrow. And the aim is to get as close as you can to the correct real number.
  2. In classification, the target output is a class label. The simplest case is a choice between one and zero, that is, between positive and negative cases. But obviously, we can have multiple alternative labels, as when we're classifying handwritten digits.

Supervised learning works by initially selecting a model class, that is, a whole set of models that we're prepared to consider as candidates.

You can think of a model class as a function that takes an input vector and some parameters and gives you an output y. So a model class is simply a way of mapping an input to an output using some numerical parameters W, and then we adjust these numerical parameters to make the mapping fit the supervised training data.
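As a minimal sketch (not code from the course), a model class as a parameterized mapping might look like this, using a hypothetical linear model for concreteness:

```python
import numpy as np

# A model class: a function that maps an input vector x to an output y
# using some numerical parameters W. Learning means adjusting W so that
# the mapping fits the supervised training data.
def model(x, W):
    """Map an input vector x to an output y using parameters W."""
    return W @ x

x = np.array([1.0, 2.0, 3.0])       # input vector
W = np.array([[0.5, -1.0, 0.25]])   # numerical parameters (one output unit)
y = model(x, W)                     # 0.5*1 - 1*2 + 0.25*3 = -0.75
```

Choosing a different W gives a different member of the same model class; supervised learning searches over W.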

What we mean by fit is minimizing the discrepancy between the target output on each training case and the actual output produced by a machine learning system. And an obvious measure of that discrepancy, if we're using real values as outputs, is the square difference between the output from our system y, and the correct output t, and we put in that one-half, so it cancels the two when we differentiate.
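The discrepancy measure just described can be written in a couple of lines; this is a sketch for real-valued outputs, with illustrative function names:

```python
def squared_error(y, t):
    """One-half of the squared difference between output y and target t."""
    return 0.5 * (y - t) ** 2

def error_gradient(y, t):
    """Derivative with respect to y: the one-half cancels the two,
    leaving just (y - t)."""
    return y - t

e = squared_error(3.0, 1.0)   # 0.5 * (3 - 1)**2 = 2.0
g = error_gradient(3.0, 1.0)  # (3 - 1) = 2.0
```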

For classification you could use that measure, but there are other, more sensible measures which we'll come to later, and these more sensible measures typically work better as well.

Reinforcement learning

In reinforcement learning, the output is an actual sequence of actions, and you have to decide on those actions based on occasional rewards.

The goal in selecting each action is to maximize the expected sum of the future rewards, and we typically use a discount factor so that you don't have to look too far in the future. We say that rewards far in the future don't count for as much as rewards that you get fairly quickly.
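The discounted sum of future rewards described above can be sketched in a few lines (a minimal illustration, with an assumed discount factor of 0.9):

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of future rewards with discount factor gamma:
    G = r_0 + gamma*r_1 + gamma**2 * r_2 + ...
    Rewards far in the future count for less than rewards received soon."""
    return sum(gamma ** k * r for k, r in enumerate(rewards))

# A reward of 1 received three steps from now is worth 0.9**3 = 0.729 today.
G = discounted_return([0.0, 0.0, 0.0, 1.0], gamma=0.9)
```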

Reinforcement learning is difficult. It's difficult because the rewards are typically delayed, so it's hard to know exactly which action was the wrong one in a long sequence of actions.

It's also difficult because a scalar reward, especially one that only occurs occasionally, does not supply much information on which to base the changes in parameters. So typically, you can't learn millions of parameters using reinforcement learning, whereas in supervised learning and unsupervised learning, you can. Typically, in reinforcement learning, you're trying to learn dozens of parameters, or maybe a thousand parameters, but not millions.

In this course, we can't cover everything, and so we're not going to cover reinforcement learning, even though it's an important topic.

Unsupervised learning

Unsupervised learning is going to be covered in the second half of the course.

For about 40 years, the machine learning community basically ignored unsupervised learning except for one very limited form called clustering. In fact, they used definitions of machine learning that excluded it. So they defined machine learning, in some textbooks, as mapping from inputs to outputs.

And many researchers thought that clustering was the only form of unsupervised learning. One reason for this is that it's hard to say what the aim of unsupervised learning is. One major aim is to get an internal representation of the input that is useful for subsequent supervised or reinforcement learning. And the reason we might want to do that in two stages is that we don't want to use, for example, the payoffs from reinforcement learning in order to set the parameters for our visual system.

So you can compute the distance to a surface by using the disparity between the images you get in your two eyes. But you don't want to learn to do that computation of distance by repeatedly stubbing your toe and adjusting the parameters in your visual system every time you stub your toe. That would involve stubbing your toe a very large number of times and there's much better ways to learn to fuse two images based purely on the information in the inputs.

Other goals for unsupervised learning are to provide compact, low dimensional representations of the input. So, high-dimensional inputs like images, typically, live on or near a low-dimensional manifold (or several such manifolds in the case of the handwritten digits).

What that means is, even if you have a million pixels, there aren't really a million degrees of freedom in what can happen. There may only be a few hundred degrees of freedom in what can happen. So what we want to do is move from a million pixels to a representation of those few hundred degrees of freedom, which amounts to saying where we are on the manifold. We also need to know which manifold we're on.

A very limited form of this is principal components analysis, which is linear. It assumes that there's one manifold, and the manifold is a plane in the high dimensional space.
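A minimal sketch of that idea, assuming a plain numpy implementation rather than a library routine: project centered data onto its top principal directions, so that points lying near a low-dimensional plane get a low-dimensional coordinate.

```python
import numpy as np

def pca(X, k):
    """Return the k-dimensional PCA projection of data matrix X (n x d):
    coordinates of each point on the best-fitting k-dim plane."""
    Xc = X - X.mean(axis=0)                       # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # project onto top-k directions

rng = np.random.default_rng(0)
# 100 points that live near a 1-D line embedded in 3-D space.
t = rng.normal(size=(100, 1))
X = t @ np.array([[2.0, -1.0, 0.5]]) + 0.01 * rng.normal(size=(100, 3))
Z = pca(X, k=1)   # one coordinate per point recovers position along the line
```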

Another definition of unsupervised learning, or another goal for unsupervised learning, is to provide an economical representation for the input in terms of learned features. If, for example, we can represent the input in terms of binary features, that's typically economical because then it takes only one bit to state the value of a binary feature. Alternatively, we could use a large number of real-valued features but insist that for each input almost all of those features are exactly zero. In that case, for each input we only need to represent a few real numbers, and that's economical.

As I mentioned before, another definition of unsupervised learning, or another goal of unsupervised learning, is to find clusters in the input. Clustering could be viewed as a very sparse code: we have one feature per cluster, and we insist that all the features except one are zero, and that one feature has a value of one.

So clustering is really just an extreme case of finding sparse features.
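The clustering-as-sparse-code idea above can be illustrated in a few lines (a sketch with an assumed helper name):

```python
import numpy as np

def one_hot(cluster_index, n_clusters):
    """Clustering as an extreme sparse code: one feature per cluster,
    all features zero except the one for the assigned cluster, which is one."""
    code = np.zeros(n_clusters)
    code[cluster_index] = 1.0
    return code

c = one_hot(2, 5)   # a point assigned to cluster 2 out of 5
```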

coursera-neural-networks-for-machine-learning/three-types-of-learning-8-min.txt · Last modified: 2016/10/07 12:09 by jherrero