In the simplest terms, the main difference between supervised and unsupervised machine learning is that supervised methods require labeled data: known outputs, often called ground truth, to learn from.
Unsupervised methods, on the other hand, do not need, and by definition cannot use, output labels for training. They are often used when there is no single correct answer and the goal is to understand the general structure of the data. They tend to be less well suited to making explicit predictions and better suited to exploratory questions.
Supervised Machine Learning tasks can be broadly classified into two subgroups: regression and classification.
Unsupervised Machine Learning tasks can be broadly classified into two subgroups: clustering and generative modeling. Dimensionality reduction is a related unsupervised task, but it is not the same thing as generative modeling.
Prediction methods are commonly referred to as supervised learning. Supervised methods attempt to discover the relationship between input attributes and a target attribute.
Regression is the problem of estimating or predicting a continuous quantity. What will be the value of the S&P 500 one month from today? How tall will a child be as an adult? How many of our customers will leave for a competitor this year?
Classification deals with assigning observations to discrete categories, rather than estimating continuous quantities. In the simplest case, there are two possible categories; this case is known as binary classification. Many important questions can be framed in terms of binary classification. Will a given customer leave us for a competitor? Does a given patient have cancer? Does a given image contain a hot dog?
Using demographic data, we wish to predict whether someone is a pet owner. We must “train” a supervised method by first giving it the model's input (the demographic data) and the true output (pet owner or not). Once the method is trained, we can give it unseen demographic data and it will predict who is and is not a pet owner. This is a typical classification problem: either someone owns pets or they don't.
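To make the train-then-predict flow concrete, here is a minimal sketch of a nearest-centroid classifier in plain Python. The two features (age and household size), the data values, and the function names are all assumptions made up for illustration; a real project would more likely use an established library and would scale the features first.

```python
from math import dist

# Hypothetical training data: (age, household_size) -> owns a pet?
# All values are invented for illustration.
X_train = [(25, 1), (31, 1), (38, 4), (45, 5), (52, 3), (23, 2)]
y_train = [False, False, True, True, True, False]


def fit_centroids(X, y):
    """'Training': compute the mean feature vector (centroid) of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


centroids = fit_centroids(X_train, y_train)

# Predictions for two unseen people.
print(predict(centroids, (41, 4)))
print(predict(centroids, (27, 1)))
```

The labels in `y_train` are what make this supervised: without them there would be no classes to compute centroids for.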
Again using demographic data, we wish to predict a person’s income. A supervised method would train on the demographic data (input) and the true incomes of people (output). Because income varies over a continuous range, this is a classic regression problem. Given new, unseen demographic data, the trained model produces an income estimate, and we would hope that the estimate changes in ways that make sense: for example, rising with education or experience, or varying by location.
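The regression case can be sketched in the same spirit with ordinary least squares on a single input. The choice of years of education as the only feature, the income figures, and the helper names are assumptions for illustration only; real income models would use many inputs.

```python
# Hypothetical data: years of education -> annual income (invented numbers).
education_years = [10, 12, 14, 16, 18, 20]
income = [28000, 35000, 41000, 52000, 60000, 66000]


def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(education_years, income)


def predict_income(years):
    """Estimate income for an unseen number of years of education."""
    return slope * years + intercept


print(round(predict_income(15)))
```

Unlike the classifier, the output here is a number on a continuous scale, and a positive `slope` corresponds to the expectation that the income estimate goes up with education.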
Supervised learning tasks find patterns where we have a dataset of “right answers” to learn from. Unsupervised learning tasks find patterns where we don’t. This may be because the “right answers” are unobservable, or infeasible to obtain, or maybe for a given problem, there isn’t even a “right answer” per se.
A large subclass of unsupervised tasks is the problem of clustering. Clustering refers to grouping observations together in such a way that members of a common group are similar to each other, and different from members of other groups.
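One common way to do this grouping is k-means. The sketch below is a bare-bones version in plain Python, assuming made-up two-dimensional observations and a deliberately simple initialization (the first k points); it is meant to show the idea rather than serve as a robust implementation.

```python
from math import dist

# Hypothetical 2-D observations (e.g. age, household size); values are invented.
points = [(22, 1), (25, 2), (24, 1), (58, 4), (61, 5), (60, 3)]


def kmeans(points, k, n_iter=10):
    """Plain k-means (Lloyd's algorithm), seeded with the first k points."""
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Note that no labels are supplied anywhere: the groups emerge only from the distances between observations, and the cluster numbers themselves carry no meaning.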
A very interesting class of unsupervised tasks is generative modeling. Generative models are models that imitate the process that generates the training data. A good generative model would be able to generate new data that resembles the training data in some sense. This type of learning is unsupervised because the process that generates the data is not directly observable; only the data itself is observable.
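About the simplest possible generative model is a single probability distribution fitted to the data. The sketch below assumes the data are roughly normally distributed, estimates the mean and standard deviation, and then samples new values. The height values and the function name are invented for illustration; practical generative models are far richer than this.

```python
import random
from statistics import mean, stdev

# Hypothetical training data: observed adult heights in cm (invented values).
heights = [158.0, 162.5, 165.0, 168.0, 170.5, 171.0, 174.0, 177.5, 181.0, 185.0]

# "Training": estimate the parameters of an assumed normal distribution.
mu = mean(heights)
sigma = stdev(heights)


def sample_heights(n, seed=0):
    """Draw n new synthetic heights from the fitted distribution."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


synthetic = sample_heights(1000)

# The synthetic data should resemble the training data in its summary statistics.
print(round(mu, 1), round(mean(synthetic), 1))
print(round(sigma, 1), round(stdev(synthetic), 1))
```

The fitted distribution stands in for the unobservable process that produced the heights; only the observed values were used to estimate it.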
Given some demographic data, can we identify subgroups of the population? What are the ways that subgroups of people can be defined, such as by race, or gender, or other measurable factors? How do different subgroups relate and intermix with others?
Again using demographic data, but this time with a timestamp so that individuals have multiple records, can we identify those members of the population that have started doing something different over a time period? Have people changed demographic subgroups? When a change is made in one demographic indicator, what are the other indicators that change at the same time?
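One hedged way to approach the first of these questions is to assign each record to a subgroup (for example with a clustering step) and then compare each person's earliest and latest assignment. The sketch below covers only that comparison; the record layout, the identifiers, and the subgroup labels are hypothetical and assumed to come from an earlier step.

```python
from collections import defaultdict

# Hypothetical records: (person_id, year, subgroup label from a prior clustering step).
records = [
    ("a", 2019, 0), ("a", 2021, 0),
    ("b", 2019, 0), ("b", 2021, 1),
    ("c", 2020, 1), ("c", 2022, 1),
]


def changed_subgroup(records):
    """Return ids whose subgroup differs between their earliest and latest record."""
    history = defaultdict(list)
    for person_id, year, label in records:
        history[person_id].append((year, label))
    changed = []
    for person_id, entries in history.items():
        entries.sort()  # tuples sort by year first
        if entries[0][1] != entries[-1][1]:
            changed.append(person_id)
    return changed


print(changed_subgroup(records))
```

Here only person `"b"` moves between subgroups. Finding which other indicators shift at the same time would need the underlying demographic fields, which this sketch leaves out.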