Introduction to Supervised and Ensemble Learning

Learning from data in order to make predictions is a central problem in computer science, and many learning algorithms have been developed to tackle it. It is well known that prediction performance can be improved by combining several individual predictions obtained from distinct learning processes. Such combination techniques are called ensemble methods.

Theory and practice in supervised learning have a long history, and many learning algorithms have been proposed in the literature. A learning algorithm is a technique for learning a mapping from an input space to an output space from a given dataset. The task of the learner is to approximate the mapping function Ω so as to achieve a low error on unknown test examples, i.e., examples not seen during training; in other words, the algorithm must generalize to new data points. Input and output spaces are typically high dimensional, and in that case data sparseness becomes a major issue, since the available examples cover only a tiny fraction of the space.
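A minimal sketch of this supervised setting follows: a model is fit on training examples of an unknown mapping and then evaluated on held-out test examples it has never seen. The synthetic sine data, the decision tree model, and all parameter values are illustrative assumptions, not prescriptions from the text.

```python
# Sketch of the supervised setting: approximate an unknown mapping
# from a finite dataset, then measure error on unseen test examples.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))            # input space (1-D here for clarity)
y = np.sin(X).ravel() + rng.normal(0, 0.2, 500)  # noisy samples of the true mapping

# Hold out test examples the learner never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = DecisionTreeRegressor(max_depth=4).fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```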

Applying different algorithms to the same problem yields different approximations of the mapping function Ω. This diversity can be exploited to construct a combined approximation whose error on the test set is lower than the error of any individual algorithm.
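The following sketch illustrates the combination idea: three different algorithms are fit on the same training data and their predictions are averaged. On many problems the averaged prediction attains lower test error than each constituent model, though this is not guaranteed in every case; the specific models, data generator, and parameters below are assumptions made for illustration.

```python
# Combine the approximations of Ω produced by three distinct algorithms
# by averaging their predictions (a simple ensemble).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, 500)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

models = [DecisionTreeRegressor(max_depth=5),
          KNeighborsRegressor(n_neighbors=10),
          SVR(C=1.0)]
preds = [m.fit(X_train, y_train).predict(X_test) for m in models]

for m, p in zip(models, preds):
    print(type(m).__name__, mean_squared_error(y_test, p))

# The ensemble prediction is the mean of the individual predictions.
print("ensemble:", mean_squared_error(y_test, np.mean(preds, axis=0)))
```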

Generalization is a major concern in all supervised models. There is always a tradeoff between how closely the model fits the training data and how well it generalizes, known as the bias-variance tradeoff. If the fit is too accurate, the model may not perform well on new data (overfitting). If the model is not powerful enough to capture the underlying function, the error on unknown data is also high (underfitting).
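The tradeoff can be made concrete by fitting polynomial models of increasing degree to the same noisy data: a low degree underfits (high bias), while a very high degree fits the training points almost perfectly but generalizes poorly (high variance). The data generator and the degree choices below are illustrative assumptions.

```python
# Bias-variance tradeoff: compare training and test error as model
# capacity (polynomial degree) grows.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, 60)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=2)

for degree in (1, 4, 15):   # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree {degree:2d}: "
          f"train MSE {mean_squared_error(y_train, model.predict(X_train)):.3f}, "
          f"test MSE {mean_squared_error(y_test, model.predict(X_test)):.3f}")
```

Typically the degree-1 model shows high error on both sets, while the degree-15 model drives the training error near zero but has a much larger test error, which is exactly the overfitting behavior described above.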