What Is Meta-Learning in Machine Learning?
Meta-learning in machine learning refers to learning algorithms that learn from other learning algorithms.
Most commonly, this means machine learning algorithms that learn how to best combine the predictions from other machine learning algorithms, in the field of ensemble learning.
However, meta-learning can also refer to the manual process of model selection and algorithm tuning performed by a practitioner on a machine learning project, which modern AutoML algorithms aim to automate. It also refers to learning across multiple related predictive modeling tasks, referred to as multi-task learning, where meta-learning algorithms learn how to learn.
In this tutorial, you will discover meta-learning in machine learning.
After completing this tutorial, you will know:
- Meta-learning refers to machine learning algorithms that learn from the output of other machine learning algorithms.
- Meta-learning algorithms typically refer to ensemble learning algorithms, like stacking, that learn how to combine the predictions from ensemble members.
- Meta-learning also refers to algorithms that learn how to learn across a suite of related prediction tasks, referred to as multi-task learning.
Tutorial Overview
This tutorial is divided into five parts; they are:
1. What Is Meta?
2. What Is Meta-Learning?
3. Meta-Algorithms, Meta-Classifiers, and Meta-Models
4. Model Selection and Tuning as Meta-Learning
5. Multi-Task Learning as Meta-Learning
What Is Meta?
Meta refers to a level above.
Meta typically means raising the level of abstraction one step, and often refers to information about something else.
For example, you are likely familiar with "meta-data," which is data about data.
You store data in a file, and a common example of metadata is data about the data stored in the file, such as:
- The name of the file
- The size of the file
- The date the file was created
- The date the file was last modified
- The type of file
- The path to the file
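As a small illustration, file metadata like this can be read with Python's standard library (the file name used here is just an example created for the demonstration):

```python
# A minimal sketch of reading "data about data" for a file.
from pathlib import Path

path = Path("example.txt")
path.write_text("hello, metadata")  # create a file to inspect

stat = path.stat()
metadata = {
    "name": path.name,            # the name of the file
    "size_bytes": stat.st_size,   # the size of the file
    "modified": stat.st_mtime,    # last-modified timestamp
    "suffix": path.suffix,        # the type of file
    "path": str(path.resolve()),  # the path to the file
}
print(metadata)
path.unlink()  # clean up the example file
```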
Now that we are familiar with the idea of "meta," let's consider the use of the term in machine learning, such as "meta-learning."
What Is Meta-Learning?
Meta-learning means learning about learning.
Meta-learning in machine learning most commonly refers to machine learning algorithms that learn from the output of other machine learning algorithms.
On a machine learning project where we are trying to discover (learn) which algorithm performs best on our data, we could think of a machine learning algorithm taking our place, at least to some degree.
Machine learning algorithms learn from historical data. For example, supervised learning algorithms learn how to map examples of input patterns to examples of output patterns to address classification and regression predictive modeling problems.
Algorithms are trained on historical data directly to produce a model. The model can then be used later to predict output values, such as a number or a class label, for new examples of input.
- Learning Algorithms: Learn from historical data and make predictions given new examples of data.
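As a minimal sketch (assuming scikit-learn is available, and using a synthetic dataset for illustration), a learning algorithm fits on historical data and then predicts for new inputs:

```python
# A sketch of a learning algorithm: fit on historical data,
# then predict output values for new examples of input.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# "Historical" data: input patterns X and output patterns y.
X, y = make_classification(n_samples=200, n_features=5, random_state=1)

model = LogisticRegression()
model.fit(X, y)  # learn a mapping from inputs to outputs

# Predict class labels for new examples (reusing rows here for brevity).
new_predictions = model.predict(X[:3])
print(new_predictions)
```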
Meta-learning algorithms learn from the output of other machine learning algorithms that learn from data. This means that meta-learning requires other learning algorithms that have already been trained on data.
For example, supervised meta-learning algorithms learn how to map examples of output from other learning algorithms (such as predicted numbers or class labels) onto examples of target values for classification and regression problems.
Similarly, meta-learning algorithms make predictions by taking the output from existing machine learning algorithms as input and predicting a number or class label.
- Meta-Learning Algorithms: Learn from the output of learning algorithms and make a prediction given predictions made by other models.
In this way, meta-learning occurs one level above machine learning.
If machine learning learns how to best use information in data to make predictions, then meta-learning, or meta machine learning, learns how to best use the predictions from machine learning algorithms to make predictions.
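This one-level-up idea can be sketched by hand (scikit-learn assumed; the dataset and choice of models are illustrative): base models are trained on the data directly, and a meta-model is trained on their predictions:

```python
# A sketch of meta-learning: a meta-model learns from the *outputs*
# of base models rather than from the raw data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=1)
X_base, X_meta, y_base, y_meta = train_test_split(X, y, random_state=1)

# Level 0: base learners trained on the data directly.
base_models = [LogisticRegression(), DecisionTreeClassifier(random_state=1)]
for m in base_models:
    m.fit(X_base, y_base)

# Level 1: the meta-model's inputs are the base models' predictions.
meta_features = np.column_stack([m.predict(X_meta) for m in base_models])
meta_model = LogisticRegression()
meta_model.fit(meta_features, y_meta)
print(meta_model.predict(meta_features[:5]))
```

Holding out separate data for the meta-model avoids training it on predictions the base models have effectively memorized.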
Now that we are familiar with the concept of meta-learning, let's look at some examples of meta-learning algorithms.
Meta-Algorithms, Meta-Classifiers, and Meta-Models
Meta-learning algorithms are often referred to simply as meta-algorithms or meta-learners.
- Meta-Algorithm: Short-hand for a meta-learning machine learning algorithm.
Similarly, meta-learning algorithms for classification tasks may be referred to as meta-classifiers, and meta-learning algorithms for regression tasks may be referred to as meta-regressors.
- Meta-Classifier: Meta-learning algorithm for classification predictive modeling tasks.
- Meta-Regressor: Meta-learning algorithm for regression predictive modeling tasks.
After a meta-learning algorithm is trained, the result is a meta-learning model, e.g., the specific rules, coefficients, or structure learned from data. The meta-learning model, or meta-model, can then be used to make predictions.
- Meta-Model: Result of running a meta-learning algorithm.
The most widely known meta-learning algorithm is called stacked generalization, or stacking for short.
Stacking is likely the most-popular meta-learning strategy.
Stacking is a type of ensemble learning algorithm. Ensemble learning refers to machine learning algorithms that combine the predictions from two or more predictive models. Stacking uses another machine learning model, a meta-model, to learn how to best combine the predictions of the contributing ensemble members.
In order to induce a meta-classifier, first the base classifiers are trained (first stage), and then the meta-classifier (second stage). In the prediction phase, base classifiers will output their classifications, and then the meta-classifier(s) will make the final classification (as a function of the base classifiers).
As such, the stacking ensemble algorithm is referred to as a type of meta-learning, as it learns how to combine the predictions made by ensemble members.
There are also lesser-known ensemble learning algorithms that use a meta-model to learn how to combine the predictions from other machine learning models. Most notably, the mixture of experts uses a gating model (the meta-model) to learn how to combine the predictions of expert models.
By using a meta-learner, this method tries to induce which classifiers are reliable and which are not.
More generally, meta-models for supervised learning are almost always ensemble learning algorithms, and any ensemble learning algorithm that uses another model to combine the predictions from ensemble members may be referred to as a meta-learning algorithm.
Rather, stacking introduces the concept of a metalearner […] Stacking tries to learn which classifiers are the reliable ones, using another learning algorithm – the metalearner – to discover how best to combine the output of the base learners.
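As a sketch, scikit-learn provides a built-in StackingClassifier that wires base estimators to a final meta-estimator (the specific estimators and synthetic dataset here are illustrative choices):

```python
# A sketch of stacking: base estimators plus a final meta-estimator
# that learns how to combine their predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=1)

stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(random_state=1)),
        ("svm", SVC()),
    ],
    final_estimator=LogisticRegression(),  # the meta-classifier
    cv=5,  # out-of-fold predictions are used to train the meta-model
)
stack.fit(X, y)
print(round(stack.score(X, y), 3))
```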
Model Selection and Tuning as Meta-Learning
Training a machine learning algorithm on a historical dataset is a search process.
The internal structure, rules, or coefficients that make up the model are modified against some loss function. This type of search process is referred to as optimization, as we are not merely seeking a solution, but a solution that maximizes a performance metric, such as classification accuracy, or minimizes a loss score, such as prediction error.
This idea of learning as optimization is not merely a useful metaphor; it is the literal computation performed at the heart of most machine learning algorithms, either analytically (least squares), numerically (gradient descent), or via some hybrid optimization procedure.
One level above training a model, meta-learning involves finding a data preparation procedure, learning algorithm, and learning algorithm hyperparameters (the full modeling pipeline) that result in the best score for a performance metric on the test harness.
This, too, is an optimization process that is usually performed by a human agent.
As such, we could think of ourselves as meta-learners on a machine learning project.
This is not the typical meaning of the term, but it is a valid usage. It covers activities such as model selection and algorithm hyperparameter tuning.
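As a sketch of this search, a grid search over a modeling pipeline (data preparation plus learning algorithm) automates part of what the practitioner-as-meta-learner does; scikit-learn and the particular hyperparameter grid are assumptions for illustration:

```python
# A sketch of model selection and hyperparameter tuning as a search
# over the full modeling pipeline.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=1)

pipeline = Pipeline([
    ("scale", StandardScaler()),      # data preparation step
    ("model", LogisticRegression()),  # learning algorithm
])

# The search space: candidate hyperparameters for the pipeline.
grid = {"model__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```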
Automating this process is generally referred to as automated machine learning, shortened to "AutoML."
The user simply provides data, and the AutoML system automatically determines the approach that performs best for this particular application. Thereby, AutoML makes state-of-the-art machine learning approaches accessible to domain scientists who are interested in applying machine learning but do not have the resources to learn about the technologies behind it in detail.
AutoML may not be referred to as meta-learning, but AutoML algorithms may use meta-learning across learning tasks, referred to as learning to learn.
Meta-learning, or learning to learn, is the science of systematically observing how different machine learning approaches perform on a wide range of learning tasks, and then learning from this experience, or meta-data, to learn new tasks much faster than otherwise possible.
Multi-Task Learning as Meta-Learning
Learning to learn is a related field of study that is also commonly referred to as meta-learning.
If learning involves an algorithm that improves with experience on a task, then learning to learn is an algorithm that is used across multiple tasks and improves with experience and with tasks.
An algorithm is said to learn to learn if its performance at each task improves with experience and with the number of tasks.
Rather than manually developing an algorithm for each task, or selecting and tuning an existing algorithm for each task, learning to learn algorithms adjust themselves based on a collection of similar tasks.
Meta-learning provides an alternative paradigm where a machine learning model gains experience over multiple learning episodes – often covering a distribution of related tasks – and uses this experience to improve its future learning.
This is referred to as the problem of multi-task learning.
Algorithms that are developed for multi-task learning problems learn how to learn and may be described as performing meta-learning.
The idea of using learning to learn, or meta-learning, to acquire knowledge or inductive biases has a long history.
- Learning to Learn: Applying learning algorithms to multi-task learning problems in which they perform meta-learning across the tasks, e.g., learning about learning on the tasks.
This includes familiar techniques such as transfer learning, which is common in deep learning algorithms for computer vision. Here, a deep neural network is trained on one computer vision task and is used as the starting point, perhaps with very little modification or training, for a related vision task.
Transfer learning works well when the features automatically extracted by the network from the input images are useful across multiple related tasks, such as the abstract features extracted from common objects in photographs.
This is typically understood in a supervised learning setting, where the input is the same but the target may be of a different nature. For example, we may learn about one set of visual categories, such as cats and dogs, in the first setting, then learn about a different set of visual categories, such as ants and wasps, in the second setting.
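A rough sketch of the idea (scikit-learn assumed; real transfer learning typically uses deep networks on image data): train a small network on one set of classes, then reuse its learned hidden layer as a fixed feature extractor for a different set of classes:

```python
# A sketch of transfer learning: features learned on one task are
# reused as the starting point for a related task.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
task_a = y < 5   # first task: digits 0-4
task_b = y >= 5  # related task: digits 5-9

# Train a small network on task A only.
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=1)
net.fit(X[task_a], y[task_a])

# Reuse the hidden layer (ReLU activations) as a feature extractor.
def extract(features):
    return np.maximum(0, features @ net.coefs_[0] + net.intercepts_[0])

# Train only a new output "head" for task B on the transferred features.
head = LogisticRegression(max_iter=1000)
head.fit(extract(X[task_b]), y[task_b])
print(round(head.score(extract(X[task_b]), y[task_b]), 3))  # in-sample, for brevity
```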
Further Reading
This portion of the blog furnishes additional resources on the subject if you are seeking to delve deeper.
Papers
Learning to Learn: Introduction and Overview, 1998.
Learning to learn by Gradient Descent by Gradient Descent, 2016.
Meta-learning in Neural Networks: A Survey, 2020.
Books
Ensemble Methods, 2012.
Pattern Classification Using Ensemble Methods, 2010.
Data Mining: Practical Machine Learning Tools and Techniques, 2016.
Deep Learning, 2016.
Automated Machine Learning: Methods, Systems, Challenges, 2019.
Articles
Meta, Wikipedia
Metamodeling, Wikipedia
Meta learning (computer science), Wikipedia
Conclusion
In this tutorial, you discovered meta-learning in machine learning.
Specifically, you learned:
- Meta-learning refers to machine learning algorithms that learn from the output of other machine learning algorithms.
- Meta-learning algorithms typically refer to ensemble learning algorithms, like stacking, that learn how to combine the predictions from ensemble members.
- Meta-learning also refers to algorithms that learn how to learn across a suite of related prediction tasks, referred to as multi-task learning.