Deep kernel transfer

A human approach to solving a new problem involves:

  • Making inferences beyond the information available.
  • Reusing previous experiences.
  • Weighing alternatives in the face of uncertainty.

A primary objective of machine learning is to incorporate these abilities into artificial systems.

We use Gaussian Processes (GPs) to address the problem of few-shot learning. Few-shot learning (FSL) is a machine learning setting in which the training dataset contains only a handful of labeled examples per task; a GP conditioned on those few examples returns both a prediction and a calibrated uncertainty estimate, as the sketch below illustrates.
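As a quick illustration of why GPs suit this setting, here is a minimal sketch in plain NumPy. The names (`rbf`, `gp_posterior`) and the five-point sine task are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def rbf(x1, x2, ell=1.0, var=1.0):
    """Standard RBF (squared-exponential) kernel on 1-D inputs."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / ell**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """GP posterior mean and variance given a handful of examples."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    Kss = rbf(x_test, x_test)
    mean = Ks.T @ np.linalg.solve(K, y_train)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

# Five "shots" are enough to condition the posterior:
x_s = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_s = np.sin(x_s)
mu, var = gp_posterior(x_s, y_s, np.linspace(-3, 3, 50))
```

The posterior variance grows away from the five support points, which is exactly the calibrated uncertainty the few-shot setting calls for.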

In this paper, we describe a simple yet effective variant of deep kernel learning, called deep kernel transfer, that transfers the kernel across tasks.

Deep kernel learning?

For many problems, a large amount of relevant data is available, but far less explicit knowledge.

Predictions about a target variable may draw on data sources that are noisy but abundant, or on data in which the target variable is a function of the source data. What currently prevents fusing such sources is a lack of interpretable and flexible machine-learning methods.

Deep Gaussian Processes (DGPs) are generalized so that the GPs in intermediate layers can represent posterior distributions summarizing data from a related source.

The resulting prior-posterior stack of GPs is then approximated with a single GP.

The second moment of the DGP is computed analytically and taken as the kernel function of that single GP.
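As a minimal sketch of this analytic second-moment computation, assume an RBF outer kernel and a zero-mean inner GP g with its own RBF kernel (both choices, and the function names, are illustrative assumptions). Since d = g(x) − g(x′) is Gaussian with variance s² = k_g(x, x) + k_g(x′, x′) − 2 k_g(x, x′), the expectation E[exp(−d²/(2ℓ²))] has the closed form (1 + s²/ℓ²)^(−1/2), which serves as the effective single-GP kernel:

```python
import numpy as np

def inner_rbf(x1, x2, ell=1.0, var=1.0):
    """Kernel of the inner-layer GP g (an illustrative RBF choice)."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / ell**2)

def effective_deep_kernel(x1, x2, ell_out=1.0, var_out=1.0):
    """Analytic second moment E[k_out(g(x), g(x'))] of the two-layer
    composition: for d = g(x) - g(x') ~ N(0, s2),
    E[exp(-d^2 / (2 ell^2))] = (1 + s2 / ell^2) ** -0.5."""
    kxx = np.diag(inner_rbf(x1, x1))[:, None]   # k_g(x, x)
    kyy = np.diag(inner_rbf(x2, x2))[None, :]   # k_g(x', x')
    kxy = inner_rbf(x1, x2)                     # k_g(x, x')
    s2 = kxx + kyy - 2.0 * kxy                  # Var[g(x) - g(x')]
    return var_out / np.sqrt(1.0 + s2 / ell_out**2)

x = np.linspace(-2, 2, 5)
K = effective_deep_kernel(x, x)                 # effective kernel matrix
```

Because it is an expectation of positive-semidefinite kernels, the result is itself a valid kernel.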

Even with limited direct observations, this kernel captures correlations induced by the function composition, reflects the structure of observations from other data sources, and can be used to make predictions.

Prior-posterior DGPs can therefore be viewed as a novel, data-driven way of composing and blending kernels across layers.

Two synthetic multi-source prediction problems are discussed: a) predicting a target variable that is only a function of the source data and b) predicting noise-free data using a kernel trained on noisy data. 

On synthetic data, data-informed approximate DGPs provide better predictions and tighter uncertainty estimates than standard GPs and other DGP methods.

Bayesian meta-learning for the few-shot setting via deep kernels

Different machine learning methods have recently been introduced to address the challenge of few-shot learning: learning from a small dataset of labeled examples related to a specific task.

Meta-learning is a common approach. Recognizing that meta-learning amounts to a multi-level model, the paper proposes a Bayesian treatment of the meta-learning inner loop based on deep kernels.

This makes it possible to learn a kernel that also transfers to new tasks, a method called Deep Kernel Transfer (DKT).

This approach is straightforward to implement as a single optimizer, provides uncertainty quantification, and requires no task-specific parameter estimation; a minimal sketch follows below.
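Here is a minimal sketch of the single-optimizer structure, under loud assumptions: a small MLP in place of the convolutional backbones typically used for few-shot image tasks, toy sine-wave regression tasks, and hypothetical helper names (`DeepRBFKernel`, `gp_marginal_nll`). A full implementation would typically use a GP library such as GPyTorch:

```python
import torch
import torch.nn as nn

class DeepRBFKernel(nn.Module):
    """Deep kernel: an MLP feature extractor followed by an RBF kernel
    in feature space. All parameters are shared across tasks."""
    def __init__(self, in_dim, feat_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, feat_dim))
        self.log_ell = nn.Parameter(torch.zeros(()))   # RBF lengthscale
        self.log_var = nn.Parameter(torch.zeros(()))   # RBF variance

    def forward(self, x1, x2):
        z1, z2 = self.net(x1), self.net(x2)
        d2 = torch.cdist(z1, z2).pow(2)
        return self.log_var.exp() * torch.exp(-0.5 * d2 / self.log_ell.exp() ** 2)

def gp_marginal_nll(kernel, x, y, noise=1e-2):
    """Negative GP marginal log-likelihood for one task (constant dropped)."""
    K = kernel(x, x) + noise * torch.eye(len(x))
    L = torch.linalg.cholesky(K)
    alpha = torch.cholesky_solve(y.unsqueeze(-1), L)
    return 0.5 * y @ alpha.squeeze(-1) + torch.log(torch.diagonal(L)).sum()

# DKT-style outer loop: a single optimizer over the shared kernel
# parameters, with no per-task parameters to estimate.
kernel = DeepRBFKernel(in_dim=1)
opt = torch.optim.Adam(kernel.parameters(), lr=1e-3)
for step in range(1000):
    x = torch.rand(20, 1) * 6 - 3                  # inputs of one toy task
    amp, phase = torch.rand(2)                     # random sine-wave task
    y = ((amp + 0.5) * torch.sin(x + phase)).squeeze(-1)
    opt.zero_grad()
    gp_marginal_nll(kernel, x, y).backward()
    opt.step()
```

At test time the learned kernel is simply conditioned on a new task's support set, exactly as in standard GP regression, which is where the "transfer" happens.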

The empirical results show that DKT outperforms several state-of-the-art algorithms in few-shot classification, cross-domain adaptation, and regression.

It is concluded that a simpler Bayesian model can replace complex meta-learning routines without compromising accuracy.

Transfer of learning matrix

An effective way to present the transfer of learning is through a matrix: a Transfer of Learning matrix collects suggestions adapted from the research literature and from the experiences of supervisors, trainers, and learners.

Conclusion

Few-shot learning (FSL) deals with training datasets that contain only limited information, and several machine learning methods have recently been introduced to address the challenge of learning from a small set of labeled examples for a specific task. The results discussed above suggest that a simpler Bayesian model, such as Deep Kernel Transfer, can replace complex meta-learning routines without compromising accuracy.