A Brief Review of Active Learning

Introduction

The field of Machine Learning (ML) broadly deals with giving computers the ability to perform tasks the way humans do. We are now seeing human-level performance in specialized tasks such as speech recognition and image recognition. ML has many sub-fields under active research, each taking a different approach to learning from data. Supervised ML uses labeled input-output pairs to approximate a function, so that it can perform regression or classification on future unseen data. The major assumption here is that data labeled by subject experts is available. In many practical scenarios, the scarcity of labeled data or the cost of labeling by domain experts limits the applicability of supervised ML algorithms. Active Learning (AL) addresses this problem by learning efficiently from minimal interaction with humans.

Active Learning

The adaptive nature of AL algorithms lets them prompt experts to label only the most informative data. This is analogous to a student asking an instructor for more examples of the specific problems he or she finds confusing. It enables AL to learn quickly from sparsely labeled data. The selection is often done through an acquisition function, which scores unlabeled samples and picks the difficult-to-predict ones for labeling. The process is repeated until a stopping criterion is met. The stopping criterion can be problem-specific and is often a trade-off between desired performance and labeling cost.
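
The query-label-refit loop described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not a reference implementation: the data is one-dimensional, the "model" is a single decision threshold, and the names oracle, fit_interval, acquisition, and active_learning are hypothetical helpers invented for this example. The oracle function stands in for the human expert, and its true_threshold value of 0.62 is arbitrary.

```python
def oracle(x, true_threshold=0.62):
    """Stand-in for the human expert (assumption): returns the true label."""
    return int(x > true_threshold)


def fit_interval(labeled):
    """Toy model: the decision boundary must lie between the largest
    labeled negative and the smallest labeled positive."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    lo = max(negatives) if negatives else 0.0
    hi = min(positives) if positives else 1.0
    return lo, hi


def acquisition(x, boundary):
    """Higher score means more informative. Samples close to the current
    boundary estimate are the hardest for the model to predict."""
    return -abs(x - boundary)


def active_learning(pool, budget, tolerance):
    """Query labels one at a time until the stopping criterion is met:
    either the labeling budget is spent or the remaining uncertainty
    about the boundary is smaller than `tolerance`."""
    pool = list(pool)
    labeled = []
    lo, hi = fit_interval(labeled)
    while pool and len(labeled) < budget and (hi - lo) >= tolerance:
        boundary = (lo + hi) / 2.0
        # Pick the unlabeled sample with the highest acquisition score.
        query = max(pool, key=lambda x: acquisition(x, boundary))
        pool.remove(query)
        labeled.append((query, oracle(query)))  # ask the expert
        lo, hi = fit_interval(labeled)          # refit the model
    return (lo + hi) / 2.0, labeled


# Usage: 1001 evenly spaced unlabeled samples in [0, 1].
pool = [i / 1000 for i in range(1001)]
estimate, labeled = active_learning(pool, budget=20, tolerance=0.01)
print(f"boundary estimate {estimate:.3f} after {len(labeled)} labels")
```

In this toy setting the loop behaves like a binary search, so it needs only a handful of labels, whereas labeling samples in random order would typically need far more to reach the same precision. The budget and tolerance arguments make the trade-off between labeling cost and desired performance explicit.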

Some Applications

One application area of AL is materials research, where costly prototyping of high-performance materials is required. Here AL is used to narrow down the search space efficiently. Another application area is the classification of electronic health records, where unlabeled data is plentiful and labeling by experts is very costly.

Acquisition Functions

A major part of the AL research literature focuses on designing and learning acquisition functions. One straightforward research direction is the use of probabilistic models together with uncertainty measures such as entropy and its variations. Another approach leverages Reinforcement Learning (RL) techniques in acquisition function design: using RL, an agent learns through experience to prompt for informative labels. Adversarial neural networks have been used as a discriminator for the acquisition function, making use of unlabeled data to improve performance significantly. The use of the Bayesian framework in acquisition function design has also shown good performance.
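
As a concrete instance of the entropy-based direction, the sketch below ranks unlabeled samples by the Shannon entropy of the model's predicted class probabilities, H(p) = -sum_i p_i log p_i, and queries the highest-entropy ones first. It is a minimal illustration, assuming the model already outputs a probability vector per sample. The function names and the three example predictions are made up for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution.
    Zero-probability classes contribute nothing to the sum."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k=1):
    """Return the ids of the k samples with the highest predictive entropy.

    `predictions` maps a sample id to its list of class probabilities.
    """
    ranked = sorted(predictions,
                    key=lambda s: entropy(predictions[s]),
                    reverse=True)
    return ranked[:k]


# Hypothetical model outputs for three unlabeled samples.
predictions = {
    "a": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "b": [0.40, 0.35, 0.25],  # nearly uniform, high entropy
    "c": [0.70, 0.20, 0.10],  # in between
}
print(select_most_uncertain(predictions, k=2))  # ['b', 'c']
```

Sample "b" is queried first because the model is closest to guessing there, while the confidently predicted sample "a" is left unlabeled.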

A Few Research Directions

A long and diverse set of research directions exists in the AL literature. However, a few directions appear to be overlooked: a possible connection between AL and optimization problems, AL in regression problems, and eliminating the effects of bias in data through the AL framework. Given that AL algorithms learn quickly compared to conventional supervised learning methods, it would be interesting to see whether AL can be related to optimization techniques such as batch gradient descent. Investigating the effect of batching, and whether it has any relation to the order in which AL reveals labels, might be a useful exploratory approach. Some very recent work deals with bias in data and uses AL to improve model fairness by around 50%. Bias in data is not obvious to humans; it propagates to models and is difficult to handle. AL could make incremental improvements in the performance of fair ML models possible.
