Towards a fully automated active learning pipeline

In my previous post, I gave a short introduction to the theory and methods of active learning. The next step is the implementation. In this post, I will share my journey towards a fully automated active learning pipeline.

Step 1 — Motivation to implement

At first, like many motivated algorithm developers, I started by implementing the chosen active learning method right away, without thinking about the next steps or the “bigger picture”. On one hand, this got me to a working (engineering-wise) implementation quickly. On the other hand, it made the next steps harder. In my case I had two parallel next steps, which raised three questions:

  1. How do I build an active learning pipeline?
  2. How do I build the pipeline in a modular and generic way?
  3. How do I incorporate multiple different tasks into the pipeline?

Step 2 — active learning pipeline — semi-automatic

My first active learning pipeline implementation was semi-automatic: each cycle runs fully automatically once started, but every cycle has to be launched by hand. The main addition that makes this possible is the Data Selector.

Data Selector

The Data Selector encapsulates the purpose of active learning: choosing the next images to be annotated in an informed way. Its input is the set of currently non-annotated data and its output is a subset of that data to be annotated.

The Data Selector can be an additional neural network, classic algorithm, database query, or any other method that works for you.
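As a rough illustration of how interchangeable selectors could share one interface, here is a minimal sketch. The names `DataSelector`, `RandomSelector`, `EntropySelector`, and the `predict_proba` callable are hypothetical and not taken from the original implementation; entropy-based uncertainty sampling is used only as one familiar example of a model-based selector.

```python
import math
import random
from typing import Callable, List, Protocol, Sequence


class DataSelector(Protocol):
    """Hypothetical interface: non-annotated item ids in, a subset to annotate out."""

    def select(self, unlabeled: Sequence[str], budget: int) -> List[str]:
        ...


class RandomSelector:
    """Baseline selector: picks items uniformly at random."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def select(self, unlabeled: Sequence[str], budget: int) -> List[str]:
        return self._rng.sample(list(unlabeled), min(budget, len(unlabeled)))


class EntropySelector:
    """Model-based selector: picks the items with the highest predictive entropy."""

    def __init__(self, predict_proba: Callable[[str], Sequence[float]]) -> None:
        # predict_proba is an assumed callable: item id -> class probabilities.
        self._predict_proba = predict_proba

    def _entropy(self, item: str) -> float:
        return -sum(p * math.log(p) for p in self._predict_proba(item) if p > 0)

    def select(self, unlabeled: Sequence[str], budget: int) -> List[str]:
        ranked = sorted(unlabeled, key=self._entropy, reverse=True)
        return ranked[:budget]


# Usage with made-up probabilities standing in for a real model.
fake_probs = {"a.png": [0.5, 0.5], "b.png": [0.99, 0.01], "c.png": [0.7, 0.3]}
selector = EntropySelector(lambda item: fake_probs[item])
chosen = selector.select(list(fake_probs), budget=2)
```

Because both classes expose the same `select` method, the rest of the pipeline would not need to know which kind of selector it is calling.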

At each cycle, the Data Selector relies on the best model from the previous cycle. The subset it outputs is annotated and then added to the previous cycle’s training set.
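The data flow of a single cycle can be sketched as follows. This is only an illustration under assumptions: `run_cycle`, `best_model`, and `annotate` are hypothetical names, least-confidence sampling stands in for whatever selection method is actually used, and retraining on the enlarged training set is left out.

```python
from typing import Callable, Dict, Sequence, Set, Tuple

# Assumed model signature: item id -> class probabilities.
Model = Callable[[str], Sequence[float]]


def run_cycle(
    train_set: Dict[str, int],
    unlabeled: Set[str],
    best_model: Model,
    annotate: Callable[[str], int],
    budget: int,
) -> Tuple[Dict[str, int], Set[str]]:
    """Select with the previous cycle's best model, annotate, extend the training set."""
    # Least-confidence ranking: a low top probability means an uncertain prediction.
    # The item id is a tie-breaker so the result is deterministic.
    ranked = sorted(unlabeled, key=lambda item: (max(best_model(item)), item))
    selected = ranked[:budget]

    # Annotate the selected items and add them to a copy of the previous training set.
    new_train_set = dict(train_set)
    for item in selected:
        new_train_set[item] = annotate(item)

    remaining = unlabeled - set(selected)
    return new_train_set, remaining


# Usage with made-up probabilities and a dummy annotator that always returns label 0.
probs = {"img_1": [0.9, 0.1], "img_2": [0.55, 0.45], "img_3": [0.6, 0.4]}
train, pool = run_cycle(
    train_set={"img_0": 1},
    unlabeled=set(probs),
    best_model=lambda item: probs[item],
    annotate=lambda item: 0,
    budget=2,
)
```

The returned training set would feed the next training run, and the best model from that run would become `best_model` for the following cycle.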

Why is semi-automatic not good enough?

Besides the time-consuming overhead of executing each cycle manually, a semi-automatic pipeline requires monitoring. By monitoring, I mean that we need to remember which cycle to run next, know where the state of the previous cycle was saved, manually choose the inference model from the previous cycle, and more. This process is prone to errors, bugs, and confusion. Ideally we want a fully automatic pipeline, which is the topic of the next section.
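To make the bookkeeping above concrete, here is a minimal sketch of the per-cycle state that otherwise has to be tracked by hand, persisted as a small JSON file. The `CycleState` fields and the helper names are assumptions for illustration, not part of the original pipeline.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class CycleState:
    """Hypothetical record of what a person would otherwise have to remember."""

    cycle: int              # index of the cycle that just finished
    best_model_path: str    # model the Data Selector should use next
    train_set_path: str     # training set produced by this cycle


def save_state(state: CycleState, path: Path) -> None:
    """Write the state as JSON so the next cycle can pick it up."""
    path.write_text(json.dumps(asdict(state), indent=2))


def load_state(path: Path) -> CycleState:
    """Read back the state written by save_state."""
    return CycleState(**json.loads(path.read_text()))


def advance(prev: CycleState, best_model_path: str, train_set_path: str) -> CycleState:
    """Build the state for the following cycle from the previous one."""
    return CycleState(prev.cycle + 1, best_model_path, train_set_path)
```

With something like this in place, each manual run could start by calling `load_state` instead of relying on memory, which removes one source of mistakes even before the pipeline is fully automated.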
