Human-in-the-loop machine learning (HITL)

June 6, 2022
Mario Grunitz

If we allow our imagination to run wild for a moment, ‘human-in-the-loop machine learning’ might conjure images of the beginning of the robot uprising, with humans drafted into an inferior role assisting machines to learn. The reality is very different and, of course, a lot less dramatic and dystopian.

Human-in-the-loop (HITL) machine learning is simply a way of improving the speed and accuracy with which an AI algorithm learns under certain conditions, combining human and artificial intelligence to build effective machine learning models. With HITL machine learning, humans are involved in both the training and testing stages of building a machine learning algorithm. This creates a continuous feedback loop that lets the machine produce better results with each iteration, learning more quickly and making more accurate decisions.

AI systems are good at learning to make optimal decisions when there is a large, high-quality dataset. In the real world, however, such datasets are rare, which often limits what machine learning can achieve. Human intelligence, on the other hand, is good at recognising patterns within small and poor-quality datasets. Combining these complementary strengths in a feedback loop enhances machine learning, and this is the purpose of HITL machine learning.

In short, HITL machine learning is a set of strategies for combining human and machine intelligence in applications that use AI, typically with a goal to:

  • Increase the accuracy of machine learning 
  • Achieve the target accuracy for a machine learning model faster
  • Combine human and machine intelligence to maximise accuracy
  • Assist human tasks with machine learning to increase efficiency


Why is it so useful to combine human and artificial intelligence in a feedback-driven machine learning process? On the surface, the answer is that AI processes data faster than humans and so learns very effectively from large, high-quality datasets, but not from small, low-quality ones. Human intelligence, on the other hand, is very capable of pattern recognition within small, low-quality datasets, so the HITL match seems obvious.

But the above reasoning fails to answer why this difference exists.

Human intelligence is capable of many skills that are largely unavailable to artificial intelligence at this time, such as creativity, imagination, and a compulsive need to create understanding: not just semantics, but abstract meaning. This makes it possible for a human, for example, to see a portion of the tail of a cat in an image and know it’s a cat. Artificial intelligence struggles with this kind of creative extrapolation at this stage, hence the need for AI learning to be built upon large datasets covering many possible representations of a cat, so that it can recognise a cat at an odd angle or when partially hidden.

HITL enables this learning process and knowledge transfer from human intelligence to artificial.

The HITL machine-learning process

The process of HITL machine learning can be broken down into two broad stages: training and testing.

Training (labelling) 

As implied above, because AI systems are effective when there is a large, high-quality dataset, HITL is best used to assist machine learning when datasets are small or of poor quality. In the first stage of HITL, training, humans label the training data, pairing each input with its expected output. Learning from labelled examples in this way is called supervised machine learning. The objective of training is to enable the algorithm to make accurate decisions when presented with new data.
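To make the training stage concrete, here is a minimal sketch of supervised learning from human-labelled examples. It assumes a deliberately simple nearest-centroid classifier; the feature values, the labels and the function names `train_centroids` and `predict` are all hypothetical and chosen only for illustration, not taken from any particular HITL tool.

```python
from collections import defaultdict
from math import dist


def train_centroids(labelled):
    """Average the feature vectors of each human-assigned label."""
    grouped = defaultdict(list)
    for features, label in labelled:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical human-labelled training data: (features, label) pairs.
labelled = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((3.0, 3.1), "dog"),
    ((3.2, 2.9), "dog"),
]

centroids = train_centroids(labelled)
prediction = predict(centroids, (0.9, 1.1))  # a new, unlabelled example
```

The human contribution here is the label attached to each feature vector; the algorithm only generalises from those pairs to new data.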

On the other hand, in unsupervised machine learning, unlabelled datasets are used. Under these circumstances, the algorithm is designed to find and define structure in the unlabelled data on its own. Here the human contribution comes mainly at the testing and evaluation stage rather than through labelling.

Testing and evaluation 

In both supervised and unsupervised HITL machine learning, the purpose of testing and evaluation is to allow humans to correct any inaccurate results the algorithm produces when presented with new data. There are broadly two categories of inaccurate decisions: those where the algorithm has low confidence in its result (edge cases), and those where the algorithm is confident but the result is incorrect.
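The two categories of inaccurate decisions can be handled differently, as the sketch below suggests. It assumes each prediction carries a confidence score between 0 and 1, that low-confidence items go to a human review queue, and that confident errors are caught by human spot checks. The `Prediction` type, the 0.8 threshold and the sample data are hypothetical.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    item_id: str
    label: str
    confidence: float  # the model's own estimate, between 0 and 1


def triage(predictions, threshold=0.8):
    """Split predictions into auto-accepted and human-review queues."""
    accepted = [p for p in predictions if p.confidence >= threshold]
    review = [p for p in predictions if p.confidence < threshold]
    return accepted, review


def confident_errors(accepted, spot_checks):
    """Return confident predictions that a human spot check contradicts."""
    return [
        p for p in accepted
        if p.item_id in spot_checks and spot_checks[p.item_id] != p.label
    ]


predictions = [
    Prediction("img-1", "cat", 0.97),
    Prediction("img-2", "dog", 0.55),  # low confidence: an edge case
    Prediction("img-3", "cat", 0.91),  # confident, but wrong in this example
]

accepted, review = triage(predictions)
spot_checks = {"img-1": "cat", "img-3": "dog"}  # hypothetical human answers
errors = confident_errors(accepted, spot_checks)
```

Low-confidence items are easy to find because the model flags them itself. Confident errors only surface when a human checks results the model did not flag.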

Active learning is the process in which the algorithm’s low-confidence results are passed to humans, who interpret them and feed the correct answers back to the machine. The purpose of testing and evaluation is to improve the algorithm’s decision-making so that it is ultimately not reliant on human intervention.

Combining the above processes (training, testing and evaluation) creates a continuous feedback loop between humans and the learning machine, improving the accuracy and consistency of the algorithm as more edge cases are identified and corrected. Over time the machine can even begin to analyse its own performance, identifying areas where it is less effective. This data is then sent to humans, improving the efficiency of feedback and the overall HITL machine learning process.
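The feedback loop described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a production design: the model is a one-dimensional nearest-centroid classifier, uncertainty is measured as the gap between the two closest centroids, and `ask_human` is a hypothetical stand-in for a real annotator. The seed data must contain at least two different labels for the gap to be defined.

```python
def fit(labelled):
    """Return the mean feature value per label (a 1-D nearest-centroid model)."""
    sums, counts = {}, {}
    for x, label in labelled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def margin(model, x):
    """Gap between the two closest centroids; a small gap means low confidence."""
    distances = sorted(abs(x - centroid) for centroid in model.values())
    return distances[1] - distances[0]


def hitl_loop(labelled, pool, ask_human, rounds):
    """Repeatedly retrain, then send the least certain item to a human."""
    labelled, pool = list(labelled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = fit(labelled)
        query = min(pool, key=lambda x: margin(model, x))
        pool.remove(query)
        labelled.append((query, ask_human(query)))  # human feedback re-enters training
    return fit(labelled), labelled


def ask_human(x):
    """Hypothetical annotator: stands in for a person labelling the item."""
    return "low" if x < 5 else "high"


seed = [(1.0, "low"), (9.0, "high")]  # small initial labelled set
pool = [2.0, 4.5, 5.5, 8.0]           # unlabelled items
model, labelled = hitl_loop(seed, pool, ask_human, rounds=2)
```

In this example the two items nearest the boundary between the classes are the ones sent to the human, so human effort is spent where the model is least certain.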

Pros and cons of HITL machine learning

The main advantage of HITL is rapid machine learning with high-quality results, even when datasets are small and/or of poor quality. This follows from the direct relationship between the quality of training data and the performance of machine learning: HITL improves the quality of the data, and this, in turn, improves the performance of machine learning. Data labelling combined with consistent feedback on the algorithm’s decisions enhances the machine learning process.

On the downside, however, data labelling and continuous feedback are costly and time-consuming manual processes. Labelling requires people to annotate and categorise image, text, audio, or other files. Whether this is done in-house or outsourced, it represents a significant cost, as does continuous human feedback. In practice, and to save costs, it is necessary to determine what level of confidence is acceptable for the automated process. If occasional wrong decisions are not detrimental, confidence thresholds can be set lower, which requires less human intervention and therefore reduces the cost of HITL machine learning.
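The trade-off between confidence threshold and review cost can be estimated with a short calculation. The sketch below assumes that every prediction below the threshold is sent to a human; the confidence scores and the `COST_PER_REVIEW` figure are invented for illustration only.

```python
def review_share(confidences, threshold):
    """Fraction of predictions that fall below the threshold and need a human."""
    flagged = sum(1 for c in confidences if c < threshold)
    return flagged / len(confidences)


# Hypothetical model confidences for a batch of eight predictions.
confidences = [0.99, 0.95, 0.90, 0.85, 0.70, 0.65, 0.60, 0.40]
COST_PER_REVIEW = 0.10  # hypothetical cost per human-reviewed item

for threshold in (0.9, 0.75, 0.5):
    share = review_share(confidences, threshold)
    cost = share * len(confidences) * COST_PER_REVIEW
    print(f"threshold={threshold}: {share:.0%} reviewed, cost {cost:.2f}")
```

Lowering the threshold shrinks the review queue and the cost, at the price of accepting more unchecked decisions, so the right setting depends on how harmful an occasional wrong decision is.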

In summary 

In a nutshell, human-in-the-loop (HITL) machine learning relies on human feedback to improve the quality of the data used to train machine learning models, and it accelerates learning through a continuous feedback loop between machines and humans.

While AI is competent at learning independently from large, high-quality datasets, such datasets are rare in the business world and very expensive to create. HITL overcomes this problem by leveraging the specific ability of human intelligence to recognise patterns in small and/or poor-quality datasets, thereby facilitating machine learning.

Mario Grunitz

Mario is a WeAreBrain Co-founder. With more than 15 years of experience in the tech space, he has worked all over Europe and held countless leadership positions in corporate, startup and agency spheres.
