What Data Scientists Should Know About Multi-output and Multi-label Training

Introduction

Multi-output learning subsumes many learning problems in multiple disciplines and deals with complex decision-making in many real-world applications. It has a multivariate nature and the multiple outputs may have complex interactions. Many strucurted inference problems have been architected by multi-output machine learning. The output values have diverse data types, depending on the type of ML problem.

For example,

0/1 based Binary output values can refer to a multi-label classification problem. Such examples also range from :

Nominal output values to a multi-dimensional classification problem
Ordinal output values to label ranking problem
Real-valued outputs to a multi-target regression problem.

Special Use Cases of Multi-output Learning

Multi-class Classification: Multi-class classification can be categorized as a traditional single-output learning paradigm when the output class is represented by the integer encoding. It can also be extended to a multi-output learning scenario if each output class is represented by the one-hot vector.

Fine-grained Classification: In this type of classification, though the vector representation is the same as fine-grained classification outputs to the multi-class classification outputs, their internal structures of the vectors are different. Labels under the same parent tend to have a closer relationship than the ones under different parents in the label hierarchy.

Multi-task Learning: Multi-task learning aims at learning multiple related tasks simultaneously, where each task outputs one single label, and learning multiple tasks is similar to learning multiple outputs. It leverages the relatedness between tasks to improve the performance of learning models. The major difference between multi-task learning and multi-output learning is that different tasks might be trained on different training sets or features in multi-task learning, while output variables usually share the same training data or features in multi-output learning.

In** Multi-output pattern recognition problems**, each instance in the dataset has two or more output values (nominal or real-valued)— i.e., the output value is a vector rather than a scalar. They are solved by any of the following methods:

By transforming the multi-label (or multi-output) into multiple single-output problems.
By adopting a pattern recognition algorithm so that it directly handles multi-output data.

#data-science #machine-learning #artificial-intelligence #python #multi-output-training

Introduction

Special Use Cases of Multi-output Learning

hackernoon.com

What Data Scientists Should Know About Multi-output and Multi-label Training