An Artificial Intelligence (AI) system is only as good as its training. For Machine Learning (ML) and Deep Learning (DL) frameworks, the training datasets are a crucial element in how the system will operate: feed a model skewed or biased information and it will produce a flawed inference engine.

MIT recently removed a dataset that had been popular with AI developers. The training set, 80 Million Tiny Images, was scraped from Google in 2008 and used to train AI software to identify objects. It consists of images labeled with descriptions. During the learning phase, an AI system ingests the dataset and ‘learns’ how to classify images. The problem is that many of the images are questionable and many of the labels are inappropriate: women are described with derogatory terms, body parts are identified with offensive slang, and racial slurs are sometimes used to label people from minority groups. Training on such data should never be allowed.
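To make the learning phase concrete, here is a minimal sketch (in PyTorch, which the article itself does not specify) of a small classifier being trained on labelled 32x32 images, the format Tiny Images uses. Everything in it is a placeholder: random tensors stand in for the images, and a made-up four-word vocabulary stands in for the dataset's labels. The point it illustrates is that the model simply fits whatever labels it is given, so offensive labels become offensive predictions.

```python
# Hypothetical sketch only -- not the actual Tiny Images pipeline.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder stand-in data: 512 random 32x32 RGB "images" with labels
# drawn from a fixed vocabulary supplied by the dataset's creators.
labels = ["dog", "cat", "car", "person"]        # placeholder label vocabulary
images = torch.rand(512, 3, 32, 32)             # fake image tensors
targets = torch.randint(len(labels), (512,))    # fake label indices

loader = DataLoader(TensorDataset(images, targets), batch_size=64, shuffle=True)

# A tiny CNN classifier over 32x32 inputs.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, len(labels)),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# The learning phase: the model is pushed to reproduce the given labels,
# whatever they happen to say about the people in the images.
for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()

# At inference time the model can only answer in the vocabulary it was taught.
prediction = labels[model(images[:1]).argmax(dim=1).item()]
print(prediction)
```

Nothing in the training loop inspects what the labels mean; that is exactly why a dataset with slurs in its label vocabulary yields a model that repeats them.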
