Artificial Intelligence and Machine Learning are awesome. They allow our mobile assistants to understand our voices and book us an Uber. AI and Machine Learning systems recommend books on Amazon similar to the ones we’ve liked in the past. They might even find us an amazing match in a dating application and help us meet the love of our life.

All of these are cool but relatively harmless applications of AI: if your voice assistant doesn’t understand you, you can just open the Uber application and order a car yourself. If Amazon recommends a book you might not like, a little research will let you discard it. If an app takes you on a blind date with someone who is not a good match for you, you might even end up having a good time meeting somebody whose personality surprises you.

Things get rough, however, when AI is used for more serious tasks like filtering job candidates, granting loans, accepting or rejecting insurance requests, or even making medical diagnoses. All of these decisions, whether partially assisted or completely handled by AI systems, can have a tremendous impact on somebody’s life.

For these kinds of tasks, the data fed into the Machine Learning systems that sit at the core of these AI applications has to be conscientiously studied, trying to avoid the use of information proxies: pieces of data used as stand-ins for other information that would be more legitimate and precise for a certain task but that is not available.

Take the example of car insurance requests that are automated by a machine learning system: an excellent driver who lives in a poor and badly regarded area could have a car insurance request rejected if ZIP code is used as a variable in the model instead of pure driving and payment metrics, as the sketch below illustrates.
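As a rough sketch of what "avoiding a proxy" can look like in practice, the snippet below uses a toy, made-up dataset and hypothetical column names: the ZIP code column is dropped before training, so the model can only rely on driving and payment metrics.

```python
# Minimal sketch: keep the proxy variable (zip_code) out of the model.
# The dataset, column names, and model choice are illustrative assumptions,
# not a real insurance pipeline.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical insurance-application data.
df = pd.DataFrame({
    "zip_code":       ["10001", "60629", "94105", "60629"],
    "years_driving":  [12, 8, 3, 15],
    "past_accidents": [0, 0, 2, 1],
    "late_payments":  [0, 1, 3, 0],
    "approved":       [1, 1, 0, 1],
})

# Keep only driving and payment metrics; the geographic proxy never
# reaches the model.
features = df.drop(columns=["zip_code", "approved"])
target = df["approved"]

model = LogisticRegression()
model.fit(features, target)
```

Of course, dropping one column is not a full fairness audit (other features can still correlate with geography), but it captures the basic idea of preferring direct, legitimate signals over proxies.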

Aside from these proxies, AI systems also depend on their training data in another way: training on non-representative samples of a population, or on data that has been labelled with some sort of bias, reproduces that same bias in the resulting system. A couple of quick checks on the training data, sketched below, can surface the most obvious problems.
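There is no single test for this, but the sketch below, using made-up group names and numbers, shows two simple sanity checks: whether each group appears in the sample in proportion to an assumed population share, and whether the labels carry very different positive rates per group, which may hint at labelling bias.

```python
# Two quick sanity checks on training data (toy, made-up example).
import pandas as pd

train = pd.DataFrame({
    "group": ["A", "A", "A", "B", "A", "B", "A", "A"],
    "label": [1,   1,   0,   0,   1,   0,   1,   1],
})

# 1) Representation: sample share per group vs. an assumed population share.
sample_share = train["group"].value_counts(normalize=True)
population_share = pd.Series({"A": 0.5, "B": 0.5})  # assumed, for illustration
print("Sample share:\n", sample_share)
print("Gap vs population:\n", sample_share - population_share)

# 2) Label skew: positive-label rate per group.
print("Positive rate per group:\n", train.groupby("group")["label"].mean())
```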

Let’s look at some examples of bias in AI systems.

Tay: The Offensive Twitter Bot


Tay (short for "Thinking about you") was a Twitter Artificial Intelligence chatbot designed to mimic the language patterns of a 19-year-old American girl. It was developed by Microsoft in 2016 under the username TayandYou and was put on the platform with the intention of engaging in conversations with other users, and even uploading images and memes from the internet.

After 16 hours and 96,000 tweets it had to be shut down, as it began to post inflammatory and offensive tweets, despite having been hard-coded with a list of topics to avoid. Because the bot learned from the conversations it had, when users who interacted with it started tweeting politically incorrect phrases, it picked up those patterns and began posting inflammatory messages about those topics.

