Uncovering Societal Bias in NLP Transfer Learning

Speakers: Benjamin Ajayi-Obe, David Hopes

Summary

The growing popularity of large pre-trained language models has led to their wider adoption in commercial settings. However, these models are usually pre-trained on raw, unfiltered corpora that are known to contain many societal biases. In this talk, we explore the sources of this bias and recent methods for measuring and mitigating it.

Description

Since the publication of Google’s seminal paper, “Attention Is All You Need”, attention-based transformers have become widely celebrated and adopted for their impressive ability to generate human-like text. However, it has become increasingly evident that, while these models are very capable of modelling text from a large corpus, they also embed the societal biases present in that data. These biases are difficult to detect unless they are deliberately tested for or documented, so they pose a real risk to organisations that wish to use state-of-the-art NLP models, particularly those with limited budgets for retraining them.

This talk is for anyone who wishes to deepen their understanding of attention-based transformers from an ethical standpoint, and for those looking to deploy attention-based models in a commercial setting. You will leave with a better understanding of the types of bias that affect attention-based models, the sources of this bias, and potential strategies for mitigating it. We presume the audience has a high-level understanding of neural networks and some knowledge of linear algebra.

The first 15 minutes will be a discussion of the types of bias that pose a risk to these models, along with some demonstrations of biased outputs. The second 15 minutes will explore strategies for detecting and mitigating these biases.

Benjamin Ajayi-Obe's Bio
I am a data scientist in the ranking and recommendation team at Depop. I am interested in the development and application of NLP models for commercial use. I am also interested in the ethical implications of deploying AI solutions in the real world, and in exploring ways of ensuring fairness, equity and safety in a society that is increasingly adopting ML.

GitHub: https://github.com/BenAjayiObe/

David Hopes's Bio
Data scientist and ethical AI ambassador at Depop, currently focusing on ML solutions for the marketing technology team. Research interests in computational pragmatics and context in communication.

GitHub: https://github.com/davidhopes/

#nlp #learning #pydata
