The problem

The vast growth in technology and social media has revolutionized our lives by transforming the way we connect and communicate. Unfortunately, the darker side of this development has exposed a lot of children and teenagers from various ages to become victims of online sexual abuse.

To help combat the severity of the problem, I joined an Omdena project together with the Zero Abuse Project. Among 45 Omdena collaborators from across 6 continents, the goal was to build AI models to identify patterns in the behavior of institutions when they cover-up incidents of sexual abuse.

The identification and analysis of sexual crimes assure public safety and has been made possible by leveraging AI. Natural Language Processing and various machine learning techniques have played a major in the successful identification of online sexual abuse.

Classifying online chats

The main idea of this task was to classify online chats between two persons as sexual abuse or non-sexual abuse. We planned in implementing this by using text mining and deep learning techniques such as LSTM-RNN. In the following example, our idea aimed at classifying the chats as predatory or non-predatory.

Image for post

Let’s dive into it!

We have used the open-source PAN2012 dataset provided in the context of the Sexual Predator Identification (SPI) Task in 2012 initiated by PAN (Plagiarism analysis, authorship identification, and near-duplicate detection) lab. However, the realistic data provided by PAN has a high noise level with unbalanced training samples and varying length of conversations.

The challenging part of this dataset was in changing the chat text abbreviations and cyber slang texts such as “u” for “you”, “ur” for “your” and “l8r” for “later”. Such words are necessary for feature selection and for improving the performance of the model used for the classification.

#artificial-intelligence #sexual-abuse #cybersecurity #sexual-harassment #sexual-assault

An Attempt to Identify Cybersex Crimes through Artificial Intelligence
1.10 GEEK