Automated Adverse Drug Event (ADE) Detection from Text in Spark NLP with BioBert

Adverse Drug Reactions (ADRs) or Adverse Drug Events (ADEs) are potentially very dangerous to patients and are among the top causes of morbidity and mortality [1]. Many ADRs are hard to discover because they occur only in certain groups of people under certain conditions, and they may take a long time to surface. Healthcare providers conduct clinical trials to discover ADRs before the products are sold, but these trials are normally limited in number. Thus, post-market drug safety monitoring is required to help discover ADRs after the drugs are on the market [2].

Recently, unstructured data such as medical reports [3] and social network data [4] have been used to detect content that contains ADRs. Case reports published in the scientific biomedical literature are abundant and generated rapidly. Social networks are another abundant source of unstructured data. While an individual tweet or Facebook status that mentions an ADR may not be clinically useful, a large volume of such data can expose serious or previously unknown consequences.

Given the need to collect ADRs from sources that are not written in a structured manner (e.g. tweets, news, web forums) as well as from scientific literature (e.g. PubMed, arXiv, white papers, clinical trials), we wanted to build an end-to-end NLP pipeline that detects whether a text contains possible ADRs and extracts the ADR and Drug entities mentioned.
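To make the two stages concrete, here is a minimal sketch in plain Python of the shape such a pipeline takes: first decide whether a text may contain an ADR, then extract the Drug and ADR mentions. It is only a toy stand-in and not the Spark NLP or BioBERT implementation; the lexicons DRUG_TERMS and ADE_TERMS and the functions find_terms, classify, extract_entities, and run_pipeline are hypothetical names introduced for illustration, and simple keyword matching takes the place of trained classification and NER models.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicons (assumption): a real system would use trained
# classification and NER models instead of fixed word lists.
DRUG_TERMS = ("lipitor", "ibuprofen", "metformin")
ADE_TERMS = ("muscle pain", "nausea", "headache", "rash")


@dataclass
class Entity:
    text: str   # surface form as it appears in the input
    label: str  # "DRUG" or "ADE"
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


def find_terms(text, terms, label):
    """Return every whole-word, case-insensitive match of the given terms."""
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append(Entity(match.group(0), label, match.start(), match.end()))
    return found


def classify(text):
    """Stage 1 (toy rule): flag a text if it mentions both a drug and an ADE term."""
    return bool(find_terms(text, DRUG_TERMS, "DRUG")) and bool(
        find_terms(text, ADE_TERMS, "ADE")
    )


def extract_entities(text):
    """Stage 2: collect Drug and ADE mentions, ordered by position in the text."""
    entities = find_terms(text, DRUG_TERMS, "DRUG") + find_terms(text, ADE_TERMS, "ADE")
    return sorted(entities, key=lambda entity: entity.start)


def run_pipeline(text):
    """Run classification first; extract entities only from flagged texts."""
    flagged = classify(text)
    return {
        "text": text,
        "possible_ade": flagged,
        "entities": extract_entities(text) if flagged else [],
    }


if __name__ == "__main__":
    for sample in (
        "I took Lipitor and got muscle pain after a week.",
        "The weather is nice today.",
    ):
        print(run_pipeline(sample))
```

Running classification before extraction means that the large volume of irrelevant posts is filtered out cheaply, and only texts flagged as possible ADR reports reach the entity extraction stage.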


towardsdatascience.com
