How to create a QA System on your own (private) data with cdQA-suite
The history of Machine Comprehension (MC) goes back to the first concepts in Artificial Intelligence (AI). The brilliant Alan Turing proposed in his famous article “Computing Machinery and Intelligence” what is now called the Turing test as a criterion of intelligence. Almost 70 years later, Question Answering (QA), a sub-domain of MC, is still one of the most difficult tasks in AI.
However, since last year, the field of Natural Language Processing (NLP) has evolved quickly thanks to developments in Deep Learning research and the advent of Transfer Learning techniques. Powerful pre-trained NLP models such as OpenAI-GPT, ELMo, BERT and XLNet have been made available by the best researchers of the domain.
With such progress, several improved systems and applications for NLP tasks are expected to come out. One such system is the cdQA-suite, a package developed by some colleagues and me in a partnership between Telecom ParisTech, a French engineering school, and BNP Paribas Personal Finance, a European leader in financing for individuals.
Open-domain QA vs. closed-domain QA
When we think about QA systems, we should be aware of two different kinds: open-domain QA (ODQA) systems and closed-domain QA (CDQA) systems.
Open-domain systems deal with questions about nearly anything, and can only rely on general ontologies and world knowledge. One example of such a system is DrQA, an ODQA developed by Facebook Research that uses a large base of articles from Wikipedia as its source of knowledge. As these documents are related to several different topics and subjects we can understand why this system is considered an ODQA.
On the other hand, closed-domain systems deal with questions under a specific domain (for example, medicine or automotive maintenance), and can exploit domain-specific knowledge by using a model that is fitted to a unique-domain database. The cdQA-suite was built to let anyone build a closed-domain QA system easily.
cdQA-suite
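The cdQA-suite is composed of three blocks:
• cdQA: a Python package implementing the QA pipeline
• cdQA-annotator: a web-based tool for annotating question-answer pairs
• cdQA-ui: a web-based user interface that connects to the cdQA back-end
I will explain how each module works and how you can use it to build a QA system on your own data.
cdQA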
The cdQA architecture is based on two main components: the Retriever and the Reader. You can see below a schema of the system mechanism.
Mechanism of cdQA pipeline
When a question is sent to the system, the Retriever selects a list of documents in the database that are the most likely to contain the answer. It is based on the same retriever as DrQA, which creates TF-IDF features based on uni-grams and bi-grams and computes the cosine similarity between the question sentence and each document of the database.
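The retrieval step described above can be illustrated with a minimal, standard-library sketch. This is not the cdQA or DrQA implementation: the tokenizer, the smoothed IDF formula, the choice to include the question when computing document frequencies, and the names `ngrams`, `tfidf_vectors`, `cosine` and `retrieve` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def ngrams(text):
    """Return the lowercase uni-grams and bi-grams of a text."""
    tokens = re.findall(r"\w+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def tfidf_vectors(texts):
    """Build one sparse TF-IDF vector (a dict) per text."""
    counts = [Counter(ngrams(t)) for t in texts]
    n_texts = len(texts)
    doc_freq = Counter(term for c in counts for term in c)
    # Smoothed IDF (an assumption): frequent terms keep a small positive weight
    idf = {term: math.log((1 + n_texts) / (1 + freq)) + 1
           for term, freq in doc_freq.items()}
    return [{term: tf * idf[term] for term, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, documents, top_n=2):
    """Return the indices of the top_n documents most similar to the question."""
    vectors = tfidf_vectors(documents + [question])
    doc_vecs, question_vec = vectors[:-1], vectors[-1]
    scores = [cosine(question_vec, d) for d in doc_vecs]
    order = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return order[:top_n]


# Toy documents, invented for this example
documents = [
    "The bank opened a new branch in Paris last month",
    "Engine oil should be changed every ten thousand kilometres",
    "The new branch offers personal finance services",
]
ranked = retrieve("Where did the bank open a new branch?", documents)
print(ranked)  # [0, 2]
```

The first document wins because it shares both uni-grams ("bank", "branch") and bi-grams ("the bank", "new branch") with the question, which is the effect the bi-gram features are meant to capture.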
After selecting the most probable documents, the system divides each document into paragraphs and sends them with the question to the Reader, which is basically a pre-trained Deep Learning model. The model used was the PyTorch version of the well-known NLP model BERT, which was made available by HuggingFace. Then, the Reader outputs the most probable answer it can find in each paragraph. After the Reader, there is a final layer in the system that compares the answers by using an internal score function and outputs the most likely one according to the scores.
Using the cdQA python package
Before using the package, let's install it. You can install it with pip install cdqa, but for this tutorial I will install it from source so that I can run a script that downloads the pre-trained models and the BNP dataset (a dataset of articles extracted from their public news webpage).
# Setting up cdQA package
git clone https://github.com/cdqa-suite/cdQA.git && cd cdQA && pip install .
# Download models and BNP dataset
python download.py
Now, you can open a jupyter notebook and follow the steps below to see how cdQA works:
You should have something like the following as output:
The output of a QAPipeline prediction
Notice that the system outputs not only an answer, but also the paragraph where the answer was found and the title of the document / article.
In the snippet above, the preprocessing / filtering steps were needed to transform the BNP Paribas dataframe to the following structure:
Structure of the Dataframe that should be sent to cdQA pipeline
If you use your own dataset, please be sure that your dataframe has such structure.
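Since the figure is not reproduced here, the following sketch shows one way to check your data before building the dataframe. It assumes the expected structure is one row per document with a `title` string and a `paragraphs` list of strings; that column layout, the sample records, and the helper name `check_records` are assumptions for illustration, not part of the cdQA API.

```python
def check_records(records):
    """Raise ValueError unless every record has a string title and a list of string paragraphs."""
    for i, record in enumerate(records):
        if not isinstance(record.get("title"), str):
            raise ValueError(f"record {i}: 'title' must be a string")
        paragraphs = record.get("paragraphs")
        if not isinstance(paragraphs, list) or not all(isinstance(p, str) for p in paragraphs):
            raise ValueError(f"record {i}: 'paragraphs' must be a list of strings")
    return True


# Hypothetical documents, one dict per article
records = [
    {"title": "First article", "paragraphs": ["Paragraph 1 of the article.", "Paragraph 2 of the article."]},
    {"title": "Second article", "paragraphs": ["Only paragraph of the article."]},
]
print(check_records(records))  # True
```

Once the records pass the check, a call such as pandas.DataFrame(records) would turn them into a dataframe with one column per key.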
When using the CPU version of the model, each prediction takes between 10 and 20 seconds. This moderate execution time is due to the BERT Reader, which is a very large deep learning model (~110M parameters). If you have a GPU, you can directly use the GPU version of the model, models/bert_qa_vGPU-sklearn.joblib. These pre-trained models are also available on the releases page of the cdQA GitHub repository: https://github.com/cdqa-suite/cdQA/releases
You can also improve the performance of the pre-trained Reader, which was pre-trained on the SQuAD 1.1 dataset. If you have an annotated dataset (which can be generated with the help of the cdQA-annotator) in the same format as the SQuAD dataset, you can fine-tune the Reader on it:
# Put the path to your json file in SQuAD format here
path_to_data = './data/SQuAD_1.1/train-v1.1.json'
cdqa_pipeline.fit_reader(path_to_data)
Please be aware that such fine-tuning should be performed on a GPU, as the BERT model is too large to be trained on a CPU.
You can also check out other ways to do the same steps on the official tutorials: https://github.com/cdqa-suite/cdQA/tree/master/examples
cdQA-annotator
In order to facilitate the data annotation, the team has built a web-based application, the cdQA-annotator.
In order to use it, you should have your dataset transformed to a JSON file with SQuAD-like format:
from cdqa.utils.converters import df2squad
# Converting dataframe to SQuAD format
json_data = df2squad(df=df, squad_version='v1.1', output_dir='.', filename='dataset-name.json')
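To make the target format concrete, here is a small sketch of what a SQuAD v1.1-style JSON structure looks like when built by hand. It is not the implementation of df2squad; the helper names `to_squad_like` and `add_answer`, the question id, and the sample record are assumptions for illustration.

```python
import json


def to_squad_like(records, version="v1.1"):
    """Wrap documents (title + list of paragraphs) in a SQuAD-style dictionary with empty question lists."""
    data = []
    for record in records:
        paragraphs = [{"context": text, "qas": []} for text in record["paragraphs"]]
        data.append({"title": record["title"], "paragraphs": paragraphs})
    return {"version": version, "data": data}


def add_answer(paragraph, question, answer_text, question_id):
    """Attach one annotated question-answer pair to a paragraph entry."""
    start = paragraph["context"].find(answer_text)
    if start == -1:
        raise ValueError("the answer must be a span of the paragraph context")
    paragraph["qas"].append({
        "id": question_id,
        "question": question,
        # SQuAD stores the answer text and its character offset in the context
        "answers": [{"text": answer_text, "answer_start": start}],
    })


records = [{
    "title": "cdQA",
    "paragraphs": ["The cdQA architecture is based on two main components: the Retriever and the Reader."],
}]
squad = to_squad_like(records)
add_answer(
    squad["data"][0]["paragraphs"][0],
    question="What are the two main components of cdQA?",
    answer_text="the Retriever and the Reader",
    question_id="example-0",
)
print(json.dumps(squad, indent=2))
```

An annotation tool essentially fills the `qas` lists for you, which is why the dataset only needs titles and paragraph contexts before annotation starts.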
Now you can install the annotator and run it:
# Clone the repo
git clone https://github.com/cdqa-suite/cdQA-annotator
# Install dependencies
cd cdQA-annotator
npm install
# Start development server
cd src
vue serve
Now you can go to http://localhost:8080/ and after loading your JSON file you will see something like this:
To start annotating question-answer pairs you just need to write a question, highlight the answer with the mouse cursor (the answer will be written automatically), and then click on
Annotating question-answer pairs with cdQA-annotator
After the annotation, you can download it and use it to fine-tune the BERT Reader on your own data as explained in the previous section.
cdQA-ui
The team has also provided a web-based user interface to couple with cdQA. In this section, I will describe how you can use the UI linked to the back-end of
First, you have to deploy a
cdQA REST API by executing on your shell (be sure you run it on
export dataset_path='path-to-dataset.csv'
export reader_path='path-to-reader-model'
FLASK_APP=api.py flask run -h 0.0.0.0
Second, you should proceed to the installation of the cdQA-ui package:
git clone https://github.com/cdqa-suite/cdQA-ui && cd cdQA-ui && npm install
Then, you start the development server:
npm run serve
You can now access the web application on http://localhost:8080/. You will see something like the figure below:
Web application of cdQA-ui
As the application is well connected to the back-end, via the REST API, you can ask a question and the application will display an answer, the passage context where the answer was found and the title of the article:
Demonstration of the web application running
Inserting the interface in a web-site
If you want to couple the interface to your website, you just need to do the following imports in your Vue app:
import Vue from 'vue'
import CdqaUI from 'cdqa-ui'
import BootstrapVue from "bootstrap-vue"
import "bootstrap/dist/css/bootstrap.css"
import "bootstrap-vue/dist/bootstrap-vue.css"
// Register the cdQA interface and its Bootstrap dependency
Vue.use(CdqaUI)
Vue.use(BootstrapVue)
Then you insert the cdQA interface component:
Demo
You can also check out a demo of the application on the official website: https://cdqa-suite.github.io/cdQA-website/#demo
Conclusion
In this article, I presented cdQA-suite, a software suite for the deployment of an end-to-end Closed Domain Question Answering System.
If you are interested in learning more about the project, feel free to check out the official GitHub repository: https://github.com/cdqa-suite. Do not hesitate to star and to follow the repositories if you liked the project and consider it valuable for you and your applications.
In this video, Deep Learning Tutorial with Python | Machine Learning with Neural Networks Explained, Frank Kane helps de-mystify the world of deep learning and artificial neural networks with Python!
Explore the full course on Udemy (special discount included in the link): http://learnstartup.net/p/BkS5nEmZg
In less than 3 hours, you can understand the theory behind modern artificial intelligence, and apply it with several hands-on examples. This is machine learning on steroids! Find out why everyone’s so excited about it and how it really works – and what modern AI can and cannot really do.
In this course, we will cover:
• Deep Learning Prerequisites (gradient descent, autodiff, softmax)
• The History of Artificial Neural Networks
• Deep Learning in the Tensorflow Playground
• Deep Learning Details
• Introducing Tensorflow
• Using Tensorflow
• Introducing Keras
• Using Keras to Predict Political Parties
• Convolutional Neural Networks (CNNs)
• Using CNNs for Handwriting Recognition
• Recurrent Neural Networks (RNNs)
• Using an RNN for Sentiment Analysis
• The Ethics of Deep Learning
• Learning More about Deep Learning
At the end, you will have a final challenge to create your own deep learning / machine learning system to predict whether real mammogram results are benign or malignant, using your own artificial neural network you have learned to code from scratch with Python.
Separate the reality of modern AI from the hype – by learning about deep learning, well, deeply. You will need some familiarity with Python and linear algebra to follow along, but if you have that experience, you will find that neural networks are not as complicated as they sound. And how they actually work is quite elegant!
This is a hands-on tutorial with real code you can download, study, and run yourself.
This complete Machine Learning full course video covers all the topics that you need to know to become a master in the field of Machine Learning.
Machine Learning Full Course | Learn Machine Learning | Machine Learning Tutorial
It covers all the basics of Machine Learning (01:46), the different types of Machine Learning (18:32), and the various applications of Machine Learning used in different industries (04:54:48). This video will help you learn different Machine Learning algorithms in Python. Linear Regression (23:38), Logistic Regression (55:45), K-Means Clustering (01:26:20), Decision Tree (02:15:15), and Support Vector Machines (03:48:31) are some of the important algorithms you will understand with a hands-on demo. Finally, you will see the essential skills required to become a Machine Learning Engineer (04:59:46) and come across a few important Machine Learning interview questions (05:09:03). Now, let's get started with Machine Learning.
Below topics are explained in this Machine Learning course for beginners:
Basics of Machine Learning - 01:46
Why Machine Learning - 09:18
What is Machine Learning - 13:25
Types of Machine Learning - 18:32
Supervised Learning - 18:44
Reinforcement Learning - 21:06
Supervised VS Unsupervised - 22:26
Linear Regression - 23:38
Introduction to Machine Learning - 25:08
Application of Linear Regression - 26:40
Understanding Linear Regression - 27:19
Regression Equation - 28:00
Multiple Linear Regression - 35:57
Logistic Regression - 55:45
What is Logistic Regression - 56:04
What is Linear Regression - 59:35
Comparing Linear & Logistic Regression - 01:05:28
What is K-Means Clustering - 01:26:20
How does K-Means Clustering work - 01:38:00
What is Decision Tree - 02:15:15
How does Decision Tree work - 02:25:15
Random Forest Tutorial - 02:39:56
Why Random Forest - 02:41:52
What is Random Forest - 02:43:21
How does Decision Tree work - 02:52:02
K-Nearest Neighbors Algorithm Tutorial - 03:22:02
Why KNN - 03:24:11
What is KNN - 03:24:24
How do we choose 'K' - 03:25:38
When do we use KNN - 03:27:37
Applications of Support Vector Machine - 03:48:31
Why Support Vector Machine - 03:48:55
What is Support Vector Machine - 03:50:34
Advantages of Support Vector Machine - 03:54:54
What is Naive Bayes - 04:13:06
Where is Naive Bayes used - 04:17:45
Top 10 Application of Machine Learning - 04:54:48
How to become a Machine Learning Engineer - 04:59:46
Machine Learning Interview Questions - 05:09:03
This post talks about the differences and relationship between Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL).
**In this tutorial, you will learn:**
Artificial intelligence is imparting a cognitive ability to a machine. The benchmark for **AI** is human intelligence regarding reasoning, speech, and vision. This benchmark is far off in the future.
AI has three different levels:
Early AI systems used pattern matching and expert systems.
What is ML?
Machine learning is the best tool so far to analyze, understand and identify patterns in data. One of the main ideas behind machine learning is that the computer can be trained to automate tasks that would be exhausting or impossible for a human being. The clear break from traditional analysis is that machine learning can make decisions with minimal human intervention.
Machine learning uses data to feed an algorithm that can understand the relationship between the input and the output. When the machine has finished learning, it can predict the value or the class of a new data point.
What is Deep Learning?
Deep learning is computer software that mimics the network of neurons in a brain. It is a subset of machine learning and is called deep learning because it makes use of deep neural networks. The machine uses *different* layers to *learn* from the data. The depth of the model is represented by the number of layers in the model. Deep learning is the new state of the art in terms of AI. In deep learning, the learning phase is done through a neural network. A neural network is an architecture where the layers are stacked on top of each other.
Machine Learning Process
Imagine you are meant to build a program that recognizes objects. To train the model, you will use a classifier. A classifier uses the features of an object to try to identify the class it belongs to.
In the example, the classifier will be trained to detect if the image is a:
The four objects above are the classes the classifier has to recognize. To construct a classifier, you need to have some data as input and assign a label to it. The algorithm will take these data, find a pattern and then classify each item in the corresponding class.
This task is called supervised learning. In supervised learning, the training data you feed to the algorithm includes a label.
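As a minimal sketch of supervised learning, the snippet below uses a 1-nearest-neighbor rule: every training example carries a label, and a new point receives the label of the closest training example. The choice of algorithm, the two-number features and the labels are assumptions picked for brevity, not the method used in this tutorial.

```python
def nearest_neighbor_predict(train_features, train_labels, new_point):
    """Return the label of the training example closest to new_point."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    closest = min(
        range(len(train_features)),
        key=lambda i: squared_distance(train_features[i], new_point),
    )
    return train_labels[closest]


# Hypothetical labeled training data: (number of wheels, length in metres)
train_features = [(4, 4.5), (4, 4.2), (2, 1.8), (2, 1.7)]
train_labels = ["car", "car", "not a car", "not a car"]

prediction = nearest_neighbor_predict(train_features, train_labels, (4, 4.0))
print(prediction)  # car
```

The labels are what make this supervised: without them, the algorithm could group similar points but could not name the groups.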
Training an algorithm requires following a few standard steps:
The first step is essential: choosing the right data will make the algorithm a success or a failure. The data you choose to train the model on are called features. In the object example, the features are the pixels of the images.
Each image is a row in the data while each pixel is a column. If your image is 28x28 pixels, the dataset contains 784 columns (28x28). In the picture below, each picture has been transformed into a feature vector. The label tells the computer what object is in the image.
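The row-per-image layout can be sketched in a few lines of plain Python. The image here is a made-up 28x28 grid of pixel intensities, and the helper name `flatten` is only illustrative.

```python
def flatten(image):
    """Turn a 2D grid of pixels (list of rows) into a single feature vector."""
    return [pixel for row in image for pixel in row]


# A hypothetical 28x28 grayscale image, all black except one white pixel
image = [[0] * 28 for _ in range(28)]
image[3][5] = 255

features = flatten(image)
print(len(features))  # 784
```

The pixel at row 3, column 5 ends up at position 3 * 28 + 5 = 89 of the feature vector, so every pixel keeps a fixed column across all images.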
The objective is to use these training data to classify the type of object. The first step consists of creating the feature columns. Then, the second step involves choosing an algorithm to train the model. When the training is done, the model will predict what picture corresponds to what object.
After that, it is easy to use the model to predict new images. For each new image fed into the model, the machine will predict the class it belongs to. For example, an entirely new image without a label goes through the model. For a human being, it is trivial to see that the image is a car. The machine uses its previous knowledge to predict that the image is a car as well.
Deep Learning Process
In deep learning, the *learning* phase is done through a neural network. A neural network is an architecture where the layers are stacked on top of each other.
Consider the same image example above. The training set would be fed to a neural network.
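As a rough sketch of how an input travels through stacked layers, the code below computes a forward pass for a tiny fully connected network. The bias terms, the ReLU activation between layers, the softmax on the output, and all the weight values are assumptions chosen for illustration; training (updating the weights) is not shown.

```python
import math


def dense(inputs, weights, biases):
    """One layer: each neuron sums its weighted inputs and adds a bias."""
    return [
        sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


def relu(values):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(inputs, layers):
    """Pass the inputs through every layer; the last layer yields class probabilities."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return softmax(activations)


# Hypothetical weights: 3 inputs -> 2 hidden neurons -> 2 output classes
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
probabilities = forward([1.0, 2.0, 3.0], layers)
print(probabilities)
```

For a regression task, the softmax would be dropped and the last layer would return its raw value instead of class probabilities.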
Each input goes into a neuron and is multiplied by a weight. The result of the multiplication flows to the next layer and becomes its input. This process is repeated for each layer of the network. The final layer is named the output layer; it provides an actual value for a regression task and a probability for each class for a classification task. The neural network uses a mathematical algorithm to update the weights of all the neurons. The neural network is fully trained when the values of the weights give an output close to reality. For instance, a well-trained neural network can recognize the object in a picture with higher accuracy than the traditional neural net.
Automate Feature Extraction using DL
A dataset can contain a dozen to hundreds of features. The system will learn from the relevance of these features. However, not all features are meaningful for the algorithm. A crucial part of machine learning is to find a relevant set of features to make the system learn something.
One way to perform this part in machine learning is to use feature extraction. Feature extraction combines existing features to create a more relevant set of features. It can be done with PCA, t-SNE or any other dimensionality reduction algorithm.
For example, in image processing, the practitioner needs to manually extract features from the image, such as the eyes, the nose, the lips and so on. Those extracted features are fed to the classification model.
Deep learning solves this issue, especially for a convolutional neural network. The first layer of a neural network will learn small details from the picture; the next layers will combine the previous knowledge to make more complex information. In the convolutional neural network, the feature extraction is done with the use of the filter. The network applies a filter to the picture to see if there is a match, i.e., the shape of the feature is identical to a part of the image. If there is a match, the network will use this filter. The process of feature extraction is therefore done automatically.
Difference between Machine Learning and Deep Learning
When to use ML or DL?
In the table below, we summarize ***the difference between machine learning and deep learning.***
With machine learning, you need less data to train the algorithm than with deep learning. Deep learning requires an extensive and diverse set of data to identify the underlying structure. Besides, machine learning provides a faster-trained model. The most advanced deep learning architectures can take days to a week to train. The advantage of deep learning over machine learning is that it is highly accurate. You do not need to understand what features are the best representation of the data; the neural network learns how to select critical features. In machine learning, you need to choose for yourself what features to include in the model.
Summary
Artificial intelligence is imparting a cognitive ability to a machine. Early AI systems used pattern matching and expert systems.
The idea behind machine learning is that the machine can learn without human intervention. The machine needs to find a way to learn how to solve a task given the data.
Deep learning is the breakthrough in the field of artificial intelligence. When there is enough data to train on, **deep learning** achieves impressive results, especially for image recognition and text translation. The main reason is that feature extraction is done automatically in the different layers of the network.