Noah Rowe

Multi Class Text Classification With Deep Learning Using BERT

Most researchers submit their papers to academic conferences because it is a faster way of making results available. Finding and selecting a suitable conference has always been challenging, especially for young researchers.

However, by analyzing data from previous conference proceedings, researchers can increase their chances of paper acceptance and publication. We will try to solve this text classification problem with deep learning using BERT.

Almost all of the code was taken from this tutorial; the only difference is the data.

The Data

The dataset, which can be downloaded from here, contains 2,507 research paper titles that have been manually classified into 5 categories (i.e. conferences).

Explore and Preprocess

import pandas as pd  # needed for read_csv below; missing from the original snippet
import torch
from tqdm.notebook import tqdm

from transformers import BertTokenizer
from torch.utils.data import TensorDataset

from transformers import BertForSequenceClassification

# Load the paper titles and their conference labels
df = pd.read_csv('data/title_conference.csv')
df.head()

[Table 1: output of df.head()]

df['Conference'].value_counts()

[Figure 1: class counts from df['Conference'].value_counts()]

You may have noticed that our classes are imbalanced, and we will address this later on.
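A common first step for this is label encoding plus a stratified train/validation split; here is a minimal sketch under those assumptions (the test_size and random_state values are illustrative, not from the original):

from sklearn.model_selection import train_test_split

# Map each conference name to an integer id (assumed encoding scheme).
possible_labels = df['Conference'].unique()
label_dict = {label: idx for idx, label in enumerate(possible_labels)}
df['label'] = df['Conference'].replace(label_dict)

# A stratified split keeps the class proportions the same in both sets,
# which matters precisely because the classes are imbalanced.
X_train, X_val, y_train, y_val = train_test_split(
    df.index.values,
    df['label'].values,
    test_size=0.15,
    random_state=17,
    stratify=df['label'].values,
)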

#machine-learning #nlp #document-classification #nlp-tutorial #text-classification #deep learning


Marget D

Top Deep Learning Development Services | Hire Deep Learning Developer

View more: https://www.inexture.com/services/deep-learning-development/

We at Inexture strategically work on every project we are associated with. We offer a robust set of AI, ML, and DL consulting services. Our virtuoso team of data scientists and developers works meticulously on every project and adds a personalized touch to it. We keep our clientele aware of everything being done on their project, so a sense of transparency is maintained. Leverage our end-to-end services for your next AI project.

#deep learning development #deep learning framework #deep learning expert #deep learning ai #deep learning services

Vern Greenholt

How to fine-tune BERT on a text classification task?

BERT (Bidirectional Encoder Representations from Transformers) is an architecture based on the Transformer, which was introduced in the 2017 paper "Attention Is All You Need". The BERT model itself was published by Google in 2018, in the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". When it was released, it showed state-of-the-art results on the GLUE benchmark.

Introduction

First, I will tell a little bit about the BERT architecture, and then move on to the code showing how to use it for a text classification task.

The BERT architecture is a multi-layer bidirectional Transformer encoder, described in the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

There are two different architectures proposed in the paper: BERT_base and BERT_large. The BERT_base architecture has L=12, H=768, A=12 and a total of around 110M parameters; here L refers to the number of transformer blocks, H to the hidden size, and A to the number of self-attention heads. For BERT_large, L=24, H=1024, A=16.
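These numbers can be checked directly from the pretrained configurations shipped with the transformers library; a quick sketch using the standard HuggingFace checkpoint names:

from transformers import BertConfig

for name in ('bert-base-uncased', 'bert-large-uncased'):
    config = BertConfig.from_pretrained(name)
    # L = num_hidden_layers, H = hidden_size, A = num_attention_heads
    print(name, config.num_hidden_layers, config.hidden_size,
          config.num_attention_heads)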


BERT: State of the Art NLP Model, Explained

Source: https://www.kdnuggets.com/2018/12/bert-sota-nlp-model-explained.html

The input format of BERT is given in the above image. I won't go into much detail here; you can refer to the above link for a more detailed explanation.
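As a sketch of that format in code (the sample sentence and max_length are arbitrary; the keyword arguments follow recent transformers versions, while older ones used pad_to_max_length=True instead of padding):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

encoded = tokenizer.encode_plus(
    'An example research paper title',
    add_special_tokens=True,     # prepend [CLS], append [SEP]
    max_length=32,
    padding='max_length',        # pad up to max_length
    truncation=True,
    return_attention_mask=True,
)
print(encoded['input_ids'])       # token ids, padded with zeros
print(encoded['attention_mask'])  # 1 for real tokens, 0 for padding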

Source Code

The code I will be following can be cloned from the HuggingFace GitHub repo:

https://github.com/huggingface/transformers/

Scripts to be used

We will mainly be modifying and using two scripts for our text classification task: glue.py and run_glue.py. The file glue.py lives under "transformers/data/processors/", and run_glue.py can be found under "examples/text-classification/". Adapting glue.py to a new dataset usually means adding a processor class; a sketch of one is given below.
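A hedged sketch of such a processor (the class name, label values, and TSV layout are hypothetical; DataProcessor and InputExample are the actual base classes that the processors in glue.py build on):

from transformers.data.processors.utils import DataProcessor, InputExample

class ConferenceProcessor(DataProcessor):
    """Hypothetical processor for a paper-title -> conference dataset."""

    def get_labels(self):
        # One label per target conference (example values only).
        return ["INFOCOM", "ISCAS", "SIGGRAPH", "VLDB", "WWW"]

    def _create_examples(self, lines, set_type):
        """Build InputExamples from rows of an assumed two-column TSV."""
        examples = []
        for i, line in enumerate(lines):
            guid = "%s-%s" % (set_type, i)
            examples.append(
                InputExample(guid=guid, text_a=line[0],
                             text_b=None, label=line[1]))
        return examples

Registering the new processor in glue.py's processor mapping then lets run_glue.py pick the dataset up as a task.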

#deep-learning #machine-learning #text-classification #bert #nlp #deep learning

Mikel Okuneva

Top 10 Deep Learning Sessions To Look Forward To At DLDC 2020

The Deep Learning DevCon 2020 (DLDC 2020) has exciting talks and sessions around the latest developments in deep learning that will be interesting not only for professionals in the field but also for enthusiasts who want to build a career in it. The two-day conference, scheduled for 29th and 30th October, will host paper presentations, tech talks, and workshops that will cover interesting developments as well as the latest research and advancements in the area. Further, with deep learning gaining massive traction, the conference will highlight some fascinating use cases from across the world.

Here are ten interesting talks and sessions of DLDC 2020 that one should definitely attend:

Also Read: Why Deep Learning DevCon Comes At The Right Time


Adversarial Robustness in Deep Learning

By Dipanjan Sarkar

About: Adversarial Robustness in Deep Learning is a session presented by Dipanjan Sarkar, a Data Science Lead at Applied Materials and a Google Developer Expert in Machine Learning. In this session, he will focus on adversarial robustness in deep learning: its importance, the different types of adversarial attacks, and some ways to train neural networks to withstand them. Considering that deep learning has brought tremendous achievements in computer vision and natural language processing, this talk will be really interesting for people working in these areas. Attendees will leave with a comprehensive understanding of adversarial perturbations in deep learning and common recipes for dealing with them.

Read an interview with Dipanjan Sarkar.

Imbalance Handling with Combination of Deep Variational Autoencoder and NEATER

By Divye Singh

About: Imbalance Handling with Combination of Deep Variational Autoencoder and NEATER is a paper presentation by Divye Singh, who holds a Master of Technology degree in Mathematical Modeling and Simulation and is interested in research on artificial intelligence, learning-based systems, machine learning, etc. In this presentation, he will talk about the common problem of class imbalance in medical diagnosis and anomaly detection, and how it can be solved with a deep learning framework. The talk focuses on his paper, which proposes a synergistic over-sampling method that generates informative synthetic minority-class data by filtering the noise from the over-sampled examples. He will also showcase experimental results on several real-life imbalanced datasets to demonstrate the effectiveness of the proposed method for binary classification problems.

Default Rate Prediction Models for Self-Employment in Korea using Ridge, Random Forest & Deep Neural Network

By Dongsuk Hong

About: This is a paper presentation by Dongsuk Hong, who holds a PhD in Computer Science and works in the big data centre of Korea Credit Information Services. The talk will introduce attendees to machine learning and deep learning models for predicting self-employment default rates using credit information. He will discuss a study in which a DNN model is implemented for two purposes: as a sub-model for selecting credit information variables, cascading into the final model that predicts default rates. Hong's main research area is the analysis of credit information, where he is particularly interested in evaluating the performance of prediction models based on machine learning and deep learning. This talk will be interesting for deep learning practitioners who want to build a career in this field.


#opinions #attend dldc 2020 #deep learning #deep learning sessions #deep learning talks #dldc 2020 #top deep learning sessions at dldc 2020 #top deep learning talks at dldc 2020

Multi-Class Image Classification

Introduction

Computers are amazing machines (no doubt about that), and I am really mesmerized by how they are able to learn and classify images. Image classification has its own advantages and applications in various ways; for example, we can build a pet food dispenser that dispenses based on which species (cat or dog) is approaching it. I know it's a weird idea, since they may end up eating all of the food, but the system can be time-controlled and dispense only once. Anyway, let's move on before getting distracted. After upskilling myself with the knowledge of deep learning neural networks, I thought of building one myself. So here I am going to share how to build an AlexNet convolutional neural network for 6 different classes from scratch using Keras, coded in Python.


Overview of AlexNet

Before getting to AlexNet, it is recommended to go through the Wikipedia article on convolutional neural network architecture to understand the terminology used in this article. Let's dive in and get a basic overview of the AlexNet network.

AlexNet [1] is a classic type of convolutional neural network, and it rose to prominence after winning the 2012 ImageNet challenge. The network architecture is given below:

[Image: AlexNet architecture (courtesy of Andrew Ng on Coursera [2])]

Model Explanation: The input to this model has dimensions 227x227x3, followed by a convolutional layer with 96 filters of size 11x11, 'valid' padding (i.e. no padding), and a stride of 4. The resulting output dimensions are given as:

floor(((n + 2*padding - filter)/stride) + 1) * floor(((n + 2*padding - filter)/stride) + 1)

Note: This formula is for a square input with height = width = n.

For the first layer, with a 227x227x3 input and a convolutional layer with 96 filters of 11x11, 'valid' padding, and stride = 4, the output dimensions are:

= floor(((227 + 0 - 11)/4) + 1) * floor(((227 + 0 - 11)/4) + 1)

= floor((216/4) + 1) * floor((216/4) + 1)

= floor(54 + 1) * floor(54 + 1)

= 55 * 55

Since the number of filters is 96, the output of the first layer is 55x55x96.
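The same arithmetic as a small helper function (the function name and the use of floor division are mine, not from the original):

def conv_output_size(n, padding, filter_size, stride):
    """Output height/width of a conv layer on an n x n input."""
    return (n + 2 * padding - filter_size) // stride + 1

# First AlexNet layer: 227x227 input, 11x11 filters, no padding, stride 4.
print(conv_output_size(227, 0, 11, 4))  # 55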

Continuing, we have a MaxPooling layer (3, 3) with a stride of 2, decreasing the output size to 27x27x96, followed by another convolutional layer with 256 filters of (5, 5) and 'same' padding, meaning the output height and width are retained from the previous layer, so the output is 27x27x256. Next we have MaxPooling again, reducing the size to 13x13x256. A convolutional operation with 384 filters of (3, 3) and 'same' padding is then applied twice, giving an output of 13x13x384, followed by another convolutional layer with 256 filters of (3, 3) and 'same' padding, resulting in a 13x13x256 output. This is max-pooled and the dimensions are reduced to 6x6x256. The output is then flattened and passed through 2 fully connected layers with 4096 units each, which connect to a 1000-unit softmax layer; the output size can be set to however many classes we need. In our case, we will make the output softmax layer with 6 units, as we have to classify into 6 classes. The softmax layer gives us the probabilities for each class to which an input image might belong. A Keras sketch of this stack is given below.
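Translating the walkthrough above into the Keras functional API, a minimal sketch might look as follows (the article does not specify activations; ReLU, AlexNet's original choice, is assumed here):

from keras.layers import Input, Dense, Flatten, Conv2D, MaxPooling2D
from keras.models import Model

X_input = Input((227, 227, 3))

# Conv1: 96 filters of 11x11, stride 4, 'valid' padding -> 55x55x96
X = Conv2D(96, (11, 11), strides=4, padding='valid', activation='relu')(X_input)
X = MaxPooling2D((3, 3), strides=2)(X)                     # -> 27x27x96

# Conv2: 256 filters of 5x5, 'same' padding -> 27x27x256
X = Conv2D(256, (5, 5), padding='same', activation='relu')(X)
X = MaxPooling2D((3, 3), strides=2)(X)                     # -> 13x13x256

# Conv3-4: 384 filters of 3x3, 'same' padding, applied twice -> 13x13x384
X = Conv2D(384, (3, 3), padding='same', activation='relu')(X)
X = Conv2D(384, (3, 3), padding='same', activation='relu')(X)

# Conv5: 256 filters of 3x3, 'same' padding -> 13x13x256
X = Conv2D(256, (3, 3), padding='same', activation='relu')(X)
X = MaxPooling2D((3, 3), strides=2)(X)                     # -> 6x6x256

# Two fully connected layers of 4096 units, then a 6-way softmax
X = Flatten()(X)
X = Dense(4096, activation='relu')(X)
X = Dense(4096, activation='relu')(X)
X = Dense(6, activation='softmax')(X)

model = Model(inputs=X_input, outputs=X, name='alexnet')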


Implementing AlexNet using Keras

Keras is a Python API built on top of TensorFlow 2.0, which is scalable and adapts to the deployment capabilities of TensorFlow [3]. We will build the layers from scratch in Python using the Keras API.

First, let's import the essential libraries:

import numpy as np
from keras import layers
from keras.layers import Input, Dense, Activation,BatchNormalization, Flatten, Conv2D, MaxPooling2D
from keras.models import Model
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
import keras.backend as K
K.set_image_data_format('channels_last')
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow

In this article we will use ImageDataGenerator to build the classifier. Next we will import the data using it, but before that, let's understand the data. The dataset can be found here.

This data contains around 25K images of size 150x150 distributed across 6 categories, namely: 'buildings', 'forest', 'glacier', 'mountain', 'sea', and 'street'. There are 14K images in the training set, 3K in the test set, and 7K in the prediction set.

The images for all the categories are split into their respective directories, making it easy to infer the labels, according to the Keras documentation [4].

Arguments :

directory: Directory where the data is located. If labels is "inferred", it should contain subdirectories, each containing images for a class. Otherwise, the directory structure is ignored.

The linked dataset also has this directory structure, so ImageDataGenerator will infer the labels. A view of the dataset directory structure is shown below:

[Image: directory structure of the dataset]

Next, we will import the dataset as shown below:

path = 'C:\\Users\\Username\\Desktop\\folder\\seg_train\\seg_train'
train_datagen = ImageDataGenerator(rescale=1. / 255)
train = train_datagen.flow_from_directory(path, target_size=(227,227), class_mode='categorical')

Output

Found 14034 images belonging to 6 classes.

As explained above, the input size for AlexNet is 227x227x3, so we set the target size to (227, 227). The default batch size is 32. Let's see the types of train and train_datagen.


The type keras.preprocessing.image.DirectoryIterator is an iterator capable of reading images from a directory on disk [5]. keras.preprocessing.image.ImageDataGenerator generates batches of tensor image data with real-time data augmentation. The default batch_size is 32.

Next let us check the dimensions of the first image and its associated output in the first batch.
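A minimal sketch of that check (next() pulls one batch from the DirectoryIterator; the shapes follow the settings above):

images, labels = next(train)
print(images.shape)  # (32, 227, 227, 3): a batch of 32 RGB images
print(labels.shape)  # (32, 6): one-hot labels for the 6 classes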

#deep-learning #keras #alexnet #classification #machine-learning #deep learning