Tyshawn Braun

What is Image Classification? Data Augmentation? Transfer Learning?

This article is the first part of three articles about computer vision. Part 2 will explain Object Recognition. Part 3 will be about Image Segmentation.

A companion notebook for this article is available here on GitHub.

Introduction

What is more exciting than seeing the world, than taking in the beauty around us? The beauty of a sunset, a memorable waterfall, a sea of ice? None of it would be possible if evolution hadn’t endowed us with eyes.

We recognize things because we have learned the shapes of objects, and we have learned that shapes differing from those we have already encountered can still belong to the same object. We have learned by experience, and because we were given the names of those objects, much like a supervised algorithm that needs labels to associate shapes, details, and colors with a category. A dog and a wolf are very similar at the pixel level. Computer vision methods have enabled machines to decipher these shapes and “learn” to classify them.

Now algorithms, just like our eyes, can identify objects and shapes in pictures or films. The methods are constantly evolving and improving, to the point of reaching so-called human-level performance. There are several families of methods: image classification, object detection (or recognition), and image segmentation. In this article, we will explore the image classification problem. The first part presents training a model from scratch, the second training with data augmentation, and the last transfer learning with pre-trained models.

Methods

Image Classification from scratch

Image classification can, when the volume of available data is large enough, be done “from scratch”: you define a model and train it entirely on your own data, with no pre-trained weights.

Like any classification problem, the data must be annotated. How do we proceed with images? It is quite simple: images of the same class are stored in the same folder, one folder per class or category, like this:

> train/
      ... forest/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
      ... mountain/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
      ... sea/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
  validation/
      ... forest/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
      ... mountain/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
      ... sea/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
  test/
      ... forest/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
      ... mountain/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
      ... sea/
            ... img_1.jpeg
            ... img_2.jpeg
            ...
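A layout like this can be built and consumed with a few lines of Python. The sketch below (temporary directory, illustrative class names only) shows how the class labels are recovered from the folder names, which is also how loaders such as Keras’ `image_dataset_from_directory` infer them:

```python
import tempfile
from pathlib import Path

# Build a miniature copy of the layout described above in a temp directory.
root = Path(tempfile.mkdtemp())
for split in ("train", "validation", "test"):
    for cls in ("forest", "mountain", "sea"):
        (root / split / cls).mkdir(parents=True)
        (root / split / cls / "img_1.jpeg").touch()

# Each sub-folder of train/ is one class; the folder name is the label.
classes = sorted(p.name for p in (root / "train").iterdir() if p.is_dir())
print(classes)  # ['forest', 'mountain', 'sea']
```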

#deep-learning #image-classification #machine-learning #data-science #towards-data-science


Jolie Reichert

Why Does Image Data Augmentation Work As A Regularizer in Deep Learning?

The problem with deep learning models is that they need a lot of data for training. There are two major problems when training deep learning models: overfitting and underfitting. Both can be mitigated by data augmentation, a regularization technique that makes slight modifications to existing images in order to generate new data.

In this article, we will demonstrate why data augmentation is considered a regularization technique, how to apply it to a model, and whether it is used as a preprocessing or a post-processing technique. All these questions are answered in the demonstration below.

Topics covered in this article:

  • Data augmentation as a regularizer and data generator.
  • Implementing Data augmentation techniques.

Data Augmentation As a Regularizer and Data Generator

Regularization is a set of techniques used to reduce overfitting in a model. In deep learning, too much learning is also bad: if we get good results on the training data and poor results on unseen data (test data, validation data), we have an overfitting problem. With data augmentation, we apply a few transformations to the data, such as flipping, cropping, or adding noise.

As you know, deep learning models are data-hungry; if we are lacking data, we can generate more through augmentation transformations of the images. Data augmentation is a preprocessing technique because we only transform the data used to train the model. We generate new instances of images by cropping, flipping, zooming, or shearing an original image. So, whenever training lacks images, augmentation lets us create thousands more to train the model properly.
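As a minimal illustration of two of the transformations listed above, here are flipping and cropping implemented by hand on a tiny nested-list “image” (real pipelines would use a library such as Keras or Albumentations; the 3x3 matrix is purely illustrative):

```python
# A 3x3 "image" stored as nested lists so the effect of each transform is visible.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

def horizontal_flip(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def crop(img, top, left, size):
    """Cut out a size x size patch starting at (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]

flipped = horizontal_flip(image)
patch = crop(image, 0, 0, 2)
print(flipped[0])  # [3, 2, 1]
print(patch)       # [[1, 2], [4, 5]]
```

Each transformed copy counts as a new training instance, which is how augmentation multiplies a small dataset.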


#developers corner #computer vision #data augmentation #deep learning #image augmentation #image data augmentation #image processing #overfitting

Jerad Bailey

Google Reveals "What is being Transferred” in Transfer Learning

Recently, researchers from Google addressed a very fundamental question in the machine learning community: what is actually being transferred in Transfer Learning? They presented various tools and analyses to answer it.

The ability to transfer the domain knowledge a machine has acquired on one task to another task where data is usually scarce is one of the most desired capabilities for machines. Researchers around the globe have been using transfer learning in various deep learning applications, including object detection, image classification, and medical imaging tasks, among others.

#developers corner #learn transfer learning #machine learning #transfer learning #transfer learning methods #transfer learning resources

Siphiwe Nair

Your Data Architecture: Simple Best Practices for Your Data Strategy

If you accumulate data on which you base your decision-making as an organization, you most probably need to think about your data architecture and consider possible best practices. Gaining a competitive edge, remaining customer-centric to the greatest extent possible, and streamlining processes to get on-the-button outcomes can all be traced back to an organization’s capacity to build a future-ready data architecture.

In what follows, we offer a short overview of the overarching capabilities of data architecture. These include user-centricity, elasticity, robustness, and the capacity to ensure the seamless flow of data at all times. Added to these are automation enablement, plus security and data governance considerations. These points form our checklist for what we perceive to be an anticipatory analytics ecosystem.

#big data #data science #big data analytics #data analysis #data architecture #data transformation #data platform #data strategy #cloud data platform #data acquisition


Transfer Learning in Image Classification

The term Transfer Learning refers to leveraging the knowledge gained by a neural network trained on a certain (usually large) dataset to solve new tasks for which few training examples are available, integrating the existing knowledge with the new knowledge learned from the few examples of the task-specific dataset. Transfer Learning is thus commonly used, often together with other techniques such as Data Augmentation, to address the problem of a lack of training data.

But, in practice, how much can Transfer Learning actually help, and how many training examples do we really need in order for it to be effective?

In this story, I try to answer these questions by applying two Transfer Learning techniques (namely Feature Extraction and Fine-Tuning) to an Image Classification task, varying the number of training examples in order to see how the lack of data affects the effectiveness of the adopted approaches.


Experimental Case Study

The task chosen for experimenting with Transfer Learning is the classification of flower images into 102 different categories. This task was chosen mainly because of the easy availability of a flower dataset, and because its domain is generic enough to be well suited to Transfer Learning with neural networks pre-trained on the well-known ImageNet dataset.

The adopted dataset is the 102 Category Flower Dataset created by M. Nilsback and A. Zisserman [3], a collection of 8189 labelled flower images belonging to 102 different classes. Each class has between 40 and 258 instances, and all the images show significant variations in scale, pose and lighting. The detailed list of the 102 categories, together with the respective number of instances, is available here.

Figure 1: Examples of images extracted from the 102 Category Dataset.

In order to create training datasets of different sizes and evaluate how they affect the performance of the trained networks, the original set of flower images is split into training, validation and test sets several times, each time adopting different split percentages. Specifically, three different training sets are created (referred to from now on as the Large, Medium and Small training sets) using the percentages shown in the table below.

Table 1: number of examples and split percentages (relative to the complete unpartitioned flower dataset) of the datasets used in the experiments.

All the splits are performed using stratified sampling, to avoid introducing sampling bias and to ensure that the obtained training, validation and test subsets are all representative of the whole initial set of images.
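Stratified sampling of this kind can be sketched in a few lines: sample each class separately so the class proportions of the full dataset are preserved in every subset. The class names and split fractions below are illustrative stand-ins, not the actual flower categories or percentages used in the experiments:

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac, val_frac, seed=0):
    """Return index lists (train, val, test) with per-class proportions kept."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for lab, idxs in by_class.items():
        rng.shuffle(idxs)                      # shuffle within each class
        n_train = round(len(idxs) * train_frac)
        n_val = round(len(idxs) * val_frac)
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]         # remainder goes to test
    return train, val, test

# Illustrative imbalanced dataset: 50/30/20 instances of three classes.
labels = ["daisy"] * 50 + ["rose"] * 30 + ["tulip"] * 20
train, val, test = stratified_split(labels, 0.6, 0.2)
print(len(train), len(val), len(test))  # 60 20 20
```

Because each class is split with the same fractions, the 50% share of "daisy" in the full set is reproduced exactly in the training subset.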

Adopted strategies

The image classification task described above is addressed by adopting the two popular techniques that are commonly used when applying Transfer Learning with pre-trained CNNs, namely Feature Extraction and Fine-Tuning.

Feature Extraction

Feature Extraction basically consists of taking the convolutional base of a previously trained network, running the target data through it and training a new classifier on top of the output, as summarized in the figure below.

Figure 2: Feature Extraction applied to a convolutional neural network: the classifiers are swapped while the same convolutional base is kept. “Frozen” means that the weights are not updated during training.

The classifier stacked on top of the convolutional base can either be a stack of fully-connected layers or just a single Global Pooling layer, both followed by a Dense layer with a softmax activation function. There is no specific rule regarding which kind of classifier should be adopted, but, as described by Lin et al. [2], using just a single Global Pooling layer generally leads to less overfitting, since this layer has no parameters to optimize.

Consequently, since the training sets used in the experiments are relatively small, the chosen classifier consists of only a single Global Average Pooling layer, whose output is fed directly into a softmax-activated layer that outputs the probabilities for each of the 102 flower categories.

During the training, only the weights of the top classifiers are updated, while the weights of the convolutional base are “frozen” and thus kept unchanged.

In this way, the shallow classifier learns how to classify the flower images into the possible 102 categories from the off-the-shelf representations previously learned by the source model for its domain. If the source and the target domains are similar, then these representations are likely to be useful to the classifier and the transferred knowledge can thus bring an improvement to its performance once it is trained.
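The freezing scheme can be illustrated without any deep learning framework. In the sketch below, a fixed random projection stands in for the pre-trained convolutional base (all sizes and numbers are purely illustrative, not the actual architecture), and only the softmax classifier on top is updated:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained convolutional base: a fixed ("frozen") random
# projection from raw inputs to a feature vector. It is never updated.
W_base = rng.normal(size=(64, 16))
x = rng.normal(size=(8, 64))            # a batch of 8 flattened "images"
y = rng.integers(0, 3, size=8)          # 3 hypothetical classes

features = np.maximum(x @ W_base, 0.0)  # feature extraction (ReLU output)

# The only trainable part: a softmax classifier on top of the features.
W_clf = np.zeros((16, 3))
for _ in range(100):                    # a few gradient-descent steps
    logits = features @ W_clf
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(8), y] -= 1.0           # gradient of cross-entropy w.r.t. logits
    W_clf -= 0.1 * features.T @ p / 8   # update the classifier only; base frozen

pred = (features @ W_clf).argmax(axis=1)
print((pred == y).mean())               # training accuracy of the shallow classifier
```

The base weights `W_base` are simply never touched by the update, which is exactly what “frozen” means during Feature Extraction.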

Fine-Tuning

Fine-Tuning can be seen as a step beyond Feature Extraction: it consists of selectively retraining some of the top layers of the convolutional base previously used for extracting features. In this way, the more abstract representations learned by the source model’s last layers are slightly adjusted to make them more relevant for the target problem.

This can be achieved by unfreezing some of the top layers of the convolutional base, keeping frozen all its other layers and jointly training the convolutional base with the same classifier previously used for Feature Extraction, as represented in the figure below.

Figure 3: Feature Extraction compared to Fine-Tuning.

It is important to point out that, according to F. Chollet, the top layers of a pre-trained convolutional base can be fine-tuned only if the classifier on top has already been trained. If the classifier were not trained, its weights would be randomly initialized; the error signal propagating through the network during training would then be too large, and the unfrozen weights would be updated in a way that disrupts the abstract representations previously learned by the convolutional base.
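The two-stage schedule can be sketched in the same framework-free way as before. Here a toy two-layer “base” (all sizes illustrative) keeps its bottom layer frozen throughout, while the top layer is unfrozen, with a reduced learning rate, only after the classifier has been trained:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-layer "convolutional base" (fixed random weights stand in for
# pre-trained ones) plus a classifier head.
W1 = rng.normal(size=(32, 16))   # bottom base layer: frozen throughout
W2 = rng.normal(size=(16, 8))    # top base layer: unfrozen in stage 2
W_clf = np.zeros((8, 3))

x = rng.normal(size=(10, 32))
y = rng.integers(0, 3, size=10)

def forward(x):
    h1 = np.maximum(x @ W1, 0.0)
    h2 = np.maximum(h1 @ W2, 0.0)
    return h1, h2, h2 @ W_clf

def grad_logits(logits):
    """Gradient of mean cross-entropy w.r.t. the logits."""
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0
    return p / len(y)

W1_before, W2_before = W1.copy(), W2.copy()

# Stage 1 (Feature Extraction): train the classifier; the whole base is frozen.
for _ in range(50):
    _, h2, logits = forward(x)
    W_clf -= 0.1 * h2.T @ grad_logits(logits)

# Stage 2 (Fine-Tuning): unfreeze the top base layer and train it jointly with
# the already-trained classifier, using a smaller learning rate.
for _ in range(50):
    h1, h2, logits = forward(x)
    g = grad_logits(logits)
    g_h2 = (g @ W_clf.T) * (h2 > 0)   # backprop through the top ReLU
    W_clf -= 0.01 * h2.T @ g
    W2 -= 0.01 * h1.T @ g_h2          # top base layer now updates; W1 never does
```

Training the classifier first keeps the stage-2 error signal small, so the unfrozen layer is nudged rather than overwritten.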

#deep-learning #machine-learning #artificial-intelligence #image-classification #transfer-learning #deep learning