A simple way to keep your computer's RAM from overloading and set your DNN training up for success on a huge image dataset.

Photo by author

Background

When you work with large image datasets, computer memory is easily overloaded, and many people underestimate just how large an image dataset can be. Take MNIST: although each handwritten digit is only 28x28 pixels, the dataset consists of a training set of 60,000 examples and a test set of 10,000 examples. The downloaded files take little hard drive space, but once the dataset is read into a NumPy array, a lot of RAM is consumed. Instead of an array, you may be greeted by an out-of-memory error on the screen. Worse still, as Data Science develops, the datasets we do research on keep growing: the COCO dataset, the Cityscapes dataset, and others demand far more of both hard drive capacity and memory.
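
As a back-of-envelope sketch (a rough estimate, not a measurement of any particular loader), here is how MNIST's in-memory footprint can be calculated with NumPy. Note how converting the pixels from uint8 to float64, as many preprocessing pipelines do, inflates the array by a factor of eight:

```python
import numpy as np

# Rough estimate of MNIST's footprint once loaded as a single NumPy array:
# 60,000 training images, each 28x28, one grayscale channel.
n_images, height, width = 60_000, 28, 28
n_pixels = n_images * height * width

as_uint8 = n_pixels * np.dtype(np.uint8).itemsize      # raw pixel values
as_float64 = n_pixels * np.dtype(np.float64).itemsize  # after a float conversion

print(f"uint8:   {as_uint8 / 1e6:.0f} MB")    # ~47 MB
print(f"float64: {as_float64 / 1e6:.0f} MB")  # ~376 MB
```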

It may seem that we have to buy better, more expensive hardware to cope with limited computer memory; otherwise, we simply can't process these huge image datasets.

On the other hand, deep learning algorithms demand an enormous amount of computation, which can also exhaust computer memory. Computer Vision tasks such as classification, detection, and segmentation with DNNs handle huge volumes of data, and as a rule, the more training data, the better the result. We have free access to big datasets such as Pascal VOC, COCO, and Cityscapes, yet our poor RAM won't let us process all that data at once. Either the dataset size or the RAM chokes my deep learning like a Force choke 👹.
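
To see why, repeat the same arithmetic for a full-resolution dataset. The figures below are an illustrative assumption (5,000 Cityscapes-style RGB frames at 2048x1024), not an exact accounting of any particular dataset release:

```python
import numpy as np

# Illustrative assumption: 5,000 Cityscapes-style RGB frames at 2048x1024.
n_images, height, width, channels = 5_000, 1024, 2048, 3
n_values = n_images * height * width * channels

as_uint8 = n_values * np.dtype(np.uint8).itemsize      # raw pixels
as_float32 = n_values * np.dtype(np.float32).itemsize  # typical training dtype

print(f"uint8:   {as_uint8 / 1e9:.1f} GB")    # ~31.5 GB
print(f"float32: {as_float32 / 1e9:.1f} GB")  # ~125.8 GB
```

Numbers like these make it clear that "just load everything into one array" stops being an option well before you reach the biggest benchmarks.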
