As data scientists, we deal with incoming data in a wide variety of formats. When it comes to loading image data with PyTorch, the ImageFolder class works very nicely, and if you are planning on collecting the image data yourself, I would suggest organizing the data so it can be easily accessed using the ImageFolder class.

However, life isn’t always easy. Dealing with other data formats can be challenging, especially if it requires you to write a custom PyTorch class for loading a dataset (dun dun dun… enter the dictionary-sized documentation and its henchmen, the “beginner” examples).

In reality, defining a custom class doesn’t have to be that difficult! Here I will show you exactly how to do that, even if you have very little experience working with Python classes.
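
As a small taste of what such a class can look like, here is a rough sketch, not a finished implementation. A custom dataset subclasses `torch.utils.data.Dataset` and fills in `__len__` and `__getitem__`. The class name `ImageListDataset`, the `fake_loader` helper, and the file names are hypothetical stand-ins I chose so the example runs without any real image files.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ImageListDataset(Dataset):
    """Hypothetical dataset built from a list of (path, label) pairs."""

    def __init__(self, samples, loader, transform=None):
        self.samples = samples      # list of (path, label) tuples
        self.loader = loader        # callable: path -> image tensor
        self.transform = transform  # optional callable applied to each image

    def __len__(self):
        # Number of examples in the dataset.
        return len(self.samples)

    def __getitem__(self, index):
        # Load one example on demand and return (image, label).
        path, label = self.samples[index]
        image = self.loader(path)
        if self.transform is not None:
            image = self.transform(image)
        return image, label


def fake_loader(path):
    # Stand-in for real file reading: returns a blank 3x8x8 "image".
    return torch.zeros(3, 8, 8)


# Made-up file names and labels, purely for illustration.
samples = [("img_0.png", 0), ("img_1.png", 1), ("img_2.png", 0), ("img_3.png", 1)]
dataset = ImageListDataset(samples, loader=fake_loader)

# DataLoader batches the examples by calling __getitem__ for us.
data_loader = DataLoader(dataset, batch_size=2, shuffle=False)
images, labels = next(iter(data_loader))
print(images.shape, labels)
```

In a real project you would swap `fake_loader` for a function that actually opens the file at `path`. The rest of the class can stay the same whatever format your data arrives in.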

My motivation for writing this article is that many online or university courses about machine learning (understandably) skip over the details of loading data and take you straight to the core machine learning code. That’s great, but many beginners then struggle to load their own data when it comes time for their first independent project.

If your machine learning software is a hamburger, the ML algorithms are the meat, but just as important are the top bun (importing and preprocessing data) and the bottom bun (predicting and deploying the model). I hope you’re hungry, because today we will be making the top bun of our hamburger!


Beginner’s Guide to Loading Image Data with PyTorch