Custom datasets! Why?

Because you can shape them any way you like!

It is natural to develop our own way of creating custom datasets as we work on different projects.

There are some official custom dataset examples on the PyTorch site, but they seemed a bit obscure to a beginner (like me, back then). The topics we will discuss are as follows.

  1. Custom Dataset Fundamentals.
  2. Using Torchvision Transforms.
  3. Dealing with pandas (read_csv).
  4. Embedding Classes into File Names.
  5. Using DataLoader.

1. Custom Dataset Fundamentals.

A dataset must define the following methods so that DataLoader can use it later on.

  • __init__() holds the initial logic, such as reading a CSV, assigning transforms, or filtering data.
  • __getitem__() returns the data and the label for a given index.
  • __len__() returns the number of samples in your dataset.
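To see why these three methods matter, it helps to remember that `len(ds)` and `ds[i]` are ordinary Python calls to `__len__()` and `__getitem__()`. Here is a minimal sketch with no PyTorch dependency; the class name `ToyDataset` and the odd/even labelling rule are made up purely for illustration.

```python
class ToyDataset:
    """Hypothetical stand-in that keeps (value, label) pairs in memory."""

    def __init__(self, values):
        # Initial logic: store the data and derive a label for each value
        self.values = list(values)
        self.labels = [v % 2 for v in self.values]  # 1 for odd, 0 for even

    def __getitem__(self, index):
        # Return the data and its label for one index
        return self.values[index], self.labels[index]

    def __len__(self):
        # Number of samples in the dataset
        return len(self.values)


ds = ToyDataset([10, 11, 12])
print(len(ds))  # calls ds.__len__()
print(ds[1])    # calls ds.__getitem__(1)
```

Anything that needs to know how many samples exist, or needs to fetch sample number `i`, only has to go through these two calls.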

The first step is to create a dataset class:

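from torch.utils.data import Dataset

class MyCustomDataset(Dataset):
    def __init__(self, *args, **kwargs):
        # Initial logic goes here: read a CSV, assign transforms, filter data, etc.
        pass

    def __getitem__(self, index):
        # Load the sample at `index` and its label here (placeholders below)
        img, label = None, None
        return (img, label)

    def __len__(self):
        # Replace with the number of examples (e.g. images) you have
        count = 0
        return count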
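The skeleton above can be filled in many ways. As one possible sketch, the class below keeps random tensors in memory instead of real images, so it runs without any files on disk. The class name `RandomImageDataset`, the tensor shapes, and the alternating 0/1 labels are assumptions made for this example, not something from a real project.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomImageDataset(Dataset):
    """Hypothetical dataset of random 'images' with integer labels."""

    def __init__(self, num_samples=8, transform=None):
        # Initial logic: build the data once and remember the transform
        self.images = torch.rand(num_samples, 3, 4, 4)  # N x C x H x W
        self.labels = torch.arange(num_samples) % 2     # alternating 0/1
        self.transform = transform

    def __getitem__(self, index):
        # Fetch one sample and apply the optional transform
        img = self.images[index]
        if self.transform is not None:
            img = self.transform(img)
        return (img, self.labels[index])

    def __len__(self):
        # Number of samples in the dataset
        return len(self.images)


dataset = RandomImageDataset(num_samples=8)
loader = DataLoader(dataset, batch_size=4, shuffle=False)

for imgs, labels in loader:
    # Each batch stacks 4 samples along a new first dimension
    print(imgs.shape, labels.tolist())
```

Passing `shuffle=False` keeps the samples in index order, which makes it easy to check that each batch lines up with what `__getitem__()` returns.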
