In one of my previous blog posts, _How to fine-tune BERT on a text classification task_, I explained fine-tuning BERT for a **multi-class** text classification task. In this post, I will explain how to fine-tune DistilBERT for a **multi-label** text classification task. I have also made a GitHub repo containing the complete code explained below; you can visit the link below to see it, fork it, and use it.
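The practical difference between the two setups is how the targets are encoded and which loss is used. Here is a minimal sketch (my own illustration, not code from the repo) contrasting the two target formats:

```python
import torch

num_labels = 4

# Multi-class: exactly one class is correct per example, so the target is a
# single class index (used with CrossEntropyLoss / softmax over classes).
multi_class_target = torch.tensor(2)

# Multi-label: any subset of classes can be correct, so the target is a
# multi-hot vector (used with BCEWithLogitsLoss / an independent sigmoid per label).
multi_label_target = torch.tensor([1.0, 0.0, 0.0, 1.0])
```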
The DistilBERT model (https://arxiv.org/pdf/1910.01108.pdf), released by Hugging Face, is a distilled version of BERT (https://arxiv.org/pdf/1810.04805.pdf), which was released by Google.
According to the authors:
They leverage knowledge distillation during the pre-training phase and show that it is possible to reduce the size of a BERT model by 40% while retaining 97% of its language understanding capabilities and being 60% faster.
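As a small preview of what the rest of the post builds on, here is a minimal sketch of loading a pre-trained DistilBERT checkpoint with the Hugging Face `transformers` library configured for multi-label classification; the checkpoint name and label count are placeholder assumptions, and the post's own dataset defines the real values:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder values for illustration only.
checkpoint = "distilbert-base-uncased"
num_labels = 6

tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# problem_type="multi_label_classification" tells the model to apply
# BCEWithLogitsLoss (one sigmoid per label) instead of softmax cross-entropy.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=num_labels,
    problem_type="multi_label_classification",
)
```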
So let’s start with the details and the process to fine-tune the model.