An end-to-end example of how to fine-tune a Hugging Face model with a custom dataset using TensorFlow and Keras. I show how to save and load the trained model and how to run the predict function on tokenized input.

There are many articles about fine-tuning Hugging Face models with your own dataset. Most of them use PyTorch; some use TensorFlow. I had a task to implement sentiment classification on a custom complaints dataset, and I decided to go with Hugging Face transformers because the results with an LSTM were not great. Despite the large number of available articles, it took me significant time to bring all the bits together and implement my own Hugging Face model trained with TensorFlow. Most, if not all, articles seem to stop once training is explained. I thought it would be useful to share a complete scenario and explain how to save and load the trained model and run inference. This post is based on the Hugging Face API for TensorFlow.

Your starting point should be the Hugging Face documentation. It has a very helpful section called "Fine-tuning with custom datasets". To understand how to fine-tune a Hugging Face model with your own data for sentence classification, I recommend studying the code in its subsection "Sequence Classification with IMDb Reviews". The Hugging Face documentation provides examples for both PyTorch and TensorFlow, which is very convenient.

I'm using the TFDistilBertForSequenceClassification class to run sentence classification. As the Hugging Face documentation describes it, DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster, while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.

from transformers import DistilBertTokenizerFast
from transformers import TFDistilBertForSequenceClassification

import tensorflow as tf
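
To make the overall flow concrete, here is a minimal sketch of the scenario described above: tokenize, fine-tune, save, load, predict. It is not the exact code from this post. It assumes a transformers 4.x install with TensorFlow support, and the checkpoint name `distilbert-base-uncased`, the helper names `encode_labels` and `fine_tune_save_load_predict`, the learning rate, and the batch size are all illustrative choices of mine.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed checkpoint name; swap in whichever DistilBERT checkpoint you use.
MODEL_NAME = "distilbert-base-uncased"


def encode_labels(labels: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map string labels to integer ids (sorted order) for a sparse loss."""
    label2id = {name: idx for idx, name in enumerate(sorted(set(labels)))}
    return [label2id[name] for name in labels], label2id


def fine_tune_save_load_predict(
    texts: List[str], labels: List[str], save_dir: str, epochs: int = 1
) -> Dict[str, float]:
    """Hypothetical helper: fine-tune, save, reload, and score the first text."""
    # Heavy imports live inside the function so that importing this module
    # does not require TensorFlow to be installed.
    import tensorflow as tf
    from transformers import (
        DistilBertTokenizerFast,
        TFDistilBertForSequenceClassification,
    )

    label_ids, label2id = encode_labels(labels)

    # Tokenize the raw texts and wrap them in a tf.data.Dataset.
    tokenizer = DistilBertTokenizerFast.from_pretrained(MODEL_NAME)
    encodings = tokenizer(list(texts), truncation=True, padding=True)
    dataset = tf.data.Dataset.from_tensor_slices((dict(encodings), label_ids))

    # Load the pretrained model with a fresh classification head and train it.
    model = TFDistilBertForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(label2id)
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    model.fit(dataset.shuffle(len(label_ids)).batch(16), epochs=epochs)

    # Save model and tokenizer side by side, then load both back.
    model.save_pretrained(save_dir)
    tokenizer.save_pretrained(save_dir)
    loaded_model = TFDistilBertForSequenceClassification.from_pretrained(save_dir)
    loaded_tokenizer = DistilBertTokenizerFast.from_pretrained(save_dir)

    # Run inference on tokenized input and convert logits to probabilities.
    inputs = loaded_tokenizer(
        list(texts[:1]), truncation=True, padding=True, return_tensors="tf"
    )
    logits = loaded_model(dict(inputs)).logits
    probs = tf.nn.softmax(logits, axis=-1).numpy()[0]
    id2label = {idx: name for name, idx in label2id.items()}
    return {id2label[i]: float(p) for i, p in enumerate(probs)}
```

With made-up sample data, a call such as `fine_tune_save_load_predict(["late delivery, very upset", "great support"], ["negative", "positive"], "./complaints-model")` should return a dictionary of label probabilities. The actual values depend on the training run, so none are shown here.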

Fine-Tuning Hugging Face Model with Custom Dataset
