Contrastive Learning of General-Purpose Audio Representations.

This post is a short summary of, and an implementation guide for, the following paper:
Contrastive Learning of General-Purpose Audio Representations
The paper aims to learn general-purpose audio representations in a self-supervised way, using discriminative pre-training. The authors train EfficientNet-B0, a 2D CNN, to map Mel-spectrograms to 512-dimensional vectors. Those representations are then transferred to downstream tasks such as speaker identification or bird song detection.
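
To make the "discriminative pre-training" idea a little more concrete, here is a minimal sketch of a batch-wise contrastive objective. It is an illustration under assumptions, not the authors' implementation: it assumes that row i of `anchors` and row i of `positives` are embeddings of two segments from the same audio clip, that the other rows in the batch serve as negatives, and that similarity is a plain dot product (the paper's similarity function may differ). The function name `contrastive_loss` and the toy 2-D vectors are hypothetical; in practice the inputs would be the 512-dimensional encoder outputs, and in PyTorch the same computation is usually a matrix product followed by `torch.nn.functional.cross_entropy`.

```python
import math


def contrastive_loss(anchors, positives):
    """Mean cross-entropy where anchors[i] should match positives[i].

    Both arguments are lists of equal-length float vectors (one per clip).
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Similarity of anchor i to every candidate in the batch.
        # Dot product is an assumption made for this sketch.
        logits = [sum(x * y for x, y in zip(anchor, cand)) for cand in positives]
        # Numerically stable log-sum-exp; the correct "class" is index i.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_norm - logits[i]
    return total / len(anchors)


# Toy 2-D embeddings standing in for the 512-d encoder outputs.
aligned = [[5.0, 0.0], [0.0, 5.0]]
shuffled = [[0.0, 5.0], [5.0, 0.0]]

loss_matched = contrastive_loss(aligned, aligned)      # pairs line up
loss_mismatched = contrastive_loss(aligned, shuffled)  # pairs are swapped
```

The loss is close to zero when each anchor is most similar to its own positive and grows large when the pairing is wrong, which is the signal that pushes the encoder to place segments of the same clip near each other.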

#deep-learning #machine-learning #audio #pytorch #unsupervised-learning

towardsdatascience.com
