Digital signatures are on the rise. Since many of us are now working from home, a lot of confidential company emails need to be signed online.

Ian Goodfellow’s invention of Generative Adversarial Networks (GANs) showed how easy it is nowadays to generate fake handwritten digits that resemble those in the MNIST dataset. It is just a tiny step from there to generating imitated signatures in the handwriting style of any person. But isn’t that dangerous?

Can we use Machine Learning to distinguish between an original and an artificially crafted signature? Indeed we can! We don’t even necessarily need one of those fancy neural network approaches. We can go totally classic with Hidden Markov Models (HMMs). In this post I will show how we can use HMMs to classify whether a signature is original or imitated.

This project is loosely inspired by the paper “HMM-Based On-Line Signature Verification” by Julian Fierrez et al., published in 2007.


Even though Hidden Markov Models are no longer state of the art, as the 2007 publication date of the paper above already suggests, they are still a fundamental concept that every data scientist should at least have heard of. Understanding how an HMM works can be enlightening when you want to understand more recent technologies such as Recurrent Neural Networks, because many later techniques have evolved out of the HMM’s basic idea.

Hidden Markov Models

Hidden Markov Models can capture time dependency: the state at each time step depends on the state at the previous one. In Figure 1 below, we can see that from each hidden state (Rainy, Sunny) we can transition to Rainy or Sunny, back and forth, and that each state has a certain probability of emitting one of the three possible observations (Walk, Shop, Clean) at every time step. The start probabilities always need to be provided as well (60% for Rainy and 40% for Sunny) to start the computational chain.
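
To make the weather example concrete, here is a minimal sketch in plain Python of how such a model assigns a probability to a sequence of observations, using the forward algorithm. Only the start probabilities (60% Rainy, 40% Sunny) are taken from the text. The transition and emission values, as well as the names `start_prob`, `trans_prob`, `emit_prob` and `forward`, are assumptions made up for illustration and may not match the figure.

```python
# Toy weather HMM. Only the start probabilities come from the text;
# the transition and emission probabilities are ASSUMED for illustration.
states = ["Rainy", "Sunny"]

start_prob = {"Rainy": 0.6, "Sunny": 0.4}

# Assumed: probability of moving from one hidden state to the next.
trans_prob = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}

# Assumed: probability of each observation given the hidden state.
emit_prob = {
    "Rainy": {"Walk": 0.1, "Shop": 0.4, "Clean": 0.5},
    "Sunny": {"Walk": 0.6, "Shop": 0.3, "Clean": 0.1},
}


def forward(observations):
    """Return the probability of the observation sequence under the model."""
    # Initialisation: start probability times emission of the first observation.
    alpha = {s: start_prob[s] * emit_prob[s][observations[0]] for s in states}

    # Recursion: sum over all previous states, then emit the next observation.
    for obs in observations[1:]:
        alpha = {
            s: emit_prob[s][obs] * sum(alpha[p] * trans_prob[p][s] for p in states)
            for s in states
        }

    # Termination: sum over all states the chain could end in.
    return sum(alpha.values())


likelihood = forward(["Walk", "Shop", "Clean"])
print(f"P(Walk, Shop, Clean) = {likelihood:.5f}")
```

The same kind of score is what makes HMMs usable for verification: a sequence that fits the model well gets a comparatively high likelihood, and one that does not gets a low one.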


Biometric Signal Verification of Handwriting With HMM’s