The JIMP npm package provides us with methods to compare image files for the purposes of identifying inadvertent duplication or deliberate plagiarism. In this article I will demonstrate how to use them, and along the way we will find out just how similar two images must be to be considered the same.
JIMP is the JavaScript Image Manipulation Program, and you can read the full documentation on its npm page.
If you just want to install it for this project then run:
npm install --save jimp
I will be using three methods for comparing images:
hash
: this returns a 64-bit perceptual hash of an image. Unlike the cryptographic hashes you might be familiar with, perceptual hashes vary in a way roughly proportional to the differences in input, so the hashes of similar images will also be similar.
distance
: the Hamming distance between the hashes of two images, i.e. the number of bits which differ, which JIMP normalises to a value between 0 and 1.
diff
: the percentage difference between two images, expressed as a fraction between 0 and 1.
The JIMP documentation linked to above recommends using both distance and diff to compare images. If either is less than 0.15 then the images can be considered to be the same. The documentation claims a 99% success rate with 1% false positives.
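To make the rule concrete, here is a minimal sketch in plain JavaScript of how a normalised Hamming distance and the 0.15 threshold fit together. It does not call JIMP at all: the hash strings are made-up 64-character binary strings standing in for perceptual hashes, the diff value of 0.4 is an invented figure, and hammingDistance and looksLikeSameImage are hypothetical helper names rather than part of the library.

```javascript
// Normalised Hamming distance between two equal-length binary strings:
// the number of positions that differ, divided by the string length.
function hammingDistance(hashA, hashB) {
  if (hashA.length !== hashB.length) {
    throw new Error('Hashes must be the same length');
  }
  let differing = 0;
  for (let i = 0; i < hashA.length; i++) {
    if (hashA[i] !== hashB[i]) differing++;
  }
  return differing / hashA.length;
}

// Hypothetical helper applying the rule quoted above: the images are
// treated as the same if EITHER measure falls below the threshold.
function looksLikeSameImage(distance, diffFraction, threshold = 0.15) {
  return distance < threshold || diffFraction < threshold;
}

// Made-up 64-bit hashes, built programmatically for illustration only.
const hashA = '10'.repeat(32);
const hashB = '01'.repeat(4) + '10'.repeat(28); // first 8 bits flipped
const hashC = '01'.repeat(8) + '10'.repeat(24); // first 16 bits flipped

const nearDistance = hammingDistance(hashA, hashB); // 8 / 64 = 0.125
const farDistance = hammingDistance(hashA, hashC);  // 16 / 64 = 0.25

// 0.4 is an invented pixel-difference fraction, well above the threshold.
console.log(nearDistance, looksLikeSameImage(nearDistance, 0.4));
console.log(farDistance, looksLikeSameImage(farDistance, 0.4));
```

Because the two measures are combined with "or", a pair of images only needs to pass one of the tests to count as a match, which is worth bearing in mind when interpreting the results.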
However, there were a few unanswered questions in my mind about this process: