Is zero closer to eight or to one? Is this a three or a five? These were the kinds of questions we were pondering a few weeks ago when we examined the results of an image classification application.

Yes, indeed: from an image recognition point of view, rather than in a strictly mathematical sense, a zero is closer to an eight than to a one, and a two is closer to a five than to a three. In the last data science example we prepared, we trained a machine learning model to recognize images of handwritten digits. While checking the results, we realized how sloppy people’s handwriting is and how hard it sometimes is to distinguish an eight from a zero, a two from a five, a one from a seven, a zero from a three, and other, sometimes unexpected, pairs of similar digits.

For obvious reasons involving time and other obligations, we could not go through all the digit images and their misclassifications one by one. Nevertheless, it would have been interesting to have an overview of the most easily confused digits in the dataset. That is easier said than done. Each data row is an image described by 784 input features. Try to visualize that!
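
One rough way to get such an overview without opening every image is to count how often each true digit is mistaken for each other digit. The sketch below is an illustration only: the function name `most_confused_pairs` and the label lists are hypothetical, and the sample labels are made up rather than taken from our results.

```python
from collections import Counter


def most_confused_pairs(true_labels, predicted_labels, top_n=3):
    """Count misclassified (true, predicted) digit pairs, most frequent first."""
    errors = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    return errors.most_common(top_n)


# Made-up labels for illustration only, not results from the application.
true_labels = [0, 0, 0, 2, 2, 1, 8, 5, 7, 3]
predicted_labels = [8, 8, 0, 5, 2, 7, 8, 5, 7, 3]

for (true_digit, predicted_digit), count in most_confused_pairs(
    true_labels, predicted_labels
):
    print(f"{true_digit} read as {predicted_digit}: {count} time(s)")
```

A table like this tells us which pairs are confused, but not why, and that is where a picture of the data would help.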

We needed a massive reduction of the input dimensionality, from 784 to 2 or at most 3, so that visualizing the data in a 2D or 3D plot becomes possible while remaining informative. It would be great if we could see from a plot or chart where the biggest confusion lies and which digits show the widest variation in handwriting style.
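
To make the idea of going from 784 features down to 2 concrete, here is a minimal sketch using principal component analysis (PCA) computed with NumPy. PCA is chosen here only because it is one simple, linear reduction technique; it is not necessarily the method we ended up using. The helper name `pca_project`, the random stand-in data, and the 28 x 28 image shape are all assumptions made for illustration.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto its top principal components."""
    X = np.asarray(X, dtype=float)
    # Center each feature so the components capture variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


# Hypothetical stand-in data: 200 random "images" with 28 * 28 = 784 features.
rng = np.random.default_rng(0)
images = rng.random((200, 28 * 28))

points_2d = pca_project(images, n_components=2)
print(points_2d.shape)  # (200, 2)
```

Each row of `points_2d` can then be drawn as one point in a scatter plot and colored by its digit label, so that overlapping clusters hint at which digits are easy to confuse.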
