Prelude

It’s a foggy morning, you’ve forgotten your glasses inside but there’s no time to go back and get them. You head down to where your bike is locked, not noticing that some hooligan has surreptitiously replaced it with a tiger. After a quick trip to the hospital, you resolve never to confuse a bike for a tiger ever again. Luckily for you, with a little TensorFlow and a little PIL, you can teach your computer to tell the difference between bikes and tigers (or lions, sharks, really anything those hooligans might try to slip by you).

The technique we’ll be leveraging to accomplish this is neural networks. We’ll be scraping data from Google Images, specifically pictures of bikes and tigers, doing some processing on them with PIL, and using them to train a TensorFlow neural network.

Background

A neural network, as its name might suggest, is a technique for making computers learn from data, modeled on how we think the brain might learn from data. The classical use case for a neural network is teaching a computer to recognize hand-drawn digits. Though it might seem blindingly obvious to us, it’s not at all clear from the outset how we might teach a computer to recognize one pattern as a 3 and another as a 4. For a proper explanation of the mathematical intuition I’d recommend 3Blue1Brown’s great 4-part series on the topic (https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3p). Like most machine learning techniques, neural networks use lots of training data to try and “learn”. In the classical example we might feed a computer lots and lots of photos of hand-drawn digits. In our case we need to find lots of pictures of bikes and tigers and feed them to our model to work its magic.
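
To make the idea concrete, here is a minimal sketch of that classical digit example in TensorFlow’s Keras API. It isn’t the model we’ll build for bikes and tigers, just an illustration of what “feeding a network lots of labeled examples” looks like in code:

import tensorflow as tf

# Load the classic hand-drawn digit dataset: 28x28 grayscale images, labels 0-9
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

# A small fully connected network: flatten the pixels, one hidden layer, 10 output classes
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# "Learning" here just means fitting the weights to the training data
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)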

Web scraping

First things first, we need some data. We’ll be using a program to automatically pull and download around 170 pictures each of bikes and tigers from Google Images. The code for this extraction is courtesy of “Image Scraping with Python”, a code-along guide to downloading images from Google with Python, on towardsdatascience.com.

from selenium import webdriver
import requests
import os
import io
import hashlib
from bs4 import BeautifulSoup
from PIL import Image
import time

### Image scraping with Python code goes here ###

search_and_download(search_term="tiger", driver_path=DRIVER_PATH, number_images=170)
# Gets 170 images of tigers from Google Images and saves them to a folder called
# tiger, inside a folder called image, in our working directory

search_and_download(search_term="bike", driver_path=DRIVER_PATH, number_images=170)
# Same for bikes

To get our data we pass a search term and a number of images to the download function. After it runs we should have a folder named tiger containing 170 tiger JPEGs and a folder named bike containing 170 bike JPEGs, randomly named and formatted. Note that how many images we can scrape before the algorithm gets stuck depends on the search term; for these terms we were able to extract around 170. Playing with the “sleep between interactions” argument in the reference code can help increase this number. Google doesn’t like it when we extract JPEGs too quickly, so deliberately slowing down our requests can help us get more data without triggering any alerts. Now that we have raw image data, our next step is to process these images with the PIL module to get them into a usable form.
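
As a quick sanity check after the scrape, a few lines like the following count what actually landed on disk. This assumes the image/tiger and image/bike folder layout described in the comments above; adjust the paths if your layout differs:

import os

# The scraper above saves files under image/<search_term>/ in the working directory
for term in ("tiger", "bike"):
    folder = os.path.join("image", term)
    files = [f for f in os.listdir(folder) if f.lower().endswith((".jpg", ".jpeg"))]
    print(term, "->", len(files), "images downloaded")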

PIL image processing

Our objective here is to convert our images into usable bundles of data. For each image, our neural network model wants as input a feature array and a label array. The feature array is essentially an array of numbers, one per pixel of our data. For color data this means three numbers per pixel, corresponding to the RGB value; for grayscale, which we’ll be using, it takes only one number, the brightness. The label array is just a one-dimensional array with each image’s label, a number that corresponds to that image’s category. In our case, tiger images get label 1 and bike images label 0. Before we can build these arrays we have to convert all our images to a consistent, processable form. We start by making them all grayscale. To convert a JPEG image to grayscale we run:

from PIL import Image

img = Image.open(jpg)   # jpg is the path to one of our image files
img = img.convert('L')  # 'L' mode means single-channel grayscale

Then, we convert all images to some common shape and resolution, since the model expects the same amount of data, i.e. the same number of pixels, from each image. Here we can also adjust how much data our computer has to handle: a 1000x1000 image contains 1 million pixels, and that’s 1 million pieces of data for our model to transform and process per image. We adjust the resolution to match our computational capacity; playing around with the numbers is generally the easiest approach here. We can use PIL’s crop and resize functions, along with the size attribute, to get all this done. First, we crop each image into a square and then resize it, as sketched below.
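
Here is a minimal sketch of how that cropping and resizing might look, put together with the feature and label arrays described above. It assumes the image/tiger and image/bike folders from the scraping step; the helper name process_image and the 100x100 resolution are just illustrative choices:

import os
import numpy as np
from PIL import Image

IMG_SIZE = 100  # pick a resolution your machine can handle

def process_image(path, size=IMG_SIZE):
    # Grayscale, centre-crop to a square, then shrink to size x size pixels
    img = Image.open(path).convert('L')
    width, height = img.size                                 # PIL's size attribute
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    img = img.crop((left, top, left + side, top + side))     # square crop
    img = img.resize((size, size))
    return np.asarray(img)                                    # one image's feature array

# Build the feature and label arrays: tiger -> 1, bike -> 0
features, labels = [], []
for term, label in (("tiger", 1), ("bike", 0)):
    folder = os.path.join("image", term)                      # adjust if your folders differ
    for name in os.listdir(folder):
        features.append(process_image(os.path.join(folder, name)))
        labels.append(label)

features = np.array(features)   # shape: (n_images, IMG_SIZE, IMG_SIZE)
labels = np.array(labels)       # shape: (n_images,)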
