Learn how to build a neural network and how to train, evaluate and optimize it with TensorFlow

**Deep learning** is a subfield of machine learning: a set of algorithms inspired by the structure and function of the brain.

TensorFlow is the second *machine learning framework* that **Google** created and used to design, build, and train deep learning models. You can use the *TensorFlow library* to do numerical computations, which in itself doesn’t seem all too special, but these computations are done with data flow graphs. In these graphs, nodes represent mathematical operations, while the edges represent the data, usually multidimensional data arrays or tensors, that is communicated between these nodes.

You see? The name “TensorFlow” is derived from the operations which neural networks perform on multidimensional data arrays or tensors! It’s literally a flow of tensors. For now, this is all you need to know about tensors, but you’ll go deeper into this in the next sections!

Today’s TensorFlow tutorial for beginners will introduce you to performing deep learning in an interactive way:

- You’ll first learn more about tensors;
- Then, you’ll briefly go over some of the ways that you can install TensorFlow on your system so that you’re able to get started and load data in your workspace;
- After this, you’ll go over some of the TensorFlow basics: you’ll see how you can easily get started with simple computations;
- Then you get started on the real work: you’ll load in data on Belgian traffic signs and explore it with simple statistics and plotting;
- In your exploration, you’ll see that there is a need to manipulate your data in such a way that you can feed it to your model. That’s why you’ll take the time to rescale your images and convert them to grayscale;
- Next, you can finally get started on your neural network model! You’ll build up your model layer by layer;
- Once the architecture is set up, you can use it to train your model interactively and to eventually also evaluate it by feeding some test data to it;
- Lastly, you’ll get some pointers for further improvements that you can make to the model you just constructed and how you can continue your learning with TensorFlow.

To understand tensors well, it’s good to have some working knowledge of linear algebra and vector calculus. You already read in the introduction that tensors are implemented in **TensorFlow** as multidimensional data arrays, but some more introduction is maybe needed in order to completely grasp tensors and their use in machine learning.

Before you go into plane vectors, it’s a good idea to briefly review the concept of “vectors”. Vectors are special types of matrices, which are rectangular arrays of numbers. Because vectors are ordered collections of numbers, they are often seen as column matrices: they have just one column and a certain number of rows. In other terms, you could also consider vectors as scalar magnitudes that have been given a direction.

**Remember**: an example of a scalar is “5 meters” or “60 m/sec”, while a vector is, for example, “5 meters north” or “60 m/sec East”. The difference between these two is obviously that the vector has a direction. Nevertheless, these examples that you have seen up until now might seem far off from the vectors that you might encounter when you’re working with machine learning problems. This is normal. The length of a mathematical vector is a pure number: it is absolute. The direction, on the other hand, is relative: it is measured relative to some reference direction and has units of radians or degrees. You usually measure the direction as a positive angle, counterclockwise from the reference direction.

Visually, of course, you represent vectors as arrows, as you can see in the picture above. This means that you can consider vectors also as arrows that have direction and length. The direction is indicated by the arrow’s head, while the length is indicated by the length of the arrow.
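To make the idea of length and direction concrete, here is a minimal sketch in plain Python. It assumes a two-dimensional vector and takes the positive x-axis as the reference direction; the helper name `magnitude_and_direction()` is made up for this illustration.

```python
import math

def magnitude_and_direction(x, y):
    """Return the length of the vector (x, y) and its direction in degrees,
    measured counterclockwise from the positive x-axis (assumed reference)."""
    length = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x)) % 360
    return length, angle

# A vector pointing 3 units along x and 4 units along y
length, angle = magnitude_and_direction(3.0, 4.0)
print(length, angle)
```

The length is a plain number, while the angle only has meaning once you have agreed on the reference direction.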

So what about plane vectors then?

Plane vectors are the most straightforward setup of tensors. They are much like regular vectors as you have seen above, with the sole difference that they find themselves in a vector space. To understand this better, let’s start with an example: you have a vector that is *2 X 1*. This means that the vector belongs to the set of real numbers that come paired two at a time. Or, stated differently, they are part of two-space. In such cases, you can represent vectors on the coordinate *(x,y)* plane with arrows or rays.

Working from this coordinate plane in a standard position where vectors have their starting point at the origin *(0,0)*, you can derive the *x* coordinate by looking at the first row of the vector, while you’ll find the *y* coordinate in the second row. Of course, this standard position doesn’t always need to be maintained: vectors can move parallel to themselves in the plane without experiencing changes.

**Note** that similarly, for vectors that are of size *3 X 1*, you talk about the three-space. You can represent the vector as a three-dimensional figure with arrows pointing to positions in the vector space: they are drawn on the standard *x*, *y* and *z* axes.

It’s nice to have these vectors and to represent them on the coordinate plane, but in essence, you have these vectors so that you can perform operations on them. One thing that can help you in doing this is expressing your vectors in terms of basis or unit vectors.

Unit vectors are vectors with a magnitude of one. You’ll often recognize the unit vector by a lowercase letter with a circumflex, or “hat”. Unit vectors come in handy if you want to express a 2-D or 3-D vector as a sum of two or three orthogonal components, along the x- and y-axes and, in three dimensions, the z-axis.

And when you are talking about expressing one vector, for example, as sums of components, you’ll see that you’re talking about component vectors, which are two or more vectors whose sum is that given vector.
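As a small illustration of component vectors, the sketch below splits a 2-D vector into its parts along the unit vectors of the x- and y-axes and then adds them back together. The helper names (`scale()`, `add()`, `normalize()`) and the example values are assumptions made for this example only.

```python
import math

def scale(vector, factor):
    """Multiply every coordinate of `vector` by `factor`."""
    return tuple(factor * c for c in vector)

def add(u, v):
    """Add two vectors coordinate by coordinate."""
    return tuple(a + b for a, b in zip(u, v))

def normalize(vector):
    """Return the unit vector pointing in the same direction as `vector`."""
    length = math.hypot(*vector)
    return tuple(c / length for c in vector)

# Unit vectors along the x- and y-axes
i_hat = (1.0, 0.0)
j_hat = (0.0, 1.0)

v = (3.0, 4.0)
x_component = scale(i_hat, v[0])   # part of v along the x-axis
y_component = scale(j_hat, v[1])   # part of v along the y-axis
rebuilt = add(x_component, y_component)
v_hat = normalize(v)               # unit vector in the direction of v
print(x_component, y_component, rebuilt, v_hat)
```

The two component vectors sum to the original vector, which is exactly what makes them component vectors.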

**Tip**: watch this video, which explains what tensors are with the help of simple household objects!

Plane vectors, covectors, and linear operators have one thing in common: all three are specific cases of tensors. Remember how a vector was characterized in the previous section as a scalar magnitude that has been given a direction. A tensor, then, is the mathematical representation of a physical entity that may be characterized by magnitude and *multiple* directions.

And, just like you represent a scalar with a single number and a vector with a sequence of three numbers in a 3-dimensional space, for example, a tensor can be represented by an array of 3^R numbers in a 3-dimensional space.

The “R” in this notation represents the rank of the tensor: this means that in a 3-dimensional space, a second-rank tensor can be represented by 3 to the power of 2 or 9 numbers. In an N-dimensional space, scalars will still require only one number, while vectors will require N numbers, and tensors will require N^R numbers. This explains why you often hear that scalars are tensors of rank 0: since they have no direction, you can represent them with one number.
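The counting rule above is easy to check in code. This is a minimal sketch in plain Python in which nested lists stand in for tensors; the helpers `component_count()` and `rank_of()` are hypothetical names, and `rank_of()` assumes regular (non-ragged) nesting.

```python
def component_count(dimensions, rank):
    """Number of values needed for a tensor of the given rank
    in a space with the given number of dimensions (N ** R)."""
    return dimensions ** rank

def rank_of(nested):
    """Nesting depth of a regular nested list; a bare number has rank 0."""
    rank = 0
    while isinstance(nested, list):
        rank += 1
        nested = nested[0]
    return rank

scalar = 5
vector = [1, 2, 3]
matrix = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

for name, value in (("scalar", scalar), ("vector", vector), ("matrix", matrix)):
    r = rank_of(value)
    print(name, "rank", r, "needs", component_count(3, r), "numbers")
```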

With this in mind, it’s relatively easy to recognize scalars, vectors, and tensors and to set them apart: scalars can be represented by a single number, vectors by an ordered set of numbers, and tensors by an array of numbers.

What makes tensors so unique is the combination of components and basis vectors: basis vectors transform one way between reference frames and the components transform in just such a way as to keep the combination between components and basis vectors the same.

Installing TensorFlow

Now that you know more about TensorFlow, it’s time to get started and install the library. Here, it’s good to know that TensorFlow provides APIs for Python, C++, Haskell, Java, Go, Rust, and there’s also a third-party package for R called `tensorflow`.

In this tutorial, you will download a version of **TensorFlow** that will enable you to write the code for your **deep learning project** in **Python**. On the TensorFlow installation webpage, you’ll see some of the most common ways and latest instructions to install **TensorFlow** using `virtualenv`, `pip`, and Docker, as well as some other ways of installing **TensorFlow** on your personal computer.

**Note**: You can also install **TensorFlow** with Conda if you’re working on Windows. However, since this installation method is community supported, it’s best to check the official installation instructions.

Now that you have gone through the installation process, it’s time to double check that you have installed **TensorFlow** correctly by importing it into your workspace under the alias `tf`:

```
import tensorflow as tf
```

**Note** that the alias that you used in the line of code above is a convention: it keeps your code consistent with other developers who use **TensorFlow** in data science projects on the one hand, and with open-source **TensorFlow** projects on the other hand.

You’ll generally write TensorFlow programs, which you run as a chunk. This is at first sight kind of contradictory when you’re working with Python. However, if you would like, you can also use TensorFlow’s Interactive Session, which you can use to work more interactively with the library. This is especially handy when you’re used to working with *IPython*.

For this tutorial, you’ll focus on the second option: this will help you to get kickstarted with deep learning in *TensorFlow*. But before you go any further into this, let’s first try out some minor stuff before you start with the heavy lifting.

First, import the `tensorflow` library under the alias `tf`, as you have seen in the previous section. Then initialize two variables that are actually constants. Pass an array of four numbers to the `constant()` function.

**Note** that you could potentially also pass in an integer, but that more often than not, you’ll find yourself working with arrays. As you saw in the introduction, tensors are all about arrays! So make sure that you pass in an array :) Next, you can use `multiply()` to multiply your two variables. Store the result in the `result` variable. Lastly, print out the `result` with the help of the `print()` function.

`script.py`

```
# Import `tensorflow`
import tensorflow as tf

# Initialize two constants
x1 = tf.constant([1,2,3,4])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)

# Print the result
print(result)
```

The result of the lines of code is an abstract tensor in the computation graph. However, contrary to what you might expect, the `result` doesn’t actually get calculated. The code just defined the model, but no process ran to calculate the result. You can see this in the print-out: it doesn’t show the result that you want to see (namely, the element-wise product `[ 5 12 21 32]`). This means that TensorFlow has a lazy evaluation!
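If lazy evaluation feels abstract, the toy sketch below mimics the idea in plain Python. It is not how TensorFlow is implemented; the `Constant` and `Multiply` classes are invented for this illustration. Building the objects only describes the computation, and nothing is multiplied until `evaluate()` is called.

```python
class Constant:
    """A graph node that simply holds a list of numbers."""
    def __init__(self, values):
        self.values = list(values)

    def evaluate(self):
        return self.values


class Multiply:
    """A graph node that describes an element-wise multiplication."""
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self):
        # The actual work only happens here, when a value is requested
        return [a * b for a, b in zip(self.left.evaluate(), self.right.evaluate())]


x1 = Constant([1, 2, 3, 4])
x2 = Constant([5, 6, 7, 8])

result = Multiply(x1, x2)    # only a description of the computation
print(result)                # prints an object, not numbers

output = result.evaluate()   # now the multiplication runs
print(output)
```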

However, if you do want to see the result, you have to run this code in an interactive session. You can do this in a few ways, as is demonstrated in the code chunks below:

`script.py`

```
# Import `tensorflow`
import tensorflow as tf

# Initialize two constants
x1 = tf.constant([1,2,3,4])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)

# Initialize the Session
sess = tf.Session()

# Print the result
print(sess.run(result))

# Close the session
sess.close()
```

**Note** that you can also use the following lines of code to start up an interactive Session, run the `result` and close the Session automatically again after printing the `output`:

`script.py`

```
# Import `tensorflow`
import tensorflow as tf

# Initialize two constants
x1 = tf.constant([1,2,3,4])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)

# Initialize Session and run `result`
with tf.Session() as sess:
    output = sess.run(result)
    print(output)
```

In the code chunks above you have just defined a default Session, but it’s also good to know that you can pass in options as well. You can, for example, specify the `config` argument and then use the `ConfigProto` protocol buffer to add configuration options for your session.

For example, if you add `config=tf.ConfigProto(log_device_placement=True)` to your Session, you make sure that you log the GPU or CPU device that is assigned to an operation. You will then get information about which devices are used in the session for each operation. You could also use the following configuration, for example, when you use soft constraints for the device placement: `config=tf.ConfigProto(allow_soft_placement=True)`.

Now that you’ve got TensorFlow installed and imported into your workspace and you’ve gone through the basics of working with this package, it’s time to leave this aside for a moment and turn your attention to your data. Just like always, you’ll first take your time to explore and understand your data better before you start modeling your neural network.

Belgian Traffic Signs: Background

Even though traffic is a topic that you’re probably familiar with, it doesn’t hurt to go briefly over the observations that are included in this dataset to see if you understand everything before you start. In essence, in this section, you’ll get up to speed with the domain knowledge that you need to have to go further with this tutorial.

Of course, because I’m Belgian, I’ll make sure you’ll also get some anecdotes :)

- Belgian traffic signs are usually in Dutch and French. This is good to know, but for the dataset that you’ll be working with, it’s not too important!
- There are six categories of traffic signs in Belgium: warning signs, priority signs, prohibitory signs, mandatory signs, signs related to parking and standing still on the road and, lastly, designatory signs.
- On January 1st, 2017, more than 30,000 traffic signs were removed from Belgian roads. These were all prohibitory signs relating to speed.
- Talking about removal, the overwhelming presence of traffic signs has been an ongoing discussion in Belgium (and by extension, the entire European Union).

Now that you have gathered some more background information, it’s time to download the dataset here. You should get the two zip files listed next to "BelgiumTS for Classification (cropped images)", which are called "BelgiumTSC_Training" and "BelgiumTSC_Testing".

**Tip**: if you have downloaded the files or will do so after completing this tutorial, take a look at the folder structure of the data that you’ve downloaded! You’ll see that the testing, as well as the training data folders, contain 62 subfolders, which are the 62 types of traffic signs that you’ll use for classification in this tutorial. Additionally, you’ll find that the files have the file extension `.ppm` or Portable Pixmap Format. You have downloaded images of the traffic signs!
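If you would rather inspect the folder structure programmatically, a minimal sketch could look like the one below. The helper `summarize_dataset()` is a made-up name, and the demo runs on a throwaway directory with invented subfolder and file names so that it works without the real download; point it at your own training or testing folder instead.

```python
import os
import tempfile

def summarize_dataset(data_directory):
    """Map each subfolder name to the number of `.ppm` files it contains."""
    summary = {}
    for name in sorted(os.listdir(data_directory)):
        path = os.path.join(data_directory, name)
        if os.path.isdir(path):
            summary[name] = sum(1 for f in os.listdir(path) if f.endswith(".ppm"))
    return summary

# Demo on a temporary directory that imitates the assumed layout:
# one subfolder per label, each holding `.ppm` images
with tempfile.TemporaryDirectory() as root:
    for label, n_files in (("00000", 2), ("00001", 1)):
        os.makedirs(os.path.join(root, label))
        for k in range(n_files):
            open(os.path.join(root, label, "img_{}.ppm".format(k)), "wb").close()
    demo_summary = summarize_dataset(root)

print(len(demo_summary), "subfolders:", demo_summary)
```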

Let’s get started with importing the data into your workspace. Let’s start with the lines of code that appear below the User-Defined Function (UDF) `load_data()`:

- First, set your `ROOT_PATH`. This path is the one where you have made the directory with your training and test data.
- Next, you can add the specific paths to your `ROOT_PATH` with the help of the `join()` function. You store these two specific paths in `train_data_directory` and `test_data_directory`.
- After that, you call the `load_data()` function and pass in the `train_data_directory`.
- The `load_data()` function itself starts off by gathering all the subdirectories that are present in the `train_data_directory`. It does so with the help of a list comprehension, which is quite a natural way of constructing lists: it basically says that, if you find something in the `train_data_directory`, you’ll double check whether this is a directory, and if it is one, you’ll add it to your list. **Remember** that each subdirectory represents a label.
- Next, you have to loop through the subdirectories. You first initialize two lists, `labels` and `images`. Next, you gather the paths of the subdirectories and the file names of the images that are stored in these subdirectories. After that, you collect the data in the two lists with the help of the `append()` function.

```
import os
import skimage.io

def load_data(data_directory):
    # Every subdirectory of `data_directory` holds the images of one label
    directories = [d for d in os.listdir(data_directory)
                   if os.path.isdir(os.path.join(data_directory, d))]
    labels = []
    images = []
    for d in directories:
        label_directory = os.path.join(data_directory, d)
        file_names = [os.path.join(label_directory, f)
                      for f in os.listdir(label_directory)
                      if f.endswith(".ppm")]
        for f in file_names:
            # `skimage.io.imread` stands in for `skimage.data.imread`,
            # which newer scikit-image releases no longer provide
            images.append(skimage.io.imread(f))
            labels.append(int(d))
    return images, labels

ROOT_PATH = "/your/root/path"
train_data_directory = os.path.join(ROOT_PATH, "TrafficSigns/Training")
test_data_directory = os.path.join(ROOT_PATH, "TrafficSigns/Testing")
images, labels = load_data(train_data_directory)
```

**Note** that in the above code chunk, the training and test data are located in folders named "Training" and "Testing", which are both subdirectories of another directory "TrafficSigns". On a local machine, this could look something like "/Users/Name/Downloads/TrafficSigns", with then two subfolders called "Training" and "Testing".

With your data loaded in, it’s time for some data inspection! You can start with a pretty simple analysis with the help of the `ndim` and `size` attributes of the `images` array.

**Note** that the `images` and `labels` variables are lists, so you need to use `np.array()` to convert them to arrays in your own workspace first:

`script.py`

```
# Print the `images` dimensions
print(images.ndim)

# Print the number of `images`'s elements
print(images.size)

# Print the first instance of `images`
images[0]
```

**Note** that the `images[0]` that you printed out is, in fact, one single image that is represented by arrays in arrays! This might seem counterintuitive at first, but it’s something that you’ll get used to as you go further into working with images in machine learning or deep learning applications.

Next, you can also take a small look at the `labels`, but you shouldn’t see too many surprises at this point:

`script.py`

```
# Print the `labels` dimensions
print(labels.ndim)

# Print the number of `labels`'s elements
print(labels.size)

# Count the number of labels
print(len(set(labels)))
```

These numbers already give you some insights into how successful your import was and the exact size of your data. At first sight, everything has been executed the way you expected it to, and you see that the size of the array is considerable if you take into account that you’re dealing with arrays within arrays.

**Tip**: try the `flags`, `itemsize`, and `nbytes` attributes on your arrays to get more information about the memory layout, the length of one array element in bytes, and the total number of bytes consumed by the array’s elements. You can test this out in your IPython console!

Next, you can also take a look at the distribution of the traffic signs:

`script.py`

```
# Import the `pyplot` module
import matplotlib.pyplot as plt

# Make a histogram with 62 bins of the `labels` data
plt.hist(labels, 62)

# Show the plot
plt.show()
```

Awesome job! Now let’s take a closer look at the histogram that you made!

You clearly see that not all types of traffic signs are equally represented in the dataset. This is something that you’ll deal with later when you’re manipulating the data before you start modeling your neural network.

At first sight, you see that there are labels that are more heavily present in the dataset than others: the labels 22, 32, 38, and 61 definitely jump out. At this point, it’s nice to keep this in mind, but you’ll definitely go further into this in the next section!
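If you prefer exact counts over reading bar heights off a histogram, the standard library’s `collections.Counter` can tally the labels for you. The sketch below uses a short, made-up list of labels; in your own workspace you would pass in the `labels` list that you loaded earlier.

```python
from collections import Counter

def label_distribution(labels):
    """Count how often each label occurs."""
    return Counter(labels)

# Made-up sample; replace it with your own `labels` list
sample_labels = [22, 22, 32, 61, 22, 32, 0]

counts = label_distribution(sample_labels)
top_two = counts.most_common(2)   # the two most frequent labels with their counts
print(top_two)
```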

The previous, small analyses or checks have already given you some idea of the data that you’re working with, but when your data mostly consists of images, the natural next step in exploring it is to visualize it.

Let’s check out some random traffic signs:

- First, make sure that you import the `pyplot` module of the `matplotlib` package under the common alias `plt`.
- Then, you’re going to make a list with 4 random numbers. These will be used to select traffic signs from the `images` array that you have just inspected in the previous section. In this case, you go for `300`, `2250`, `3650` and `4000`.
- Next, you’ll say that for every index in that list, so from 0 to 3, you’re going to create subplots without axes (so that they don’t go running with all the attention and your focus is solely on the images!). In these subplots, you’re going to show a specific image from the `images` array that is in accordance with the number at the index `i`. In the first loop, you’ll pass `300` to `images[]`, in the second round `2250`, and so on. Lastly, you’ll adjust the subplots so that there’s enough width in between them.
- The last thing that remains is to show your plot with the help of the `show()` function!

There you go:

```
# Import the `pyplot` module of `matplotlib`
import matplotlib.pyplot as plt

# Determine the (random) indexes of the images that you want to see
traffic_signs = [300, 2250, 3650, 4000]

# Fill out the subplots with the random images that you defined
for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i+1)
    plt.axis('off')
    plt.imshow(images[traffic_signs[i]])
    plt.subplots_adjust(wspace=0.5)

plt.show()
```

As you guessed by the 62 labels that are included in this dataset, the signs are different from each other.

But what else do you notice? Take another close look at the images below:

These four images are not of the same size!

You can obviously toy around with the numbers that are contained in the `traffic_signs` list and follow up more thoroughly on this observation, but be that as it may, this is an important observation which you will need to take into account when you start working more towards manipulating your data so that you can feed it to the neural network.

Let’s confirm the hypothesis of the differing sizes by printing the shape, the minimum and maximum values of the specific images that you have included into the subplots.

The code below heavily resembles the one that you used to create the above plot, but differs in the fact that here, you’ll alternate sizes and images instead of plotting just the images next to each other:

```
# Import `matplotlib`
import matplotlib.pyplot as plt

# Determine the (random) indexes of the images
traffic_signs = [300, 2250, 3650, 4000]

# Fill out the subplots with the random images and add shape, min and max values
for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i+1)
    plt.axis('off')
    plt.imshow(images[traffic_signs[i]])
    plt.subplots_adjust(wspace=0.5)
    plt.show()
    print("shape: {0}, min: {1}, max: {2}".format(images[traffic_signs[i]].shape,
                                                  images[traffic_signs[i]].min(),
                                                  images[traffic_signs[i]].max()))
```

**Note** how you use the `format()` method on the string `"shape: {0}, min: {1}, max: {2}"` to fill out the arguments `{0}`, `{1}`, and `{2}` that you defined.
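As a quick stand-alone illustration of how the numbered placeholders work, the sketch below fills the same template with invented values. The numbers in the curly braces refer to the positions of the arguments that you pass to `format()`, so you can also use them in a different order.

```python
template = "shape: {0}, min: {1}, max: {2}"

# Invented values, standing in for one image's shape, minimum and maximum
message = template.format((28, 28, 3), 0, 255)
print(message)

# The indexes pick arguments by position, whatever order they appear in
reordered = "{1} comes before {0}".format("second", "first")
print(reordered)
```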

Now that you have seen individual images, you might also want to revisit the histogram that you printed out in the first steps of your data exploration. You can easily do this by plotting an overview of all the 62 classes and one image that belongs to each class:

```
# Import the `pyplot` module as `plt`
import matplotlib.pyplot as plt

# Get the unique labels
unique_labels = set(labels)

# Initialize the figure
plt.figure(figsize=(15, 15))

# Set a counter
i = 1

# For each unique label,
for label in unique_labels:
    # You pick the first image for each label
    image = images[labels.index(label)]
    # Define 64 subplots
    plt.subplot(8, 8, i)
    # Don't include axes
    plt.axis('off')
    # Add a title to each subplot
    plt.title("Label {0} ({1})".format(label, labels.count(label)))
    # Add 1 to the counter
    i += 1
    # And you plot this first image
    plt.imshow(image)

# Show the plot
plt.show()
```

**Note** that even though you define 64 subplots, not all of them will show images (as there are only 62 labels!). Note also that again, you don’t include any axes to make sure that the readers’ attention doesn’t drift away from the main topic: the traffic signs!
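The 8 by 8 grid is simply the smallest square that fits all 62 labels. If you ever work with a different number of classes, a small helper like the hypothetical `grid_shape()` below can work out the size for you; it is only a sketch and assumes that you want a square grid.

```python
import math

def grid_shape(n_plots):
    """Return (rows, columns) of the smallest square grid that fits `n_plots`."""
    side = math.ceil(math.sqrt(n_plots))
    return side, side

rows, cols = grid_shape(62)
unused = rows * cols - 62   # subplots that stay empty
print(rows, cols, unused)
```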

As you mostly guessed in the histogram above, there are considerably more traffic signs with labels 22, 32, 38, and 61. This hypothesis is now confirmed in this plot: you see that there are 375 instances with label 22, 316 instances with label 32, 285 instances with label 38 and, lastly, 282 instances with label 61.

One of the most interesting questions that you could ask yourself now is whether there’s a connection between all of these instances - maybe all of them are designatory signs?

Let’s take a closer look: you see that labels 22 and 32 are prohibitory signs, but that labels 38 and 61 are a designatory and a priority sign, respectively. This means that there’s not an immediate connection between these four, except for the fact that half of the signs that have a substantial presence in the dataset are of the prohibitory kind.

Feature Extraction

Now that you have thoroughly explored your data, it’s time to get your hands dirty! Let’s recap briefly what you discovered to make sure that you don’t forget any steps in the manipulation:

- The size of the images was unequal;
- There are 62 labels or target values (as your labels start at 0 and end at 61);
- The distribution of the traffic sign values is pretty unequal, and there wasn’t really any connection between the signs that were heavily present in the dataset.

Now that you have a clear idea of what you need to improve, you can start with manipulating your data in such a way that it’s ready to be fed to the neural network or whichever model you want to feed it to. Let’s start first with extracting some features: you’ll rescale the images, and you’ll convert the images that are held in the `images` array to grayscale. You’ll do this color conversion mainly because the color matters less in classification questions like the one you’re trying to answer now. For detection, however, the color does play a big part! So in those cases, you wouldn’t do that conversion!

To tackle the differing image sizes, you’re going to rescale the images. You can easily do this with the help of the `skimage` or Scikit-Image library, which is a collection of algorithms for image processing.

In this case, the `transform` module will come in handy, as it offers you a `resize()` function. You’ll see that you make use of list comprehension (again!) to resize each image to 28 by 28 pixels. Once again, you see how the list is formed: for every image that you find in the `images` array, you’ll perform the transformation operation that you borrow from the `skimage` library. Finally, you store the result in the `images28` variable:

```
# Import the `transform` module from `skimage`
from skimage import transform
# Rescale the images in the `images` array
images28 = [transform.resize(image, (28, 28)) for image in images]
```

This was fairly easy, wasn’t it?

**Note** that the `images28` data is now four-dimensional: if you convert `images28` to an array and access its `shape` attribute, you’ll see that the dimensions are `(4575, 28, 28, 3)`. Each image now has 28 × 28 = 784 pixels, with three color values per pixel.
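To see where these numbers come from, the sketch below builds one dummy 28 by 28 image out of nested Python lists and counts its values. The helpers `shape_of()` and `flatten()` are invented for this illustration and only mimic what the array attributes tell you; they assume regular, non-ragged nesting.

```python
def shape_of(nested):
    """Shape of a regular nested list, e.g. (28, 28, 3)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

def flatten(nested):
    """Flatten a nested list into one flat list of values."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat

# A dummy color image: 28 rows, 28 columns, 3 color values per pixel
color_image = [[[0.0, 0.0, 0.0] for _ in range(28)] for _ in range(28)]
# A dummy grayscale image: 28 rows, 28 columns, 1 value per pixel
gray_image = [[0.0 for _ in range(28)] for _ in range(28)]

print(shape_of(color_image), len(flatten(color_image)))
print(shape_of(gray_image), len(flatten(gray_image)))
```

The 784 values per image only appear once the color channels are gone, which is what the grayscale conversion later in this section takes care of.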

You can check the result of the rescaling operation by re-using the code that you used above to plot the 4 random images with the help of the `traffic_signs` variable. Just don’t forget to change all references to `images` to `images28`.

Check out the result here:

**Note** that because you rescaled, your `min` and `max` values have also changed. They now all seem to be in the same range, which is really great because then you don’t necessarily need to normalize your data!
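For comparison, this is roughly what a manual normalization step could look like if your pixel values were still 8-bit integers between 0 and 255. It is a plain-Python sketch under that assumption; `to_unit_range()` and the sample row are made up, and this is not the code that the resizing step runs internally.

```python
def to_unit_range(pixels, max_value=255):
    """Scale pixel values from the range 0..max_value to the range 0..1."""
    return [p / max_value for p in pixels]

# One made-up row of 8-bit pixel values
row = [0, 51, 255]
scaled = to_unit_range(row)
print(scaled, min(scaled), max(scaled))
```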

As said in the introduction to this section of the tutorial, the color in the pictures matters less when you’re trying to answer a classification question. That’s why you’ll also go through the trouble of converting the images to grayscale.

**Note**, however, that you can also test out on your own what would happen to the final results of your model if you don’t follow through with this specific step.

Just like with the rescaling, you can again count on the Scikit-Image library to help you out. In this case, it’s the `color` module with its `rgb2gray()` function that you need to use to get where you need to be.

That’s going to be nice and easy!

However, don’t forget to convert the `images28` variable back to an array, as the `rgb2gray()` function expects an array as an argument.

```
# Import `numpy` and `rgb2gray` from `skimage.color`
import numpy as np
from skimage.color import rgb2gray

# Convert `images28` to an array
images28 = np.array(images28)

# Convert `images28` to grayscale
images28 = rgb2gray(images28)
```
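Under the hood, a grayscale conversion is a weighted sum of the red, green, and blue values of each pixel. The plain-Python sketch below uses the weights given in the scikit-image documentation for `rgb2gray()` (0.2125, 0.7154, and 0.0721); treat it as an illustration of the idea rather than the library’s actual code. The function names and the tiny image are made up.

```python
# Luminance weights for red, green and blue
GRAY_WEIGHTS = (0.2125, 0.7154, 0.0721)

def rgb_pixel_to_gray(pixel):
    """Collapse one (red, green, blue) pixel into a single gray value."""
    return sum(w * c for w, c in zip(GRAY_WEIGHTS, pixel))

def rgb_image_to_gray(image):
    """Convert a nested list of RGB pixels into a nested list of gray values."""
    return [[rgb_pixel_to_gray(pixel) for pixel in row] for row in image]

# A made-up 2 by 2 image: red, green, blue and white
tiny_rgb = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
            [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]]

tiny_gray = rgb_image_to_gray(tiny_rgb)
print(tiny_gray)
```

Green contributes most to the gray value and blue the least, which reflects how bright each color appears to the human eye.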

Double check the result of your grayscale conversion by plotting some of the images. Here, you can again re-use and slightly adapt some of the code to show the adjusted images:

```
import matplotlib.pyplot as plt

traffic_signs = [300, 2250, 3650, 4000]

for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i+1)
    plt.axis('off')
    plt.imshow(images28[traffic_signs[i]], cmap="gray")
    plt.subplots_adjust(wspace=0.5)

# Show the plot
plt.show()
```

**Note** that you indeed have to specify the color map or `cmap` and set it to `"gray"` to plot the images in grayscale. That is because `imshow()` uses a heatmap-like color map by default. Read more here.

**Tip**: since you have been re-using this plotting code quite a bit in this tutorial, you might look into how you can turn it into a function :)

These two steps are very basic ones. Other operations that you could have tried out on your data include data augmentation (rotating, blurring, shifting, changing brightness, …). If you want, you could also set up an entire pipeline of data manipulation operations through which you send your images.
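To give you a feel for what such augmentation operations do, here is a minimal sketch of two of them, shifting and changing brightness, on a tiny grayscale image stored as nested lists. Both function names are invented for this example, and the sketch assumes pixel values between 0 and 1; in practice you would more likely use an image-processing library for this.

```python
def shift_right(image, pixels, fill=0.0):
    """Shift every row `pixels` columns to the right, padding the left edge
    with `fill`. Assumes 0 <= pixels <= image width."""
    return [[fill] * pixels + list(row[:len(row) - pixels]) for row in image]

def change_brightness(image, factor):
    """Multiply every pixel by `factor`, clipping the result to the range 0..1."""
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in image]

# A made-up 2 by 3 grayscale image
tiny_image = [[0.1, 0.5, 0.9],
              [0.2, 0.4, 0.6]]

shifted = shift_right(tiny_image, 1)
brighter = change_brightness(tiny_image, 2.0)
print(shifted)
print(brighter)
```

Each augmented copy keeps the label of the original image, which is how augmentation gives you more training examples for free.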

Deep Learning With TensorFlow

Now that you have explored and manipulated your data, it’s time to construct your neural network architecture with the help of the TensorFlow package!

Just like you might have done with Keras, it’s time to build up your neural network, layer by layer.

If you haven’t done so already, import `tensorflow` into your workspace under the conventional alias `tf`. Then, you can initialize the Graph with the help of `Graph()`. You use this function to define the computation. **Note** that with the Graph, you don’t compute anything, because it doesn’t hold any values. It just defines the operations that you want to be running later.

In this case, you set up a default context with the help of `as_default()`, which returns a context manager that makes this specific Graph the default graph. You use this method if you want to create multiple graphs in the same process: with this function, you have a global default graph to which all operations will be added if you don’t explicitly create a new graph.

Next, you’re ready to add operations to your graph. As you might remember from working with Keras, you build up your model, and then in compiling it, you define a loss function, an optimizer, and a metric. This now all happens in one step when you work with TensorFlow:

- First, you define placeholders for inputs and labels because you won’t put in the “real” data yet. **Remember** that placeholders are values that are unassigned and that will be initialized by the session when you run it. So when you finally run the session, these placeholders will get the values of your dataset that you pass in the `run()` function!
- Then, you build up the network. You first start by flattening the input with the help of the `flatten()` function, which will give you an array of shape `[None, 784]` instead of the `[None, 28, 28]`, which is the shape of your grayscale images.
- After you have flattened the input, you construct a fully connected layer that generates logits of size `[None, 62]`. Logits are the unscaled outputs of the previous layer: their relative scale is linear, so they are raw scores rather than probabilities.
- With the multi-layer perceptron built out, you can define the loss function. The choice for a loss function depends on the task that you have at hand: in this case, you make use of `sparse_softmax_cross_entropy_with_logits()`. This computes sparse softmax cross entropy between logits and labels. In other words, it measures the probability error in discrete classification tasks in which the classes are mutually exclusive. This means that each entry is in exactly one class. Here, a traffic sign can only have one single label. **Remember** that, while regression is used to predict continuous values, classification is used to predict discrete values or classes of data points. You wrap this function with `reduce_mean()`, which computes the mean of elements across dimensions of a tensor.
- You also want to define a training optimizer. Some of the most popular optimization algorithms used are the Stochastic Gradient Descent (SGD), ADAM and RMSprop. Depending on whichever algorithm you choose, you’ll need to tune certain parameters, such as learning rate or momentum. In this case, you pick the ADAM optimizer, for which you define the learning rate at `0.001`.
- Lastly, you initialize the operations to execute before going over to the training.

```
# Import `tensorflow`
import tensorflow as tf
# Initialize placeholders
x = tf.placeholder(dtype = tf.float32, shape = [None, 28, 28])
y = tf.placeholder(dtype = tf.int32, shape = [None])
# Flatten the input data
images_flat = tf.contrib.layers.flatten(x)
# Fully connected layer
logits = tf.contrib.layers.fully_connected(images_flat, 62, tf.nn.relu)
# Define a loss function
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels = y,
                                                                    logits = logits))
# Define an optimizer
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
# Convert logits to label indexes
correct_pred = tf.argmax(logits, 1)
# Define an accuracy metric: the fraction of predictions that match the labels
# (`tf.argmax` returns int64, so the int32 labels are cast before comparing)
accuracy = tf.reduce_mean(tf.cast(tf.equal(correct_pred, tf.cast(y, tf.int64)), tf.float32))
```
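To make the loss a bit more concrete, here is a small sketch in plain Python of what a sparse softmax cross entropy computes for a batch: softmax turns each row of logits into probabilities, the loss for one example is the negative log-probability of its true class, and the mean is then taken over the batch. The function names and the example numbers are made up for illustration; this is not TensorFlow's implementation, which is vectorized and runs inside the graph.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing.
    highest = max(logits)
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def sparse_softmax_cross_entropy(logits_batch, labels):
    """Per-example loss: -log(probability assigned to the true class)."""
    return [-math.log(softmax(logits)[label])
            for logits, label in zip(logits_batch, labels)]


# Two examples with three classes each; labels are class indexes.
logits_batch = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
labels = [0, 2]

per_example = sparse_softmax_cross_entropy(logits_batch, labels)
mean_loss = sum(per_example) / len(per_example)  # the role of reduce_mean()

print(per_example)
print(mean_loss)
```

The labels are "sparse" in the sense that each one is a single class index rather than a one-hot vector, which matches the `[None]` shape of the `y` placeholder.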

You have now successfully created your first neural network with TensorFlow!

If you want, you can also print out the values of (most of) the variables to get a quick recap or checkup of what you have just coded up:

```
print("images_flat: ", images_flat)
print("logits: ", logits)
print("loss: ", loss)
print("predicted_labels: ", correct_pred)
```

**Tip**: if you see an error like “`module 'pandas' has no attribute 'computation'`”, consider upgrading the package `dask` by running `pip install --upgrade dask` in your command line. See this StackOverflow post for more information.

Now that you have built up your model layer by layer, it’s time to actually run it! To do this, you first need to initialize a session with the help of `Session()` to which you can pass your `graph` that you defined in the previous section. Next, you can run the session with `run()`, to which you pass the initialized operations in the form of the `init` variable that you also defined in the previous section.

Next, you can use this initialized session to start epochs or training loops. In this case, you pick `201` because you want to be able to register the last `loss_value`. In the loop, you run the session with the training optimizer and the loss (or accuracy) metric that you defined in the previous section. You also pass a `feed_dict` argument, with which you feed data to the model. After every 10 epochs, you’ll get a log that gives you more insights into the loss or cost of the model.

As you have seen in the section on the TensorFlow basics, there is no need to close the session manually; this is done for you. However, if you want to try out a different setup, you probably will need to do so with `sess.close()` if you have defined your session as `sess`, like in the code chunk below:

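```
tf.set_random_seed(1234)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(201):
    print('EPOCH', i)
    # Run one training step and fetch the current loss and accuracy values
    _, loss_value, accuracy_val = sess.run([train_op, loss, accuracy], feed_dict={x: images28, y: labels})
    if i % 10 == 0:
        # Print the fetched value, not the `loss` tensor itself
        print("Loss: ", loss_value)
    print('DONE WITH EPOCH')
```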

**Remember** that you can also run the following piece of code, but that one will immediately close the session afterward, just like you saw in the introduction of this tutorial:

```
tf.set_random_seed(1234)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(201):
        _, loss_value = sess.run([train_op, loss], feed_dict={x: images28, y: labels})
        if i % 10 == 0:
            # Print the fetched value, not the `loss` tensor itself
            print("Loss: ", loss_value)
```

**Note** that you make use of `global_variables_initializer()` because the `initialize_all_variables()` function is deprecated.

You have now successfully trained your model! That wasn’t too hard, was it?

You’re not entirely there yet; you still need to evaluate your neural network. In this case, you can already try to get a glimpse of how well your model performs by picking 10 random images and by comparing the predicted labels with the real labels.

You can first print them out, but why not use `matplotlib` to plot the traffic signs themselves and make a visual comparison?

```
# Import `matplotlib`
import matplotlib.pyplot as plt
import random
# Pick 10 random images
sample_indexes = random.sample(range(len(images28)), 10)
sample_images = [images28[i] for i in sample_indexes]
sample_labels = [labels[i] for i in sample_indexes]
# Run the "correct_pred" operation
predicted = sess.run([correct_pred], feed_dict={x: sample_images})[0]
# Print the real and predicted labels
print(sample_labels)
print(predicted)
# Display the predictions and the ground truth visually.
fig = plt.figure(figsize=(10, 10))
for i in range(len(sample_images)):
    truth = sample_labels[i]
    prediction = predicted[i]
    plt.subplot(5, 2, 1+i)
    plt.axis('off')
    color = 'green' if truth == prediction else 'red'
    plt.text(40, 10, "Truth: {0}\nPrediction: {1}".format(truth, prediction),
             fontsize=12, color=color)
    plt.imshow(sample_images[i], cmap="gray")
plt.show()
```

However, only looking at random images doesn’t give you much insight into how well your model actually performs. That’s why you’ll load in the test data.

**Note** that you make use of the `load_data()` function, which you defined at the start of this tutorial.

```
# Import `numpy` and `skimage`
import numpy as np
from skimage import transform
from skimage.color import rgb2gray
# Load the test data
test_images, test_labels = load_data(test_data_directory)
# Transform the images to 28 by 28 pixels
test_images28 = [transform.resize(image, (28, 28)) for image in test_images]
# Convert to grayscale
test_images28 = rgb2gray(np.array(test_images28))
# Run predictions against the full test set.
predicted = sess.run([correct_pred], feed_dict={x: test_images28})[0]
# Calculate correct matches
match_count = sum([int(y == y_) for y, y_ in zip(test_labels, predicted)])
# Calculate the accuracy
accuracy = match_count / len(test_labels)
# Print the accuracy
print("Accuracy: {:.3f}".format(accuracy))
```

**Remember** to close off the session with `sess.close()` in case you didn't use the `with tf.Session() as sess:` to start your TensorFlow session.

Download the notebook of this tutorial here.


Learn Data Science | How to Learn Data Science for Free. In this post, I have described a learning path and free online courses and tutorials that will enable you to learn data science for free.

The average cost of obtaining a master's degree at traditional bricks-and-mortar institutions will set you back anywhere between $30,000 and $120,000. Even online data science degree programs don’t come cheap, costing a minimum of $9,000. So what do you do if you want to learn data science but can’t afford to pay this?

I trained into a career as a data scientist without taking any formal education in the subject. In this article, I am going to share with you my own personal curriculum for learning data science if you can’t or don’t want to pay thousands of dollars for more formal study.

The curriculum will consist of 3 main parts, technical skills, theory and practical experience. I will include links to free resources for every element of the learning path and will also be including some links to additional ‘low cost’ options. So if you want to spend a little money to accelerate your learning you can add these resources to the curriculum. I will include the estimated costs for each of these.

The first part of the curriculum will focus on technical skills. I recommend learning these first so that you can take a practical-first approach rather than, say, learning the mathematical theory first. Python is by far the most widely used programming language for data science. In the Kaggle Machine Learning and Data Science survey carried out in 2018, 83% of respondents said that they used Python on a daily basis. I would, therefore, recommend focusing on this language but also spending a little time on other languages such as R.

Before you can start to use Python for data science you need a basic grasp of the fundamentals behind the language. So you will want to take a Python introductory course. There are lots of free ones out there but I like the Codecademy ones best as they include hands-on in-browser coding throughout.

I would suggest taking the introductory course to learn Python. This covers basic syntax, functions, control flow, loops, modules and classes.

Next, you will want to get a good understanding of using Python for data analysis. There are a number of good resources for this.

To start with I suggest taking at least the free parts of the data analyst learning path on dataquest.io. Dataquest offers complete learning paths for data analyst, data scientist and data engineer. Quite a lot of the content, particularly on the data analyst path is available for free. If you do have some money to put towards learning then I strongly suggest putting it towards paying for a few months of the premium subscription. I took this course and it provided a fantastic grounding in the fundamentals of data science. It took me 6 months to complete the data scientist path. The price varies from $24.50 to $49 per month depending on whether you pay annually or not. It is better value to purchase the annual subscription if you can afford it.

If you have chosen to pay for the full data science course on Dataquest then you will have a good grasp of the fundamentals of machine learning with Python. If not then there are plenty of other free resources. I would focus to start with on scikit-learn which is by far the most commonly used Python library for machine learning.

When I was learning I was lucky enough to attend a two-day workshop run by Andreas Mueller one of the core developers of scikit-learn. He has however published all the material from this course, and others, on this Github repo. These consist of slides, course notes and notebooks that you can work through. I would definitely recommend working through this material.

Then I would suggest taking some of the tutorials in the scikit-learn documentation. After that, I would suggest building some practical machine learning applications and learning the theory behind how the models work — which I will cover a bit later on.

**SQL**

SQL is a vital skill to learn if you want to become a data scientist as one of the fundamental processes in data modelling is extracting data in the first place. This will more often than not involve running SQL queries against a database. Again if you haven’t opted to take the full Dataquest course then here are a few free resources to learn this skill.

Codecademy has a free introduction to SQL course. Again this is very practical with in-browser coding all the way through. If you also want to learn about cloud-based database querying then Google Cloud BigQuery is very accessible. There is a free tier so you can try queries for free, an extensive range of public datasets to try and very good documentation.

To be a well-rounded data scientist it is a good idea to diversify a little from just Python. I would, therefore, suggest also taking an introductory course in R. Codecademy has an introductory course on their free plan. It is probably worth noting here that, similar to Dataquest, Codecademy also offers a complete data science learning plan as part of their pro account (this costs from $31.99 to $15.99 per month depending on how many months you pay for up front). I personally found the Dataquest course to be much more comprehensive but this may work out a little cheaper if you are looking to follow a learning path on a single platform.

It is a good idea to get a grasp of software engineering skills and best practices. This will help your code to be more readable and extensible both for yourself and others. Additionally, when you start to put models into production you will need to be able to write good quality well-tested code and work with tools like version control.

There are two great free resources for this. Python like you mean it covers things like the PEP8 style guide, documentation and also covers object-oriented programming really well.

The scikit-learn contribution guidelines, although written to facilitate contributions to the library, actually cover the best practices really well. This covers topics such as Github, unit testing and debugging and is all written in the context of a data science application.

For a comprehensive introduction to deep learning, I don’t think that you can get any better than the totally free and totally ad-free fast.ai. This course includes an introduction to machine learning, practical deep learning, computational linear algebra and a code-first introduction to natural language processing. All their courses have a practical first approach and I highly recommend them.

Whilst you are learning the technical elements of the curriculum you will encounter some of the theory behind the code you are implementing. I recommend that you learn the theoretical elements alongside the practical. The way that I do this is to first learn the code needed to implement a technique (let’s take KMeans as an example); once I have something working, I then look deeper into concepts such as inertia. Again the scikit-learn documentation contains all the mathematical concepts behind the algorithms.

In this section, I will introduce the key foundational elements of theory that you should learn alongside the more practical elements.

Khan Academy covers almost all the concepts I have listed below for free. You can tailor the subjects you would like to study when you sign up and you then have a nice tailored curriculum for this part of the learning path. Checking all of the boxes below will give you an overview of most elements I have listed below.

*Calculus*

Calculus is defined by Wikipedia as “the mathematical study of continuous change.” In other words calculus can find patterns between functions, for example, in the case of derivatives, it can help you to understand how a function changes over time.

Many machine learning algorithms utilise calculus to optimise the performance of models. If you have studied even a little machine learning you will probably have heard of Gradient descent. This functions by iteratively adjusting the parameter values of a model to find the optimum values to minimise the cost function. Gradient descent is a good example of how calculus is used in machine learning.
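As a minimal sketch of that idea, the plain-Python snippet below runs gradient descent on a one-parameter cost function whose derivative is known in closed form. The function name `gradient_descent`, the cost function and the learning rate are all chosen purely for illustration.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    value = start
    for _ in range(steps):
        value -= learning_rate * gradient(value)
    return value


# Cost function f(w) = (w - 3)^2 has derivative f'(w) = 2 * (w - 3)
# and its minimum at w = 3.
minimum = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(minimum)
```

The derivative tells the algorithm which direction is "downhill", which is exactly where calculus enters the picture.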

What you need to know:

*Derivatives*

- Geometric definition
- Calculating the derivative of a function
- Nonlinear functions

*Chain rule*

- Composite functions
- Composite function derivatives
- Multiple functions
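One way to build intuition for the chain rule is to compare it with a numerical estimate of the derivative. The sketch below does this in plain Python for the composite function sin(x²); the function names and the test point are arbitrary choices for illustration.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Composite function h(x) = sin(x^2), built from g(x) = x^2 and f(u) = sin(u).
def composite(x):
    return math.sin(x ** 2)


def chain_rule_derivative(x):
    # Chain rule: h'(x) = f'(g(x)) * g'(x) = cos(x^2) * 2x
    return math.cos(x ** 2) * 2 * x


x = 1.3
print(numerical_derivative(composite, x))
print(chain_rule_derivative(x))
```

The two printed values should agree to several decimal places, since one is an approximation of the other.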

*Gradients*

- Partial derivatives
- Directional derivatives
- Integrals

*Linear Algebra*

Many popular machine learning methods, including XGBOOST, use matrices to store inputs and process data. Matrices alongside vector spaces and linear equations form the mathematical branch known as Linear Algebra. In order to understand how many machine learning methods work it is essential to get a good understanding of this field.

What you need to learn:

*Vectors and spaces*

- Vectors
- Linear combinations
- Linear dependence and independence
- Vector dot and cross products

*Matrix transformations*

- Functions and linear transformations
- Matrix multiplication
- Inverse functions
- Transpose of a matrix
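To get a feel for two of these operations, here is a short plain-Python sketch of the transpose and of matrix multiplication on nested lists. The helper names are hypothetical, and in real projects you would normally use NumPy instead; the point here is only to show what the operations do.

```python
def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply matrix `a` (n x m) by matrix `b` (m x p)."""
    b_columns = transpose(b)
    # Each output entry is the dot product of a row of `a` and a column of `b`.
    return [[sum(x * y for x, y in zip(row, column)) for column in b_columns]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]

print(transpose(a))  # [[1, 3], [2, 4]]
print(matmul(a, b))  # [[2, 1], [4, 3]]
```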

Here is a list of the key concepts you need to know:

*Descriptive/Summary statistics*

- How to summarise a sample of data
- Different types of distributions
- Skewness, kurtosis, central tendency (e.g. mean, median, mode)
- Measures of dependence, and relationships between variables such as correlation and covariance
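Several of these summary statistics are available in Python's standard `statistics` module, and the rest are easy to compute by hand. The sketch below uses a small made-up data set (hours studied and exam scores are invented purely for illustration) and two hypothetical helper functions for covariance and correlation.

```python
import statistics


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


# Invented example data.
hours_studied = [1, 2, 2, 3, 5]
exam_score = [52, 60, 61, 70, 88]

# Measures of central tendency
print(statistics.mean(hours_studied))
print(statistics.median(hours_studied))
print(statistics.mode(hours_studied))

# Relationship between the two variables
print(covariance(hours_studied, exam_score))
print(correlation(hours_studied, exam_score))
```

A correlation close to 1 means the two variables tend to rise together; it does not by itself say that one causes the other.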

*Experiment design*

- Hypothesis testing
- Sampling
- Significance tests
- Randomness
- Probability
- Confidence intervals and two-sample inference

*Machine learning*

- Inference about slope
- Linear and non-linear regression
- Classification

The third section of the curriculum is all about practice. In order to truly master the concepts above you will need to use the skills in some projects that ideally closely resemble a real-world application. By doing this you will encounter problems to work through such as missing and erroneous data and develop a deep level of expertise in the subject. In this last section, I will list some good places you can get this practical experience from for free.

“With deliberate practice, however, the goal is not just to reach your potential but to build it, to make things possible that were not possible before. This requires challenging homeostasis — getting out of your comfort zone — and forcing your brain or your body to adapt.”

Anders Ericsson, Peak: Secrets from the New Science of Expertise

Machine learning competitions are a good place to get practice with building machine learning models. They give access to a wide range of data sets, each with a specific problem to solve and have a leaderboard. The leaderboard is a good way to benchmark how good your knowledge at developing a good model actually is and where you may need to improve further.

In addition to Kaggle, there are other platforms for machine learning competitions including Analytics Vidhya and DrivenData.

The UCI machine learning repository is a large source of publicly available data sets. You can use these data sets to put together your own data projects. This could include data analysis and machine learning models; you could even try building a deployed model with a web front end. It is a good idea to store your projects somewhere public, such as GitHub, as this can create a portfolio showcasing your skills to use for future job applications.

One other option to consider is contributing to open source projects. There are many Python libraries that rely on the community to maintain them and there are often hackathons held at meetups and conferences where even beginners can join in. Attending one of these events would certainly give you some practical experience and an environment where you can learn from others whilst giving something back at the same time. Numfocus is a good example of a project like this.

In this post, I have described a learning path and free online courses and tutorials that will enable you to learn data science for free. Showcasing what you are able to do in the form of a portfolio is a great tool for future job applications in lieu of formal qualifications and certificates. I really believe that education should be accessible to everyone and, certainly, for data science at least, the internet provides that opportunity. In addition to the resources listed here, I have previously published a recommended reading list for learning data science available here. These are also all freely available online and are a great way to complement the more practical resources covered above.

Thanks for reading!

This full course introduces the concept of client-side artificial neural networks. We will learn how to deploy and run models along with full deep learning applications in the browser! To implement this cool capability, we’ll be using TensorFlow.js (TFJS), TensorFlow’s JavaScript library.

By the end of this video tutorial, you will have built and deployed a web application that runs a neural network in the browser to classify images! To get there, we'll learn about client-server deep learning architectures, converting Keras models to TFJS models, serving models with Node.js, tensor operations, and more!

⭐️Course Sections⭐️

⌨️ 0:00 - Intro to deep learning with client-side neural networks

⌨️ 6:06 - Convert Keras model to Layers API format

⌨️ 11:16 - Serve deep learning models with Node.js and Express

⌨️ 19:22 - Building UI for neural network web app

⌨️ 27:08 - Loading model into a neural network web app

⌨️ 36:55 - Explore tensor operations with VGG16 preprocessing

⌨️ 45:16 - Examining tensors with the debugger

⌨️ 1:00:37 - Broadcasting with tensors

⌨️ 1:11:30 - Running MobileNet in the browser

This TensorFlow tutorial is for professionals and enthusiasts who are interested in applying deep learning algorithms with TensorFlow to solve various problems.

TensorFlow is an open source deep learning library that is based on the concept of data flow graphs for building models. It allows you to create large-scale neural networks with many layers. Learning the use of this library is also a fundamental part of the AI & Deep Learning course curriculum. Following are the topics that will be discussed in this TensorFlow tutorial:

- What is TensorFlow
- TensorFlow Code Basics
- TensorFlow UseCase

In this **TensorFlow tutorial**, before talking about TensorFlow, let us first understand *what tensors are*. **Tensors** are nothing but the de facto standard for representing data in deep learning.

As shown in the image above, tensors are just multidimensional arrays that allow you to represent data having higher dimensions. In deep learning, you generally deal with high-dimensional data sets, where dimensions refer to the different features present in the data set. In fact, the name “**TensorFlow**” has been derived from the operations which neural networks perform on tensors. It’s literally a flow of tensors. Now that you have understood what tensors are, let us move ahead in this **TensorFlow** tutorial and understand *what TensorFlow is*.

**TensorFlow** is a library based on Python that provides different types of functionality for implementing **Deep Learning Models**. As discussed earlier, the term **TensorFlow** is made up of two terms, Tensor and Flow:

In **TensorFlow**, the term tensor refers to the representation of data as multi-dimensional array whereas the term flow refers to the series of operations that one performs on tensors as shown in the above image.

Now we have covered enough background about **TensorFlow**.

Next up, in this TensorFlow tutorial we will be discussing TensorFlow code basics.

TensorFlow Tutorial: Code Basics

Basically, the overall process of writing a **TensorFlow program** involves two steps:

- Building a Computational Graph
- Running a Computational Graph

Let me explain these two steps one by one:

So, *what is a computational graph?* Well, a computational graph is a series of TensorFlow operations arranged as nodes in the graph. Each node takes 0 or more tensors as input and produces a tensor as output. Let me give you an example of a simple computational graph which consists of three nodes: *a*, *b*, and *c*, where *c* multiplies *a* and *b*.

Basically, one can think of a computational graph as an alternative way of conceptualizing mathematical calculations that takes place in a TensorFlow program. The operations assigned to different nodes of a Computational Graph can be performed in parallel, thus, providing a better performance in terms of computations.

Here we just describe the computation. The graph doesn’t compute anything and does not hold any values; it just defines the operations specified in your code.
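The split between describing a computation and running it can be mimicked with a toy example in plain Python. The `Node` class and the `constant` and `run` helpers below are hypothetical names invented for this sketch; they only illustrate the idea of deferred execution and say nothing about how TensorFlow is implemented internally.

```python
class Node:
    """A graph node that stores an operation and its inputs, not a value."""

    def __init__(self, operation, inputs=()):
        self.operation = operation
        self.inputs = inputs

    def __mul__(self, other):
        # Multiplying two nodes only creates a new node in the graph.
        return Node(lambda x, y: x * y, (self, other))


def constant(value):
    """A node with no inputs that always produces `value`."""
    return Node(lambda: value)


def run(node):
    """Evaluate a node by first evaluating the nodes it depends on."""
    return node.operation(*(run(parent) for parent in node.inputs))


# Building the graph: nothing is computed yet.
a = constant(5.0)
b = constant(6.0)
c = a * b

# Running the graph: only now is a value produced.
print(run(c))
```

Here `c` is just an object describing "multiply `a` by `b`" until `run(c)` walks the graph, which plays the role that a session plays in TensorFlow.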

Let us take the previous example of computational graph and understand how to execute it. Following is the code from previous example:

```
import tensorflow as tf
# Build a graph
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
```

Now, in order to get the output of node c, we need to run the computational graph within a **session**. Session places the graph operations onto Devices, such as CPUs or GPUs, and provides methods to execute them.

A session encapsulates the control and state of the *TensorFlow *runtime i.e. it stores the information about the order in which all the operations will be performed and passes the result of already computed operation to the next operation in the pipeline. Let me show you how to run the above computational graph within a session (Explanation of each line of code has been added as a comment):

```
# Create the session object
sess = tf.Session()
# Run the graph within a session and store the output to a variable
output_c = sess.run(c)
# Print the output of node c
print(output_c)
# Close the session to free up some resources
sess.close()
# Output:
# 30.0
```

So, this was all about session and running a computational graph within it. Now, let us talk about variables and placeholders that we will be using extensively while building deep learning model using *TensorFlow*.

In *TensorFlow*, constants, placeholders and variables are used to represent different parameters of a deep learning model. Since, I have already discussed constants earlier, I will start with placeholders.

A *TensorFlow* constant allows you to store a value but, what if, you want your nodes to take inputs on the run? For this kind of functionality, placeholders are used which allows your graph to take external inputs as parameters. Basically, a placeholder is a promise to provide a value later or during runtime. Let me give you an example to make things simpler:

```
import tensorflow as tf
# Creating placeholders
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
# Assigning multiplication operation w.r.t. a & b to node mul
mul = a * b
# Create session object
sess = tf.Session()
# Executing mul by passing the values [1, 3] [2, 4] for a and b respectively
output = sess.run(mul, {a: [1, 3], b: [2, 4]})
print('Multiplying a b:', output)
# Output:
# Multiplying a b: [ 2. 12.]
```

Now, let us move ahead and understand what variables are.

In deep learning, placeholders are used to take arbitrary inputs into your model or graph. Apart from taking input, you also need to modify the graph such that it can produce new outputs for the same inputs. For this, you use variables. In a nutshell, a variable allows you to add trainable parameters or nodes to the graph, i.e. values that can be modified over time. Variables are defined by providing their initial value and type as shown below:

```
var = tf.Variable( [0.4], dtype = tf.float32 )
```

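**Note:** Constants are initialized when you call `tf.constant`, and their value can never change. Variables, on the other hand, are not initialized when you call `tf.Variable`. To initialize all the variables in a TensorFlow program, you must explicitly run a special initialization operation: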

```
init = tf.global_variables_initializer()
sess.run(init)
```

Always remember that a variable must be initialized before a graph is used for the first time.

**Note:** *TensorFlow variables are in-memory buffers that contain tensors, but unlike normal tensors that are only instantiated when a graph is run and are immediately deleted afterwards, variables survive across multiple executions of a graph.*

Now that we have covered enough basics of *TensorFlow*, let us go ahead and understand how to implement a linear regression model using *TensorFlow*.

A linear regression model is used for predicting the unknown value of a variable (the dependent variable) from the known value of another variable (the independent variable) using the linear regression equation:

$$y = Wx + b$$

Therefore, for creating a linear model, you need:

- a dependent or output variable (y)
- a slope parameter (W)
- a bias or y-intercept parameter (b)
- an independent or input variable (x)

So, let us begin building linear model using TensorFlow:

```
import tensorflow as tf
# Creating variable for parameter slope (W) with initial value as 0.4
W = tf.Variable([.4], dtype=tf.float32)
# Creating variable for parameter bias (b) with initial value as -0.4
b = tf.Variable([-0.4], dtype=tf.float32)
# Creating placeholders for providing input or independent variable, denoted by x
x = tf.placeholder(tf.float32)
# Equation of Linear Regression
linear_model = W * x + b
# Initializing all the variables
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
# Running regression model to calculate the output w.r.t. to provided x values
print(sess.run(linear_model, {x: [1, 2, 3, 4]}))
```

**Output:**

```
[ 0. 0.40000001 0.80000007 1.20000005]
```

The code above just represents the basic idea behind a regression model, i.e. how you follow the equation of the regression line to get the output for a set of input values. But two more things need to be added to make it a complete regression model: a loss function that measures the error, and an optimizer that adjusts the parameters to reduce that error.

Now let us see how to incorporate these into the regression model code.

A loss function measures how far apart the current output of the model is from the desired or target output. I’ll use one of the most commonly used loss functions for my linear regression model, the Sum of Squared Errors or SSE. SSE is calculated from the model output (represented by linear_model) and the desired or target output (y) as:

```
y = tf.placeholder(tf.float32)
error = linear_model - y
squared_errors = tf.square(error)
loss = tf.reduce_sum(squared_errors)
print(sess.run(loss, {x: [1, 2, 3, 4], y: [2, 4, 6, 8]}))
```

```
Output:
90.24
```

As you can see, we are getting a high loss value. Therefore, we need to adjust our weights (W) and bias (b) so as to reduce the error that we are receiving.
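The loss value can be checked by hand. The plain-Python sketch below repeats the same arithmetic with ordinary floats, using the initial parameters and data from the example; it is only a sanity check of the formula and does not use TensorFlow.

```python
# Initial parameters and data taken from the example above.
W, b = 0.4, -0.4
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

# Model output W * x + b for every input
predictions = [W * x + b for x in xs]
# Squared difference between each prediction and its target
squared_errors = [(p - y) ** 2 for p, y in zip(predictions, ys)]
# Sum of squared errors
loss = sum(squared_errors)

print(round(loss, 2))
```

The individual squared errors are 4, 12.96, 27.04 and 46.24, which add up to the 90.24 reported by the session.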

TensorFlow provides **optimizers** that slowly change each variable in order to minimize the loss function or error. The simplest optimizer is **gradient descent**. It modifies each variable according to the magnitude of the derivative of loss with respect to that variable.

```
# Creating an instance of gradient descent optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
for i in range(1000):
    sess.run(train, {x: [1, 2, 3, 4], y: [2, 4, 6, 8]})
print(sess.run([W, b]))
```

```
Output:
[array([ 1.99999964], dtype=float32), array([ 9.86305167e-07], dtype=float32)]
```

So, this is how you create a linear model using TensorFlow and train it to get the desired output.
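To see roughly what the optimizer does on each step, the same training run can be sketched in plain Python with the gradients of the sum of squared errors written out by hand. This is an illustrative re-implementation under the same starting values, learning rate and data as above, not TensorFlow's own code, and it uses ordinary Python floats rather than `float32` tensors.

```python
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
W, b = 0.4, -0.4          # same starting values as above
learning_rate = 0.01

for _ in range(1000):
    errors = [W * x + b - y for x, y in zip(xs, ys)]
    # Gradients of the sum of squared errors with respect to W and b.
    grad_W = sum(2 * e * x for e, x in zip(errors, xs))
    grad_b = sum(2 * e for e in errors)
    # Move each parameter a small step against its gradient.
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(W, b)
```

After 1000 steps, `W` ends up very close to 2 and `b` very close to 0, in line with the values printed by the TensorFlow session.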