Could someone please explain what is happening in the code below? I understand index arrays as described at https://docs.scipy.org/doc/numpy-1.15.0/user/basics.indexing.html#index-arrays, but I do not understand this piece of code.

    import numpy as np

    y = np.zeros((3,3))
    y = y.astype(np.int16)
    y[1,1] = 1
    x = np.ones((3,3))
    t = (1-y).astype(np.int16)
    print(t)
    print(x[t])
    x[(1-y).astype(np.int16)] = 0
    print(x)

output:

    [[1 1 1]
     [1 0 1]
     [1 1 1]]
    [[[1. 1. 1.]
      [1. 1. 1.]
      [1. 1. 1.]]

     [[1. 1. 1.]
      [1. 1. 1.]
      [1. 1. 1.]]

     [[1. 1. 1.]
      [1. 1. 1.]
      [1. 1. 1.]]]
    [[0. 0. 0.]
     [0. 0. 0.]
     [1. 1. 1.]]
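One possible reading of this output, offered as a sketch rather than a definitive answer: `t` has an integer dtype, so NumPy treats it as an index array, and each entry of `t` selects a whole row of `x` (row 0 or row 1). That would explain the (3, 3, 3) result of `x[t]` and why the assignment zeroes rows 0 and 1 but leaves row 2 alone. If the intent was to use `t` as a mask, converting it to boolean changes the behaviour. The names `x_int` and `x_bool` below are illustrative only.

```python
import numpy as np

x = np.ones((3, 3))
y = np.zeros((3, 3), dtype=np.int16)
y[1, 1] = 1
t = 1 - y  # integer array containing only 0s and 1s

# Integer index array: every entry of t picks a whole ROW of x.
rows = x[t]
print(rows.shape)  # (3, 3, 3)

# Assigning through the integer index array overwrites rows 0 and 1 only.
x_int = x.copy()
x_int[t] = 0
print(x_int)

# Boolean mask: only the positions where the mask is True are overwritten.
x_bool = x.copy()
x_bool[t.astype(bool)] = 0
print(x_bool)
```

With the boolean mask, every element except the centre one is set to zero, which may be closer to what the code was meant to do.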

Understanding neural networks using Python and Numpy by coding

If you are a **junior data scientist** who sort of understands how neural nets work, or a **machine learning enthusiast** who only knows a little about **deep learning**, this is the article that you cannot miss. Here is **how you can build a neural net from scratch using NumPy** in **9 steps**, from data pre-processing to back-propagation. It is a must-do practice.

Basic understanding of **machine learning**, **artificial neural network**, **Python syntax**, and programming logic is preferred (but not necessary as you can learn on the go).

*Code is available on **GitHub**.*

Step one. Import NumPy. Seriously.

    import numpy as np
    np.random.seed(42) # for reproducibility

2. Data Generation

**Deep learning** is data-hungry. Although there are many clean datasets available online, we will generate our own for simplicity: for inputs **a** and **b**, we have outputs **a+b**, **a-b**, and **|a-b|**. 10,000 data points are generated.

    X_num_row, X_num_col = [2, 10000] # Row is no. of feature, col is no. of datum points
    X_raw = np.random.rand(X_num_row,X_num_col) * 100
    y_raw = np.concatenate(([(X_raw[0,:] + X_raw[1,:])],
                            [(X_raw[0,:] - X_raw[1,:])],
                            np.abs([(X_raw[0,:] - X_raw[1,:])]))) # for input a and b, output is a+b; a-b and |a-b|
    y_num_row, y_num_col = y_raw.shape

3. Train-test Splitting

Our dataset is split into training (70%) and testing (30%) sets. Only the training set is used for tuning the neural network. The testing set is used only for performance evaluation once training is complete.

    train_ratio = 0.7
    num_train_datum = int(train_ratio*X_num_col)
    X_raw_train = X_raw[:,0:num_train_datum]
    X_raw_test = X_raw[:,num_train_datum:]
    y_raw_train = y_raw[:,0:num_train_datum]
    y_raw_test = y_raw[:,num_train_datum:]

4. Data Standardization

Data in the training set is standardized so that the distribution for each standardized feature is zero-mean and unit-variance. The scalers generated from the abovementioned procedure can then be applied to the testing set.

    class scaler:
        def __init__(self, mean, std):
            self.mean = mean
            self.std = std

    def get_scaler(row):
        mean = np.mean(row)
        std = np.std(row)
        return scaler(mean, std)

    def standardize(data, scaler):
        return (data - scaler.mean) / scaler.std

    def unstandardize(data, scaler):
        return (data * scaler.std) + scaler.mean

    # Construct scalers from training set
    X_scalers = [get_scaler(X_raw_train[row,:]) for row in range(X_num_row)]
    X_train = np.array([standardize(X_raw_train[row,:], X_scalers[row]) for row in range(X_num_row)])

    y_scalers = [get_scaler(y_raw_train[row,:]) for row in range(y_num_row)]
    y_train = np.array([standardize(y_raw_train[row,:], y_scalers[row]) for row in range(y_num_row)])

    # Apply those scalers to testing set
    X_test = np.array([standardize(X_raw_test[row,:], X_scalers[row]) for row in range(X_num_row)])
    y_test = np.array([standardize(y_raw_test[row,:], y_scalers[row]) for row in range(y_num_row)])

    # Check if data has been standardized
    print([X_train[row,:].mean() for row in range(X_num_row)]) # should be close to zero
    print([X_train[row,:].std() for row in range(X_num_row)]) # should be close to one

    print([y_train[row,:].mean() for row in range(y_num_row)]) # should be close to zero
    print([y_train[row,:].std() for row in range(y_num_row)]) # should be close to one

The scaler therefore does not contain any information from our testing set. We do not want our neural net to gain any information regarding testing set before network tuning.
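The same idea can be sketched in vectorized form. This is only an illustrative alternative, not the article's implementation: the arrays `train` and `test` are hypothetical stand-ins for `X_raw_train` and `X_raw_test`, and the statistics are taken from the training array alone, so nothing from the testing data leaks into the scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: rows are features, columns are data points.
train = rng.random((2, 700)) * 100
test = rng.random((2, 300)) * 100

# Per-feature statistics from the training set only.
# keepdims=True keeps shape (2, 1) so they broadcast across the columns.
mean = train.mean(axis=1, keepdims=True)
std = train.std(axis=1, keepdims=True)

train_std = (train - mean) / std
test_std = (test - mean) / std  # reuses the training statistics

print(train_std.mean(axis=1))  # close to zero
print(train_std.std(axis=1))   # close to one
```

The standardized testing set will generally not have exactly zero mean and unit variance, because it is scaled with statistics it did not contribute to.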

We have now completed the data pre-processing procedures in **4 steps**.

We objectify a ‘layer’ using a class in Python. Every layer (except the input layer) has a weight matrix **W**, a bias vector **b**, and an activation function. Each layer is appended to a list called `neural_net`.

    class layer:
        def __init__(self, layer_index, is_output, input_dim, output_dim, activation):
            self.layer_index = layer_index # zero indicates input layer
            self.is_output = is_output # true indicates output layer, false otherwise
            self.input_dim = input_dim
            self.output_dim = output_dim
            self.activation = activation

            # the multiplication constant is sorta arbitrary
            if layer_index != 0:
                self.W = np.random.randn(output_dim, input_dim) * np.sqrt(2/input_dim)
                self.b = np.random.randn(output_dim, 1) * np.sqrt(2/input_dim)

    # Change layers_dim to configure your own neural net!
    layers_dim = [X_num_row, 4, 4, y_num_row] # input layer --- hidden layers --- output layers

    # Construct the net layer by layer
    neural_net = []
    for layer_index in range(len(layers_dim)):
        if layer_index == 0: # if input layer
            neural_net.append(layer(layer_index, False, 0, layers_dim[layer_index], 'irrelevant'))
        elif layer_index+1 == len(layers_dim): # if output layer
            neural_net.append(layer(layer_index, True, layers_dim[layer_index-1], layers_dim[layer_index], activation='linear'))
        else:
            neural_net.append(layer(layer_index, False, layers_dim[layer_index-1], layers_dim[layer_index], activation='relu'))

    # Simple check on overfitting
    pred_n_param = sum([(layers_dim[layer_index]+1)*layers_dim[layer_index+1] for layer_index in range(len(layers_dim)-1)])
    act_n_param = sum([neural_net[layer_index].W.size + neural_net[layer_index].b.size for layer_index in range(1,len(layers_dim))])
    print(f'Predicted number of hyperparameters: {pred_n_param}')
    print(f'Actual number of hyperparameters: {act_n_param}')
    print(f'Number of data: {X_num_col}')

    if act_n_param >= X_num_col:
        raise Exception('It will overfit.')

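Finally, we do a sanity check on the number of trainable parameters (the weights and biases, which the printout above labels 'hyperparameters'), both with the following formula and by counting. The number of data points available should exceed the number of parameters; otherwise the network is very likely to overfit.

$$\text{number of parameters} = \sum_{l=0}^{L-1} \left(N^{l} + 1\right) N^{l+1}$$

$N^{l}$ is the number of nodes in the $l$-th layer (with $l = 0$ the input layer), and $L$ is the number of layers (excluding the input layer).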
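As a quick worked example of this count, assuming the article's configuration of 2 inputs, two hidden layers of 4 nodes, and 3 outputs, each non-input layer contributes (inputs + 1 bias) × outputs values:

```python
layers_dim = [2, 4, 4, 3]  # input, two hidden layers, output (as configured in the article)

# Each non-input layer has (number of inputs + 1 bias) * (number of outputs) trainable values.
per_layer = [(layers_dim[i] + 1) * layers_dim[i + 1] for i in range(len(layers_dim) - 1)]
total = sum(per_layer)

print(per_layer, total)  # [12, 20, 15] 47
```

With 10,000 data points and 47 trainable values, the sanity check passes comfortably.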

6. Forward Propagation

We define a function for forward propagation given a certain set of weights and biases. The connection between layers is defined in matrix form as:

σ is element-wise activation function, superscript T means transpose of a matrix
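The equation itself appears to be missing from this copy. A form consistent with the `forward_prop` code below is sketched here; it assumes **W** is stored with shape (output_dim, input_dim), as in the `layer` class, so no transpose is needed. The original figure may have used a different convention (the caption's mention of a transpose suggests it did).

```latex
\begin{align*}
Z^{l} &= W^{l} A^{l-1} + b^{l} \\
A^{l} &= \sigma\left(Z^{l}\right), \qquad A^{0} = x
\end{align*}
```

Here $x$ is the input vector and $l$ runs from 1 to the index of the output layer.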

Activation functions are defined one by one. ReLU is implemented as **a → max(a,0)**, whereas a sigmoid function should return **a → 1/(1+e^(-a))**.

    def activation(input_, act_func):
        if act_func == 'relu':
            return np.maximum(input_, np.zeros(input_.shape))
        elif act_func == 'linear':
            return input_
        else:
            raise Exception('Activation function is not defined.')

    def forward_prop(input_vec, layers_dim=layers_dim, neural_net=neural_net):
        neural_net[0].A = input_vec # Define A in input layer for for-loop convenience
        for layer_index in range(1,len(layers_dim)): # W,b,Z,A are undefined in input layer
            neural_net[layer_index].Z = np.add(np.dot(neural_net[layer_index].W, neural_net[layer_index-1].A), neural_net[layer_index].b)
            neural_net[layer_index].A = activation(neural_net[layer_index].Z, neural_net[layer_index].activation)
        return neural_net[layer_index].A
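The text mentions a sigmoid activation, but the `activation` function above only handles `'relu'` and `'linear'`. A minimal sketch of how a sigmoid and its derivative could look is given below. The function names `sigmoid` and `dsigmoid_from_A` are hypothetical and not part of the article's code.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid_from_A(A):
    """Derivative of the sigmoid, written in terms of its output A = sigmoid(z)."""
    return A * (1.0 - A)

z = np.array([[-2.0, 0.0, 2.0]])
A = sigmoid(z)
print(A)                   # every value lies strictly between 0 and 1
print(dsigmoid_from_A(A))  # largest at z = 0
```

Expressing the derivative through `A` would fit the article's back-propagation helper, which receives the activation output rather than `Z`.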

This is the trickiest part, and the one many of us struggle to understand. Once we have defined a loss metric *e* for evaluating performance, we would like to know how the loss metric changes when we perturb each weight or bias.

We want to know how sensitive the loss metric is to each weight and bias.

This is represented by partial derivatives **∂e/∂W** (denoted dW in code) and **∂e/∂b** (denoted db in code) respectively, and can be calculated analytically.

⊙ represents element-wise multiplication
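The equations themselves appear to be missing from this copy. A single-datum form consistent with the `backward_prop` code below is sketched here, assuming the loss is $e = \tfrac{1}{2}\lVert \hat{y} - y \rVert^2$ and the output layer is linear; the original figure may have been written differently.

```latex
\begin{align*}
dZ^{L} &= \hat{y} - y \\
dZ^{l} &= \left(W^{l+1}\right)^{T} dZ^{l+1} \odot \sigma'\left(Z^{l}\right), \qquad l = L-1, \dots, 1 \\
dW^{l} &= dZ^{l} \left(A^{l-1}\right)^{T} \\
db^{l} &= dZ^{l}
\end{align*}
```

Here $dZ^{l}$ denotes $\partial e / \partial Z^{l}$, and $A^{l-1}$ is the activation output of the previous layer.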

These back-propagation equations assume only one datum *y* is compared. The gradient update process would be very noisy, as each iteration would depend on a single data point. Multiple data points can be used to reduce the noise, where **∂W(y1, y2, …)** would be the mean of **∂W(y1)**, **∂W(y2)**, …, and likewise for **∂b**. In the code this is done by dividing by `num_train_datum`.

    def get_loss(y, y_hat, metric='mse'):
        if metric == 'mse':
            individual_loss = 0.5 * (y_hat - y) ** 2
            return np.mean([np.linalg.norm(individual_loss[:,col], 2) for col in range(individual_loss.shape[1])])
        else:
            raise Exception('Loss metric is not defined.')

    def get_dZ_from_loss(y, y_hat, metric):
        if metric == 'mse':
            return y_hat - y
        else:
            raise Exception('Loss metric is not defined.')

    def get_dactivation(A, act_func):
        if act_func == 'relu':
            return np.maximum(np.sign(A), np.zeros(A.shape)) # 1 if backward input >0, 0 otherwise; then diagonalize
        elif act_func == 'linear':
            return np.ones(A.shape)
        else:
            raise Exception('Activation function is not defined.')

    def backward_prop(y, y_hat, metric='mse', layers_dim=layers_dim, neural_net=neural_net, num_train_datum=num_train_datum):
        for layer_index in range(len(layers_dim)-1,0,-1):
            if layer_index+1 == len(layers_dim): # if output layer
                dZ = get_dZ_from_loss(y, y_hat, metric)
            else:
                dZ = np.multiply(np.dot(neural_net[layer_index+1].W.T, dZ),
                                 get_dactivation(neural_net[layer_index].A, neural_net[layer_index].activation))
            dW = np.dot(dZ, neural_net[layer_index-1].A.T) / num_train_datum
            db = np.sum(dZ, axis=1, keepdims=True) / num_train_datum

            neural_net[layer_index].dW = dW
            neural_net[layer_index].db = db
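One common way to gain confidence in analytic gradients like these is a finite-difference check. The sketch below is self-contained and uses a hypothetical 2-3-1 network rather than the article's `neural_net` list. It assumes the loss being differentiated is the mean over data points of 0.5 × squared error, which is the loss that `dZ = y_hat - y` corresponds to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny 2-3-1 network: ReLU hidden layer, linear output layer.
W1, b1 = rng.standard_normal((3, 2)), rng.standard_normal((3, 1))
W2, b2 = rng.standard_normal((1, 3)), rng.standard_normal((1, 1))
X = rng.standard_normal((2, 5))  # 5 data points stored as columns
Y = rng.standard_normal((1, 5))
m = X.shape[1]

def loss(W1_):
    """Mean over data points of 0.5 * squared error, as a function of W1 only."""
    A1_ = np.maximum(W1_ @ X + b1, 0)
    return 0.5 * np.sum((W2 @ A1_ + b2 - Y) ** 2) / m

# Analytic gradient with respect to W1, following the same steps as backward_prop.
Z1 = W1 @ X + b1
A1 = np.maximum(Z1, 0)
dZ2 = (W2 @ A1 + b2) - Y
dZ1 = (W2.T @ dZ2) * (Z1 > 0)
dW1 = dZ1 @ X.T / m

# Numerical gradient by central differences, one weight at a time.
eps = 1e-6
dW1_num = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        step = np.zeros_like(W1)
        step[i, j] = eps
        dW1_num[i, j] = (loss(W1 + step) - loss(W1 - step)) / (2 * eps)

max_diff = np.max(np.abs(dW1 - dW1_num))
print(max_diff)  # should be tiny if the analytic gradient is right
```

The check can misbehave if a pre-activation sits almost exactly at the ReLU kink, so a large difference on a single weight is worth re-testing with different random data before concluding the gradient is wrong.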

We now have every building block for training a neural network.

Once we know the sensitivities of weights and biases, we try to **minimize** (hence the minus sign) the loss metric iteratively by gradient descent using the following update rule:

W = W - learning_rate * ∂W

b = b - learning_rate * ∂b
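To see the update rule in isolation, here is a toy sketch on a one-dimensional objective. The function f(w) = (w - 3)² and the starting point are made up for illustration; only the form of the update matches the rule above.

```python
# Toy objective f(w) = (w - 3)^2 with gradient f'(w) = 2 * (w - 3); the minimum is at w = 3.
learning_rate = 0.1
w = 0.0

for _ in range(100):
    grad = 2 * (w - 3)
    w = w - learning_rate * grad  # same form as W = W - learning_rate * dW

print(w)  # very close to 3
```

Each step here shrinks the distance to the minimum by a constant factor, which is why a modest number of iterations is enough for this simple case.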

    learning_rate = 0.01
    max_epoch = 1000000

    for epoch in range(1,max_epoch+1):
        y_hat_train = forward_prop(X_train) # update y_hat
        backward_prop(y_train, y_hat_train) # update (dW,db)

        for layer_index in range(1,len(layers_dim)): # update (W,b)
            neural_net[layer_index].W = neural_net[layer_index].W - learning_rate * neural_net[layer_index].dW
            neural_net[layer_index].b = neural_net[layer_index].b - learning_rate * neural_net[layer_index].db

        if epoch % 100000 == 0:
            print(f'{get_loss(y_train, y_hat_train):.4f}')

Training loss should be going down as it iterates

9. Testing

The model generalizes well if the testing loss is not much higher than the training loss. We also make some test cases to see how the model performs.

    # test loss
    print(get_loss(y_test, forward_prop(X_test)))

    def predict(X_raw_any):
        X_any = np.array([standardize(X_raw_any[row,:], X_scalers[row]) for row in range(X_num_row)])
        y_hat = forward_prop(X_any)
        y_hat_any = np.array([unstandardize(y_hat[row,:], y_scalers[row]) for row in range(y_num_row)])
        return y_hat_any

    predict(np.array([[30,70],[70,30],[3,5],[888,122]]).T)
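To judge the predictions, it helps to have the exact targets for the same four test cases. The sketch below computes them directly from the data-generation rule (a+b, a-b, |a-b|). The names `X_cases` and `y_true` are illustrative and not part of the article's code.

```python
import numpy as np

# Same raw inputs as passed to predict(): after the transpose, each column is one (a, b) pair.
X_cases = np.array([[30, 70], [70, 30], [3, 5], [888, 122]]).T
a, b = X_cases[0], X_cases[1]

# Ground-truth targets in the same row order as y_raw: a+b, a-b, |a-b|.
y_true = np.array([a + b, a - b, np.abs(a - b)])
print(y_true)
```

Note that the last case (888, 122) lies outside the 0 to 100 range the training inputs were drawn from, so it probes extrapolation rather than interpolation.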

This is how you can build a neural net from scratch using NumPy in **9 steps**.

My implementation is by no means the most efficient way to build and train a neural net. There is so much room for improvement, but that is a story for another day. Code is available on GitHub. Happy coding!

**Thanks for reading** ❤


All the basics to start using the Python library NumPy. In this course I'll cover the basics of using NumPy and have several interactive course videos that will challenge you to learn how to use NumPy.

**Learn NumPy Fundamentals - Python Library for Data Science**

**We'll cover:**

- Why use NumPy?
- NumPy Arrays
- Array Math
- Array Indexing
- Advanced Indexing
- Broadcasting
- & much more!

**What you'll learn**

- Python
- NumPy

Learn the basics of the NumPy library in this tutorial for beginners. It provides background information on how NumPy works and how it compares to Python's built-in lists. This video goes through how to write code with NumPy. It starts with the basics of creating arrays and then gets into more advanced topics. The video covers creating arrays, indexing, math, statistics, reshaping, and more.

Code: https://github.com/KeithGalli/NumPy

**Course Contents**

⌨️ (01:15) What is NumPy

⌨️ (01:35) NumPy vs Lists (speed, functionality)

⌨️ (09:17) Applications of NumPy

⌨️ (11:08) The Basics (creating arrays, shape, size, data type)

⌨️ (16:08) Accessing/Changing Specific Elements, Rows, Columns, etc (slicing)

⌨️ (23:14) Initializing Different Arrays (1s, 0s, full, random, etc...)

⌨️ (31:34) Problem #1 (How do you initialize this array?)

⌨️ (33:42) Be careful when copying variables!

⌨️ (35:45) Basic Mathematics (arithmetic, trigonometry, etc.)

⌨️ (38:20) Linear Algebra

⌨️ (42:19) Statistics

⌨️ (43:57) Reorganizing Arrays (reshape, vstack, hstack)

⌨️ (47:29) Load data in from a file

⌨️ (50:20) Advanced Indexing and Boolean Masking

⌨️ (55:59) Problem #2 (How do you index these values?)
