I am trying to compute spectrograms from `.wav` files using Python. To do so, I am following the instructions found [here](https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html). I first read the `.wav` files using the librosa library. The code found in the link works properly. That code is:

    sig, rate = librosa.load(file, sr=None)
    sig = buf_to_int(sig, n_bytes=2)
    spectrogram = sig2spec(sig, rate)  # argument order matches sig2spec(signal, sample_rate)

And the function sig2spec:

    def sig2spec(signal, sample_rate):
        # Read the file.
        # sample_rate, signal = scipy.io.wavfile.read(filename)
        # signal = signal[0:int(1.5 * sample_rate)]  # Keep the first 1.5 seconds
        # plt.plot(signal)
        # plt.show()

        # Pre-emphasis step: amplification of the high frequencies (HF) to
        # (1) balance the frequency spectrum, since HF usually have smaller magnitudes than LF,
        # (2) avoid numerical problems during the Fourier transform operation, and
        # (3) possibly improve the Signal-to-Noise Ratio (SNR).
        pre_emphasis = 0.97
        emphasized_signal = numpy.append(signal[0], signal[1:] - pre_emphasis * signal[:-1])
        # plt.plot(emphasized_signal)
        # plt.show()

        # Next, we split the signal into short time windows. We can safely assume
        # that an audio signal is stationary over a short period of time. The window
        # size is set by frame_size, while the overlap between consecutive windows
        # is controlled by frame_stride.
        frame_size = 0.025
        frame_stride = 0.01
        frame_length, frame_step = frame_size * sample_rate, frame_stride * sample_rate  # Convert from seconds to samples
        signal_length = len(emphasized_signal)
        frame_length = int(round(frame_length))
        frame_step = int(round(frame_step))
        # Make sure that we have at least 1 frame
        num_frames = int(numpy.ceil(float(numpy.abs(signal_length - frame_length)) / frame_step))

        pad_signal_length = num_frames * frame_step + frame_length
        z = numpy.zeros((pad_signal_length - signal_length))
        # Pad the signal so that all frames have an equal number of samples
        # without truncating any samples from the original signal
        pad_signal = numpy.append(emphasized_signal, z)

        indices = (numpy.tile(numpy.arange(0, frame_length), (num_frames, 1))
                   + numpy.tile(numpy.arange(0, num_frames * frame_step, frame_step), (frame_length, 1)).T)
        frames = pad_signal[indices.astype(numpy.int32, copy=False)]

        # Apply Hamming windows. The rationale is to counteract the FFT's assumption
        # that the data is infinite and to reduce spectral leakage.
        frames *= numpy.hamming(frame_length)

        # Fourier transform and power spectrum
        nfft = 2048
        mag_frames = numpy.absolute(numpy.fft.rfft(frames, nfft))  # Magnitude of the FFT
        pow_frames = ((1.0 / nfft) * (mag_frames ** 2))  # Power spectrum

        # Transform the FFT to the Mel scale
        nfilt = 40
        low_freq_mel = 0
        high_freq_mel = (2595 * numpy.log10(1 + (sample_rate / 2) / 700))  # Convert Hz to Mel
        mel_points = numpy.linspace(low_freq_mel, high_freq_mel, nfilt + 2)  # Equally spaced in Mel scale
        hz_points = (700 * (10 ** (mel_points / 2595) - 1))  # Convert Mel to Hz
        bin = numpy.floor((nfft + 1) * hz_points / sample_rate)

        fbank = numpy.zeros((nfilt, int(numpy.floor(nfft / 2 + 1))))
        for m in range(1, nfilt + 1):
            f_m_minus = int(bin[m - 1])  # left
            f_m = int(bin[m])  # center
            f_m_plus = int(bin[m + 1])  # right

            for k in range(f_m_minus, f_m):
                fbank[m - 1, k] = (k - bin[m - 1]) / (bin[m] - bin[m - 1])
            for k in range(f_m, f_m_plus):
                fbank[m - 1, k] = (bin[m + 1] - k) / (bin[m + 1] - bin[m])

        filter_banks = numpy.dot(pow_frames, fbank.T)
        filter_banks = numpy.where(filter_banks == 0, numpy.finfo(float).eps, filter_banks)  # Numerical stability
        filter_banks = 20 * numpy.log10(filter_banks)  # dB

        return (filter_banks / np.amax(filter_banks)) * 255

I can produce images that look like:

However, in some cases my spectrogram looks like:

Something strange is happening: at the beginning of the signal there are some blue stripes in the image, and I cannot tell whether they mean something or whether there is an error in how the spectrogram is calculated. I guess the issue is related to normalization, but I am not sure what exactly it is.
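One way to probe the normalization hypothesis is to compare dividing by the maximum with a clipped min-max rescaling. The sketch below is only an illustration under assumptions: `to_uint8_image` and the `floor_db` parameter are hypothetical names, and the tiny `demo` array stands in for a real dB spectrogram in which one silent frame sits at the `eps` floor (about -313 dB).

```python
import numpy as np

def to_uint8_image(spec_db, floor_db=-80.0):
    """Min-max scale a dB spectrogram to 0..255 (hypothetical helper)."""
    spec_db = np.asarray(spec_db, dtype=float)
    # Clip very low values (for example the eps floor of silent frames)
    # so that they do not dominate the colour range.
    clipped = np.maximum(spec_db, spec_db.max() + floor_db)
    lo, hi = clipped.min(), clipped.max()
    if hi == lo:  # constant input: avoid division by zero
        return np.zeros(clipped.shape, dtype=np.uint8)
    scaled = (clipped - lo) / (hi - lo)
    return np.round(scaled * 255).astype(np.uint8)

# A silent frame at the eps floor next to ordinary dB values.
demo = np.array([[20 * np.log10(np.finfo(float).eps), -30.0],
                 [-10.0, 5.0]])
print(demo / np.amax(demo) * 255)  # dividing by the max alone gives huge negative values
print(to_uint8_image(demo))        # clipped min-max scaling stays within 0..255
```

If the stripes coincide with silent or zero-padded frames, the first print shows how a single floor value can stretch the colour range far beyond the rest of the data.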

**EDIT:** I tried the librosa function that was recommended:

    import librosa
    import librosa.display as lidis  # assumed: lidis is an alias for librosa.display
    import matplotlib.pyplot as plt
    from matplotlib import cm

    sig, rate = librosa.load("audio.wav", sr=None)
    spectrogram = librosa.feature.melspectrogram(y=sig, sr=rate)
    spec_shape = spectrogram.shape
    fig = plt.figure(figsize=(spec_shape), dpi=5)
    lidis.specshow(spectrogram.T, cmap=cm.jet)
    plt.tight_layout()
    plt.savefig("spec.jpg")

The spec now is almost everywhere dark blue:
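A mel spectrogram holds power values, and on a linear colour scale a few large values can push everything else toward the bottom of the colormap. The sketch below shows a decibel conversion in plain NumPy; `power_to_db_sketch`, `amin` and `top_db` are hypothetical names chosen for this illustration, and the input array is made up. librosa ships a similar helper, `librosa.power_to_db`.

```python
import numpy as np

def power_to_db_sketch(power, amin=1e-10, top_db=80.0):
    """Convert a power spectrogram to dB relative to its maximum (hypothetical helper)."""
    power = np.maximum(np.asarray(power, dtype=float), amin)  # avoid log10(0)
    db = 10.0 * np.log10(power / power.max())
    return np.maximum(db, -top_db)  # limit the dynamic range

# Made-up power values spanning several orders of magnitude.
power = np.array([[1.0, 0.1],
                  [0.001, 0.0]])
print(power_to_db_sketch(power))
```

Plotting the dB values instead of the raw power usually spreads the colours over the whole range, which may be worth trying before concluding that the spectrogram itself is wrong.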

Understanding neural networks using Python and Numpy by coding

If you are a **junior data scientist** who sort of understands how neural nets work, or a **machine learning enthusiast** who only knows a little about **deep learning**, this is the article that you cannot miss. Here is **how you can build a neural net from scratch using NumPy** in **9 steps**, from data pre-processing to back-propagation: a must-do practice.

Basic understanding of **machine learning**, **artificial neural network**, **Python syntax**, and programming logic is preferred (but not necessary as you can learn on the go).

*Codes are available on **Github**.*

Step one. Import NumPy. Seriously.

    import numpy as np
    np.random.seed(42)  # for reproducibility

2. Data Generation

**Deep learning** is data-hungry. Although there are many clean datasets available online, we will generate our own for simplicity: for inputs **a** and **b**, we have outputs **a+b**, **a-b**, and **|a-b|**. 10,000 data points are generated.

    X_num_row, X_num_col = [2, 10000]  # Row is no. of feature, col is no. of datum points
    X_raw = np.random.rand(X_num_row, X_num_col) * 100

    # for input a and b, output is a+b; a-b and |a-b|
    y_raw = np.concatenate(([(X_raw[0,:] + X_raw[1,:])],
                            [(X_raw[0,:] - X_raw[1,:])],
                            np.abs([(X_raw[0,:] - X_raw[1,:])])))
    y_num_row, y_num_col = y_raw.shape

3. Train-test Splitting

Our dataset is split into training (70%) and testing (30%) set. Only training set is leveraged for tuning neural networks. Testing set is used only for performance evaluation when the training is complete.

    train_ratio = 0.7
    num_train_datum = int(train_ratio*X_num_col)
    X_raw_train = X_raw[:,0:num_train_datum]
    X_raw_test = X_raw[:,num_train_datum:]
    y_raw_train = y_raw[:,0:num_train_datum]
    y_raw_test = y_raw[:,num_train_datum:]

4. Data Standardization

Data in the training set is standardized so that the distribution for each standardized feature is zero-mean and unit-variance. The scalers generated from the abovementioned procedure can then be applied to the testing set.

    class scaler:
        def __init__(self, mean, std):
            self.mean = mean
            self.std = std

    def get_scaler(row):
        mean = np.mean(row)
        std = np.std(row)
        return scaler(mean, std)

    def standardize(data, scaler):
        return (data - scaler.mean) / scaler.std

    def unstandardize(data, scaler):
        return (data * scaler.std) + scaler.mean

    # Construct scalers from training set
    X_scalers = [get_scaler(X_raw_train[row,:]) for row in range(X_num_row)]
    X_train = np.array([standardize(X_raw_train[row,:], X_scalers[row]) for row in range(X_num_row)])
    y_scalers = [get_scaler(y_raw_train[row,:]) for row in range(y_num_row)]
    y_train = np.array([standardize(y_raw_train[row,:], y_scalers[row]) for row in range(y_num_row)])

    # Apply those scalers to testing set
    X_test = np.array([standardize(X_raw_test[row,:], X_scalers[row]) for row in range(X_num_row)])
    y_test = np.array([standardize(y_raw_test[row,:], y_scalers[row]) for row in range(y_num_row)])

    # Check if data has been standardized
    print([X_train[row,:].mean() for row in range(X_num_row)])  # should be close to zero
    print([X_train[row,:].std() for row in range(X_num_row)])  # should be close to one
    print([y_train[row,:].mean() for row in range(y_num_row)])  # should be close to zero
    print([y_train[row,:].std() for row in range(y_num_row)])  # should be close to one

The scaler therefore does not contain any information from our testing set. We do not want our neural net to gain any information regarding testing set before network tuning.

We have now completed the data pre-processing procedures in **4 steps**.

We represent a 'layer' as a class in Python. Every layer (except the input layer) has a weight matrix **W**, a bias vector **b**, and an activation function. Each layer is appended to a list called `neural_net`.

    class layer:
        def __init__(self, layer_index, is_output, input_dim, output_dim, activation):
            self.layer_index = layer_index  # zero indicates input layer
            self.is_output = is_output  # true indicates output layer, false otherwise
            self.input_dim = input_dim
            self.output_dim = output_dim
            self.activation = activation

            # the multiplication constant is sorta arbitrary
            if layer_index != 0:
                self.W = np.random.randn(output_dim, input_dim) * np.sqrt(2/input_dim)
                self.b = np.random.randn(output_dim, 1) * np.sqrt(2/input_dim)

    # Change layers_dim to configure your own neural net!
    layers_dim = [X_num_row, 4, 4, y_num_row]  # input layer --- hidden layers --- output layers
    neural_net = []

    # Construct the net layer by layer
    for layer_index in range(len(layers_dim)):
        if layer_index == 0:  # if input layer
            neural_net.append(layer(layer_index, False, 0, layers_dim[layer_index], 'irrelevant'))
        elif layer_index+1 == len(layers_dim):  # if output layer
            neural_net.append(layer(layer_index, True, layers_dim[layer_index-1], layers_dim[layer_index], activation='linear'))
        else:
            neural_net.append(layer(layer_index, False, layers_dim[layer_index-1], layers_dim[layer_index], activation='relu'))

    # Simple check on overfitting
    pred_n_param = sum([(layers_dim[layer_index]+1)*layers_dim[layer_index+1] for layer_index in range(len(layers_dim)-1)])
    act_n_param = sum([neural_net[layer_index].W.size + neural_net[layer_index].b.size for layer_index in range(1,len(layers_dim))])
    print(f'Predicted number of parameters: {pred_n_param}')
    print(f'Actual number of parameters: {act_n_param}')
    print(f'Number of data: {X_num_col}')

    if act_n_param >= X_num_col:
        raise Exception('It will overfit.')

Finally, we do a sanity check on the number of trainable parameters (the weights and biases, which the original text calls hyperparameters), both by using the following formula and by counting. The number of data points available should exceed the number of parameters; otherwise the network is very likely to overfit.

N^l is the number of parameters at the l-th layer; L is the number of layers (excluding the input layer)
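The count can be reproduced from the layer sizes alone: a fully connected layer with n_in inputs and n_out outputs holds (n_in + 1) * n_out values, the +1 covering the bias. A minimal sketch, assuming the layer sizes 2, 4, 4, 3 that `layers_dim` evaluates to for this dataset (`count_parameters` is a hypothetical helper name):

```python
def count_parameters(layers_dim):
    """Weights plus biases of a fully connected net with the given layer sizes."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layers_dim[:-1], layers_dim[1:]))

# (2+1)*4 + (4+1)*4 + (4+1)*3 = 12 + 20 + 15
print(count_parameters([2, 4, 4, 3]))
```

With 47 parameters against 10,000 data points, the check in the article's code passes comfortably.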

6. Forward Propagation

We define a function for forward propagation given a certain set of weights and biases. The connection between layers is defined in matrix form as:

σ is element-wise activation function, superscript T means transpose of a matrix
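Read off the `forward_prop` code, the layer-to-layer relation can be written as follows. This is a reconstruction from the code rather than the article's original figure; the symbols Z and A follow the attribute names used in the code.

```latex
Z^{l} = W^{l} A^{l-1} + b^{l}, \qquad
A^{l} = \sigma\!\left(Z^{l}\right), \qquad
l = 1, \dots, L, \qquad A^{0} = X
```

Here W^l has shape (output_dim, input_dim), so no transpose is needed in the forward pass; the transpose appears only in back-propagation.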

Activation functions are defined one by one. ReLU is implemented as **a → max(a,0)**, whereas a sigmoid function should return **a → 1/(1 + e^(−a))**; the code below implements only ReLU and the linear (identity) activation.
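For readers who want to add a sigmoid, a minimal sketch follows. It is not part of the article's code: `sigmoid` and `sigmoid_derivative_from_output` are hypothetical names, and the derivative is written in terms of the activation output A because the article's `get_dactivation` also receives A rather than Z.

```python
import numpy as np

def sigmoid(a):
    """Element-wise logistic function 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_derivative_from_output(A):
    """Derivative of the sigmoid expressed through its output A = sigmoid(Z)."""
    return A * (1.0 - A)

A = sigmoid(np.array([[0.0, 2.0], [-2.0, 0.0]]))
print(A)
print(sigmoid_derivative_from_output(A))
```

For very negative inputs `np.exp(-a)` can overflow and emit a warning, although the result still rounds to 0; a production version would typically guard against that.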

    def activation(input_, act_func):
        if act_func == 'relu':
            return np.maximum(input_, np.zeros(input_.shape))
        elif act_func == 'linear':
            return input_
        else:
            raise Exception('Activation function is not defined.')

    def forward_prop(input_vec, layers_dim=layers_dim, neural_net=neural_net):
        neural_net[0].A = input_vec  # Define A in input layer for for-loop convenience
        for layer_index in range(1,len(layers_dim)):  # W,b,Z,A are undefined in input layer
            neural_net[layer_index].Z = np.add(np.dot(neural_net[layer_index].W, neural_net[layer_index-1].A), neural_net[layer_index].b)
            neural_net[layer_index].A = activation(neural_net[layer_index].Z, neural_net[layer_index].activation)
        return neural_net[layer_index].A

This is the trickiest part, and one that many of us do not fully understand. Once we have defined a loss metric *e* for evaluating performance, we would like to know how the loss metric changes when we perturb each weight or bias.

We want to know how sensitive the loss metric is to each weight and bias.

This is represented by partial derivatives **∂e/∂W** (denoted dW in code) and **∂e/∂b** (denoted db in code) respectively, and can be calculated analytically.
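The idea of perturbing a weight and watching the loss can be checked numerically. The sketch below compares a finite-difference estimate of ∂e/∂W with the analytic gradient for a single linear layer. It is a toy setup under assumptions: the shapes, the random data and the loss (half the sum of squared errors for one datum) are chosen for illustration and differ from the article's `get_loss`.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))  # weights of one linear layer
b = rng.normal(size=(3, 1))  # biases
x = rng.normal(size=(2, 1))  # one input datum
y = rng.normal(size=(3, 1))  # its target

def loss(W, b):
    y_hat = W @ x + b
    return 0.5 * np.sum((y_hat - y) ** 2)

# Analytic gradient for this linear layer: dZ = y_hat - y, dW = dZ x^T
dZ = (W @ x + b) - y
dW_analytic = dZ @ x.T

# Numerical gradient: perturb one weight at a time (central difference)
eps = 1e-6
dW_numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        dW_numeric[i, j] = (loss(W_plus, b) - loss(W_minus, b)) / (2 * eps)

print(np.max(np.abs(dW_numeric - dW_analytic)))  # should be very small
```

The same check can be pointed at a full network to catch mistakes in a hand-written backward pass, at the cost of one forward pass per perturbed weight.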

⊙ represents element-wise multiplication

These back-propagation equations assume that only one datum *y* is compared. The gradient update would then be very noisy, since each iteration depends on a single data point. Multiple data points can be used to reduce the noise, where **∂W(y1, y2, …)** would be the mean of **∂W(y1)**, **∂W(y2)**, …, and likewise for **∂b**.
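Read off the `backward_prop` code below, the batch-averaged equations can be written as follows. This is a reconstruction from the code, not the article's original figure; m stands for `num_train_datum`, and the output layer is linear, so its activation derivative is 1.

```latex
\begin{aligned}
dZ^{L} &= \hat{y} - y \\
dZ^{l} &= \left(W^{l+1}\right)^{T} dZ^{l+1} \odot \sigma'\!\left(Z^{l}\right), \qquad l = L-1, \dots, 1 \\
dW^{l} &= \frac{1}{m}\, dZ^{l} \left(A^{l-1}\right)^{T} \\
db^{l} &= \frac{1}{m} \sum_{i=1}^{m} dZ^{l}_{:,i}
\end{aligned}
```

For ReLU, σ'(Z) is 1 where Z > 0 and 0 elsewhere, which is what `get_dactivation` computes from the sign of A.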

    def get_loss(y, y_hat, metric='mse'):
        if metric == 'mse':
            individual_loss = 0.5 * (y_hat - y) ** 2
            return np.mean([np.linalg.norm(individual_loss[:,col], 2) for col in range(individual_loss.shape[1])])
        else:
            raise Exception('Loss metric is not defined.')

    def get_dZ_from_loss(y, y_hat, metric):
        if metric == 'mse':
            return y_hat - y
        else:
            raise Exception('Loss metric is not defined.')

    def get_dactivation(A, act_func):
        if act_func == 'relu':
            return np.maximum(np.sign(A), np.zeros(A.shape))  # 1 if backward input >0, 0 otherwise; then diagonalize
        elif act_func == 'linear':
            return np.ones(A.shape)
        else:
            raise Exception('Activation function is not defined.')

    def backward_prop(y, y_hat, metric='mse', layers_dim=layers_dim, neural_net=neural_net, num_train_datum=num_train_datum):
        for layer_index in range(len(layers_dim)-1,0,-1):
            if layer_index+1 == len(layers_dim):  # if output layer
                dZ = get_dZ_from_loss(y, y_hat, metric)
            else:
                dZ = np.multiply(np.dot(neural_net[layer_index+1].W.T, dZ),
                                 get_dactivation(neural_net[layer_index].A, neural_net[layer_index].activation))
            dW = np.dot(dZ, neural_net[layer_index-1].A.T) / num_train_datum
            db = np.sum(dZ, axis=1, keepdims=True) / num_train_datum

            neural_net[layer_index].dW = dW
            neural_net[layer_index].db = db

We now have every building block for training a neural network.

Once we know the sensitivities of weights and biases, we try to **minimize** (hence the minus sign) the loss metric iteratively by gradient descent using the following update rule:

W = W - learning_rate * ∂W

b = b - learning_rate * ∂b
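The update rule is easiest to see on a one-dimensional toy problem. The sketch below minimizes e(w) = (w - 3)^2, whose gradient is 2(w - 3); `gradient_descent`, the step count and the learning rate are illustrative choices, not values from the article.

```python
def gradient_descent(grad, w0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(n_steps):
        w = w - learning_rate * grad(w)  # same rule as W = W - learning_rate * dW
    return w

# Gradient of e(w) = (w - 3)^2 is 2 * (w - 3); the minimum is at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)
```

Each step shrinks the distance to the minimum by a constant factor here; with a learning rate that is too large the same loop would overshoot and diverge.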

    learning_rate = 0.01
    max_epoch = 1000000

    for epoch in range(1,max_epoch+1):
        y_hat_train = forward_prop(X_train)  # update y_hat
        backward_prop(y_train, y_hat_train)  # update (dW,db)

        for layer_index in range(1,len(layers_dim)):  # update (W,b)
            neural_net[layer_index].W = neural_net[layer_index].W - learning_rate * neural_net[layer_index].dW
            neural_net[layer_index].b = neural_net[layer_index].b - learning_rate * neural_net[layer_index].db

        if epoch % 100000 == 0:
            print(f'{get_loss(y_train, y_hat_train):.4f}')

Training loss should be going down as it iterates

9. Testing

The model generalizes well if the testing loss is not much higher than the training loss. We also make some test cases to see how the model performs.

    # test loss
    print(get_loss(y_test, forward_prop(X_test)))

    def predict(X_raw_any):
        X_any = np.array([standardize(X_raw_any[row,:], X_scalers[row]) for row in range(X_num_row)])
        y_hat = forward_prop(X_any)
        y_hat_any = np.array([unstandardize(y_hat[row,:], y_scalers[row]) for row in range(y_num_row)])
        return y_hat_any

    predict(np.array([[30,70],[70,30],[3,5],[888,122]]).T)

This is how you can build a neural net from scratch using NumPy in **9 steps**.

My implementation is by no means the most efficient way to build and train a neural net. There is much room for improvement, but that is a story for another day. The code is available on GitHub. Happy coding!

**Thanks for reading** ❤
