Building a Two-Layer Neural Network From Scratch Using Python
An in-depth tutorial on setting up a neural network
Hello AI fans! I am so excited to share with you how to build a neural network with a hidden layer! Follow along and let’s get started!
Importing Libraries
The only library we need for this tutorial is NumPy.
import numpy as np
Activation Function
In the hidden layer we will use the tanh activation function, and in the output layer we will use the sigmoid function. It is easy to find information on both functions and their graphs, so I don’t want to bore you with explanations; I will just implement the sigmoid, since NumPy already provides np.tanh for us.

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
Setting Parameters
What are parameters and hyperparameters? Parameters are the weights and biases. Hyperparameters affect the parameters and are set before learning begins. Setting hyperparameters perfectly right at first is not a piece of cake; you’ll need to tinker and tweak your values. The learning rate, the number of iterations, and the regularization rate, among others, can all be considered hyperparameters.
Wondering how to set the matrix sizes? For any layer L, the weight matrix W[L] has shape (number of neurons in layer L, number of neurons in layer L−1), and the bias vector b[L] has shape (number of neurons in layer L, 1).

What does all that mean? For example:
(layer 0, so L = 0) number of neurons in the input layer = 3
(layer 1, so L = 1) number of neurons in the hidden layer = 5
(layer 2, so L = 2) number of neurons in the output layer = 1
So W1 has shape (5, 3), b1 is (5, 1), W2 is (1, 5), and b2 is (1, 1).

I hope this all makes sense! Let’s set the parameters:
We define W1, b1, W2, and b2. It doesn’t hurt to set your biases to zero at first. However, be very careful when initializing weights: never set them all to zero. Why exactly? Because in Z = Wx + b, Z would start out at zero, and worse, every neuron in a layer would compute the same value and receive the same gradient update, so a multi-layer neural network would behave as if each layer had a single neuron. So how do we initialize the weights? I use He initialization.

# He initialization, e.g. for the output-layer weights
np.random.randn(output_size, hidden_size) * np.sqrt(2 / hidden_size)
You don’t have to use He initialization; you can also use this:
np.random.randn(output_size, hidden_size) * 0.01
I’d recommend never setting weights to zero or a big number when initializing parameters.
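Putting this together, here is a minimal sketch of a parameter-initialization function. It uses He initialization for the weights and zeros for the biases; the name setParameters and its arguments are just one possible choice, not a fixed API.

# Parameter initialization (a sketch): He init for weights, zeros for biases.
def setParameters(X, Y, hidden_size):
    input_size = X.shape[0]     # neurons in the input layer
    output_size = Y.shape[0]    # neurons in the output layer
    W1 = np.random.randn(hidden_size, input_size) * np.sqrt(2 / input_size)
    b1 = np.zeros((hidden_size, 1))
    W2 = np.random.randn(output_size, hidden_size) * np.sqrt(2 / hidden_size)
    b2 = np.zeros((output_size, 1))
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}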
Forward Propagation

The diagram above should give you a good idea of what forward propagation is. Here is one way to implement it in Python:
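This sketch assumes tanh in the hidden layer and sigmoid at the output, with function and variable names of my own choosing:

def forwardPropagation(X, params):
    Z1 = np.dot(params['W1'], X) + params['b1']    # linear step, hidden layer
    A1 = np.tanh(Z1)                               # tanh activation
    Z2 = np.dot(params['W2'], A1) + params['b2']   # linear step, output layer
    y = sigmoid(Z2)                                # sigmoid activation
    return y, {'Z1': Z1, 'Z2': Z2, 'A1': A1, 'y': y}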
Why are we storing {'Z1': Z1, 'Z2': Z2, 'A1': A1, 'y': y}? Because we will use these values when back-propagating.
Cost Function
We just looked at forward propagation and obtained a prediction (y). Now we need to measure how far that prediction is from the actual output, and we do that with a cost function. The graph below, which plots the cost against a parameter value, explains the idea:


We update our parameters and search for the values that give us the minimum possible cost. I’m not going to delve into derivatives here, but note that on the graph above, if you are on the right side of the parabola, the derivative (slope) is positive, so the parameter decreases and moves left, approaching the value that returns the minimum cost. On the left side, the slope is negative, so the parameter increases towards the value we want. Let’s look at the cost function we will use, the cross-entropy cost:
J = −(1/m) · Σ [ Y · log(y) + (1 − Y) · log(1 − y) ]

Python code for the cost function:
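Here is a minimal sketch, assuming the cross-entropy cost above (the function name and arguments are my own):

# Cross-entropy cost (a sketch). y: predictions of shape (1, m); Y: labels of shape (1, m).
def cost(y, Y):
    m = Y.shape[1]    # number of examples
    return float(-np.sum(Y * np.log(y) + (1 - Y) * np.log(1 - y)) / m)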
Backpropagation
We’ve found the cost; now let’s go back and find the derivatives of the cost with respect to our weights and biases. In a future piece, I plan to show you how to derive them step by step.
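For now, here is a sketch of what backpropagation could look like for our network, assuming the cross-entropy cost and the tanh/sigmoid activations used above (the names are my own):

# Backpropagation for the two-layer network (a sketch).
def backPropagation(X, Y, params, cache):
    m = X.shape[1]                                   # number of examples
    dZ2 = cache['y'] - Y                             # output error (sigmoid + cross-entropy)
    dW2 = np.dot(dZ2, cache['A1'].T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = np.dot(params['W2'].T, dZ2) * (1 - np.power(cache['A1'], 2))   # tanh derivative
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {'dW1': dW1, 'db1': db1, 'dW2': dW2, 'db2': db2}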
What are params and cache in def backPropagation(X, Y, params, cache)? When we run forward propagation, we store values to use during backpropagation: cache holds those stored values, and params are the parameters (weights and biases).
Updating Parameters
Now that we have our derivatives, we can use the gradient-descent update rule below:
W = W − α · dW
b = b − α · db
In those equations, alpha (α) is the learning rate hyperparameter. We need to set it to some value before learning begins. The term to the right of the learning rate is the derivative. We know alpha and the derivatives, so let’s update our parameters:
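A minimal sketch of the update step, using the gradient names from the backpropagation sketch above (the function name is my own):

# Gradient-descent parameter update (a sketch).
def updateParameters(params, grads, learning_rate):
    W1 = params['W1'] - learning_rate * grads['dW1']
    b1 = params['b1'] - learning_rate * grads['db1']
    W2 = params['W2'] - learning_rate * grads['dW2']
    b2 = params['b2'] - learning_rate * grads['db2']
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}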
All About Loops
We need to run many iterations to find the parameters that return the minimum cost. Let’s loop it!
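Here is one way the training loop could be put together from the sketches above; the signature matches the call fit(X, Y, 0.3, 5, 5000) used later, and the names are my own:

# Training loop (a sketch): forward pass, cost, backward pass, update, repeated.
def fit(X, Y, learning_rate, hidden_size, number_of_iterations=5000):
    params = setParameters(X, Y, hidden_size)
    cost_ = []
    for j in range(number_of_iterations):
        y, cache = forwardPropagation(X, params)
        cost_.append(cost(y, Y))                       # record the cost at every iteration
        grads = backPropagation(X, Y, params, cache)
        params = updateParameters(params, grads, learning_rate)
    return params, cost_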
hidden_size means the number of neurons in the hidden layer. It is a hyperparameter, because you set it before learning begins! And what does return params, cost_ tell us? params are the best parameters we found, and cost_ is simply the cost we recorded at every iteration.
Let’s Try Our Code!
Use sklearn to create a dataset.
import sklearn.datasets
X, Y = sklearn.datasets.make_moons(n_samples=500, noise=.2)
X, Y = X.T, Y.reshape(1, Y.shape[0])
X is the input and Y is the actual output.
params, cost_ = fit(X, Y, 0.3, 5, 5000)
I set the learning rate to 0.3, the number of neurons in the hidden layer to 5 and the number of iterations to 5000.
Feel free to try with different values.
Let’s draw a graph showing how the cost changed with every iteration:
import matplotlib.pyplot as plt
plt.plot(cost_)
plt.show()

Bingo! We did it!
first_cost = 0.7383781203733911
last_cost = 0.06791109327547613
Full code: put the snippets above together (the imports, sigmoid, setParameters, forwardPropagation, cost, backPropagation, updateParameters, and fit) and you have the complete program.
Thank you for reading! I hope this tutorial was helpful!