Build your own AlphaZero AI using Python and Keras

How to build your own AlphaZero AI using Python and Keras

Teach a machine to learn Connect4 strategy through self-play and deep learning

In this article I’ll attempt to cover three things:

  1. Two reasons why AlphaZero is a massive step forward for Artificial Intelligence
  2. How you can build a replica of the AlphaZero methodology to play the game Connect4
  3. How you can adapt the code to plug in other games

First, a quick note about a new platform, The Network, a place where data scientists can find paid contract projects with businesses!

AlphaGo → AlphaGo Zero → AlphaZero

In March 2016, DeepMind’s AlphaGo beat 18-time world champion Go player Lee Sedol 4–1 in a series watched by over 200 million people. A machine had learnt a super-human strategy for playing Go, a feat previously thought impossible, or at the very least a decade away from being accomplished.

This in itself was a remarkable achievement. However, on 18th October 2017, DeepMind took a giant leap further.

The paper Mastering the Game of Go without Human Knowledge unveiled a new variant of the algorithm, AlphaGo Zero, that had defeated AlphaGo 100–0. Incredibly, it had done so by learning solely through self-play, starting ‘tabula rasa’ (blank slate) and gradually finding strategies that would beat previous incarnations of itself. No longer was a database of human expert games required to build a super-human AI.

A mere 48 days later, on 5th December 2017, DeepMind released another paper, ‘Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm’, showing how AlphaGo Zero could be adapted to beat the world-champion programs Stockfish and Elmo at chess and shogi. The entire learning process, from being shown the games for the first time to becoming the best computer program in the world, had taken under 24 hours.

With this, AlphaZero was born — the general algorithm for getting good at something, quickly, without any prior knowledge of human expert strategy.

There are two amazing things about this achievement:

First, AlphaZero requires no human expertise as input.
It cannot be overstated how important this is. This means that the underlying methodology of AlphaGo Zero can be applied to any game with perfect information (the game state is fully known to both players at all times), because no prior expertise is required beyond the rules of the game.

This is how it was possible for DeepMind to publish the chess and shogi paper only 48 days after the original AlphaGo Zero paper. Quite literally, all that needed to change was the input file that describes the mechanics of the game and the hyper-parameters relating to the neural network and Monte Carlo tree search.

Second, the algorithm is strikingly simple.
If AlphaZero used super-complex algorithms that only a handful of people in the world understood, it would still be an incredible achievement. What makes it extraordinary is that a lot of the ideas in the paper are actually far less complex than previous versions. At its heart lies a beautifully simple mantra for learning: play through possible future moves, evaluate the positions you reach, and afterwards go back and correct where you misjudged either the value of a position or the moves your opponent was likely to play.
Doesn’t that sound a lot like how you learn to play games? When you play a bad move, it’s either because you misjudged the future value of resulting positions, or you misjudged the likelihood that your opponent would play a certain move, so didn’t think to explore that possibility. These are exactly the two aspects of gameplay that AlphaZero is trained to learn.

How to build your own AlphaZero

Firstly, check out the AlphaGo Zero cheat sheet for a high-level understanding of how AlphaGo Zero works. It’s worth having that to refer to as we walk through each part of the code. There’s also a great article here that explains how AlphaZero works in more detail.

The code

Clone this Git repository, which contains the code I’ll be referencing.

To start the learning process, run the top two panels in the run.ipynb Jupyter notebook. Once it’s built up enough game positions to fill its memory, the neural network will begin training. Through additional self-play and training, it will gradually get better at predicting the game value and next moves from any position, resulting in better decision making and smarter overall play.

We’ll now have a look at the code in more detail, and show some results that demonstrate the AI getting stronger over time.

N.B — This is my own understanding of how AlphaZero works based on the information available in the papers referenced above. If any of the below is incorrect, apologies and I’ll endeavour to correct it!


The game that our algorithm will learn to play is Connect4 (or Four In A Row). Not quite as complex as Go… but there are still 4,531,985,219,092 game positions in total.

The game rules are straightforward. Players take turns dropping a piece of their colour into the top of any available column. The first player to get four of their colour in a row, whether vertically, horizontally or diagonally, wins. If the entire grid is filled without a four-in-a-row being created, the game is drawn.

Here’s a summary of the key files that make up the codebase:

This file contains the game rules for Connect4.

Each square is allocated a number from 0 to 41.

The file gives the logic behind moving from one game state to another, given a chosen action. For example, given the empty board and action 38, the takeAction method returns a new game state, with the starting player’s piece at the bottom of the centre column.
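To make the numbering concrete, here is a minimal sketch of how a square number can map onto the grid and how a move can produce a new state. It assumes squares are numbered row by row from the top-left corner, which is consistent with action 38 landing at the bottom of the centre column. The names square_index and take_action and the flat-list board are illustrative choices, not the repository's actual implementation.

```python
ROWS, COLS = 6, 7  # standard Connect4 grid: 42 squares numbered 0..41


def square_index(row, col):
    """Map (row, col) to a square number, with row 0 at the top (assumed layout)."""
    return row * COLS + col


def take_action(board, action, player):
    """Return a new board with `player`'s piece placed on square `action`.

    `board` is a flat list of 42 ints: 0 for empty, 1 or -1 for each player.
    The input board is copied, never mutated, so earlier states stay valid.
    """
    if board[action] != 0:
        raise ValueError(f"square {action} is already occupied")
    below = action + COLS
    if below < ROWS * COLS and board[below] == 0:
        raise ValueError(f"square {action} would float above an empty square")
    new_board = list(board)
    new_board[action] = player
    return new_board


empty = [0] * (ROWS * COLS)
after = take_action(empty, 38, 1)  # first player drops into the centre column
```

Returning a fresh board rather than editing in place matters for tree search, where many candidate futures branch from the same position.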

You can replace the file with any game file that conforms to the same API and the algorithm will, in principle, learn strategy through self-play, based on the rules you have given it.


This contains the code that starts the learning process. It loads the game rules and then iterates through the main loop of the algorithm, which consists of three stages:

  1. Self-play
  2. Retraining the neural network
  3. Evaluating the neural network

There are two agents involved in this loop, the best_player and the current_player.

The best_player contains the best performing neural network and is used to generate the self-play memories. The current_player then retrains its neural network on these memories and is then pitted against the best_player. If it wins, the neural network inside the best_player is switched for the neural network inside the current_player, and the loop starts again.
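The interplay between the two agents can be sketched roughly as follows. This is a simplified illustration rather than the notebook's code: run_iteration, the dictionary-based players and the toy_* helper functions are hypothetical stand-ins, and the promotion rule here is simply "the challenger won more games".

```python
def run_iteration(best_player, current_player, self_play, retrain, evaluate):
    """Run one pass of the loop and report whether the champion was replaced."""
    memories = self_play(best_player)        # champion generates self-play games
    retrain(current_player, memories)        # challenger learns from those games
    challenger_wins, champion_wins = evaluate(current_player, best_player)
    if challenger_wins > champion_wins:
        # Copy the challenger's network into the champion.
        best_player["weights"] = list(current_player["weights"])
        return True
    return False


# Toy stand-ins so the sketch runs end to end (all hypothetical).
best = {"weights": [0.0]}
current = {"weights": [0.0]}


def toy_self_play(player):
    return [("position", "move probabilities", "game value")]


def toy_retrain(player, memories):
    player["weights"] = [w + len(memories) for w in player["weights"]]


def toy_evaluate(challenger, champion):
    return (1, 0) if challenger["weights"] > champion["weights"] else (0, 1)


promoted = run_iteration(best, current, toy_self_play, toy_retrain, toy_evaluate)
```

Passing the three stages in as functions keeps the loop itself independent of the game and of the network library.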

This contains the Agent class (a player in the game). Each player is initialised with its own neural network and Monte Carlo Search Tree.

The simulate method runs the Monte Carlo Tree Search process. Specifically, the agent moves to a leaf node of the tree, evaluates the node with its neural network and then backfills the value of the node up through the tree.

The act method repeats the simulation multiple times to understand which move from the current position is most favourable. It then returns the chosen action to the game, to enact the move.

The replay method retrains the neural network, using memories from previous games.
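The simulation step described above (move to a leaf, evaluate it, backfill the value) can be sketched as below. This is an illustrative reduction, assuming the usual AlphaZero-style selection rule of maximising Q + U over the edges; the class layout, the cpuct constant and the function names are assumptions, and the leaf value is supplied by hand where the real agent would call its neural network.

```python
import math


class Edge:
    """Statistics for one move: visits N, total value W, mean value Q, prior P."""

    def __init__(self, prior, child):
        self.N, self.W, self.Q, self.P = 0, 0.0, 0.0, prior
        self.child = child


class Node:
    def __init__(self, player_turn):
        self.player_turn = player_turn  # +1 or -1
        self.edges = {}                 # action -> Edge


def move_to_leaf(root, cpuct=1.0):
    """Walk down the tree, always taking the edge with the highest Q + U."""
    node, path = root, []
    while node.edges:
        total_visits = sum(e.N for e in node.edges.values())

        def score(edge):
            # U favours moves with a high prior and few visits so far.
            u = cpuct * edge.P * math.sqrt(total_visits) / (1 + edge.N)
            return edge.Q + u

        action = max(node.edges, key=lambda a: score(node.edges[a]))
        edge = node.edges[action]
        path.append((node, edge))
        node = edge.child
    return node, path


def back_fill(leaf, value, path):
    """Push the leaf's value up the path, flipping the sign for the opponent."""
    for node, edge in path:
        direction = 1 if node.player_turn == leaf.player_turn else -1
        edge.N += 1
        edge.W += value * direction
        edge.Q = edge.W / edge.N


root = Node(player_turn=1)
leaf_a, leaf_b = Node(-1), Node(-1)
root.edges = {0: Edge(0.2, leaf_a), 1: Edge(0.8, leaf_b)}

leaf, path = move_to_leaf(root)
back_fill(leaf, value=-0.5, path=path)  # value is from the leaf player's view
```

Here `value` is expressed from the point of view of the player to move at the leaf, so a position that is bad for the opponent raises the mean value Q of the edge that led to it.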

This file contains the Residual_CNN class, which defines how to build an instance of the neural network.

It uses a condensed version of the neural network architecture in the AlphaGo Zero paper: a convolutional layer, followed by many residual layers, then splitting into a value head and a policy head.

The depth and number of convolutional filters can be specified in the config file.
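As a rough picture of that architecture, the sketch below builds a small network of the same overall shape with the Keras functional API. It is not the Residual_CNN class from the repository: the input layout (6 x 7 board with two planes, channels last), the filter count, the number of residual layers and the head sizes are all illustrative assumptions.

```python
from tensorflow.keras import layers, Model


def conv_block(x, filters):
    """Convolution, batch normalisation, then a leaky ReLU activation."""
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.LeakyReLU()(x)


def residual_block(x, filters):
    """Two convolutions with a skip connection around them."""
    shortcut = x
    x = conv_block(x, filters)
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.add([shortcut, x])
    return layers.LeakyReLU()(x)


def build_network(input_shape=(6, 7, 2), filters=32, residual_layers=3, actions=42):
    inputs = layers.Input(shape=input_shape)
    x = conv_block(inputs, filters)
    for _ in range(residual_layers):
        x = residual_block(x, filters)

    # Value head: a single number in [-1, 1] estimating the game outcome.
    v = conv_block_1x1(x, 1)
    v = layers.Flatten()(v)
    v = layers.LeakyReLU()(layers.Dense(20)(v))
    value = layers.Dense(1, activation="tanh", name="value_head")(v)

    # Policy head: one logit per board square.
    p = conv_block_1x1(x, 2)
    p = layers.Flatten()(p)
    policy = layers.Dense(actions, name="policy_head")(p)

    return Model(inputs, [value, policy])


def conv_block_1x1(x, filters):
    """1x1 convolution used to shrink the feature maps before each head."""
    x = layers.Conv2D(filters, 1, use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.LeakyReLU()(x)


model = build_network()
```

The policy head outputs raw logits rather than probabilities, which suits a loss function that masks illegal moves before applying softmax.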

The Keras library is used to build the network, with a TensorFlow backend.

To view individual convolutional filters and densely connected layers in the neural network, run the following inside the run.ipynb notebook:


This contains the Node, Edge and MCTS classes that constitute a Monte Carlo search tree.

The MCTS class contains the moveToLeaf and backFill methods previously mentioned, and instances of the Edge class store the statistics about each potential move.

This is where you set the key parameters that influence the algorithm.

Adjusting these variables will affect the running time, neural network accuracy and overall success of the algorithm. The above parameters produce a high quality Connect4 player, but take a long time to do so. To speed the algorithm up, try the following parameters instead.

Contains the playMatches and playMatchesBetweenVersions functions that play matches between two agents.

To play against your creation, run the following code (it’s also in the run.ipynb notebook):

from game import Game
from funcs import playMatchesBetweenVersions
import loggers as lg

env = Game()
playMatchesBetweenVersions(
    env
    , 1  # the run version number where the computer player is located
    , -1 # the version number of the first player (-1 for human)
    , 12 # the version number of the second player (-1 for human)
    , 10 # how many games to play
    , lg.logger_tourney # where to log the game to
    , 0  # which player to go first - 0 for random
)

When you run the algorithm, all model and memory files are saved in the run folder, in the root directory.

To restart the algorithm from this checkpoint later, transfer the run folder to the run_archive folder, attaching a run number to the folder name. Then, enter the run number, model version number and memory version number into the file, corresponding to the location of the relevant files in the run_archive folder. Running the algorithm as usual will then start from this checkpoint.

An instance of the Memory class stores the memories of previous games, which the algorithm uses to retrain the neural network of the current_player.
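One plausible shape for such a store is sketched below. It is an assumption-laden illustration, not the repository's Memory class: positions from the game in progress are held in a short-term list, then labelled with the final result and moved into a bounded long-term buffer once the game ends. The method names and the dictionary fields are hypothetical.

```python
import random
from collections import deque


class Memory:
    def __init__(self, size):
        self.long_term = deque(maxlen=size)  # oldest positions drop off when full
        self.short_term = []                 # positions from the current game

    def remember(self, state, action_probs, player_turn):
        """Record a position; its value is unknown until the game finishes."""
        self.short_term.append({
            "state": state,
            "action_probs": action_probs,
            "player_turn": player_turn,
            "value": None,
        })

    def end_game(self, winner):
        """Label each position from the mover's view (winner is 1, -1 or 0)."""
        for position in self.short_term:
            position["value"] = winner * position["player_turn"]
            self.long_term.append(position)
        self.short_term = []

    def sample(self, batch_size):
        """Draw a random mini-batch for retraining."""
        count = min(batch_size, len(self.long_term))
        return random.sample(list(self.long_term), count)


memory = Memory(size=3)
for turn in (1, -1, 1, -1):
    memory.remember(state="board", action_probs=[1.0], player_turn=turn)
memory.end_game(winner=1)
batch = memory.sample(2)
```

Using a deque with maxlen means the buffer only ever holds the most recent positions, so training data tracks the current strength of the self-play agent.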

This file contains a custom loss function that masks predictions from illegal moves before passing them to the cross entropy loss function.
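The masking idea can be illustrated with a small NumPy sketch. It assumes that a target probability of exactly zero marks an illegal move, and it pushes the corresponding logit to a large negative number before the softmax; the function name and the mask value of -100 are illustrative choices, not taken from the repository.

```python
import numpy as np


def masked_softmax_cross_entropy(target_probs, logits, mask_value=-100.0):
    """Cross entropy between target move probabilities and masked logits."""
    target_probs = np.asarray(target_probs, dtype=float)
    logits = np.asarray(logits, dtype=float)

    # Assumption: moves with zero target probability are illegal, so their
    # logits are replaced by a large negative number.
    masked = np.where(target_probs == 0.0, mask_value, logits)

    # Numerically stable log-softmax over the last axis.
    shifted = masked - masked.max(axis=-1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    return -(target_probs * log_softmax).sum(axis=-1)


# The network is wrongly confident in an illegal third move; masking removes
# that logit from the softmax, so it no longer inflates the loss.
loss = masked_softmax_cross_entropy([0.5, 0.5, 0.0], [0.0, 0.0, 50.0])
```

Without the mask, the large logit on the illegal move would dominate the softmax and the network would be penalised for something it never needs to learn.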

The locations of the run and run_archive folders.

Log files are saved to the log folder inside the run folder.

To turn on logging, set the values of the logger_disabled variables to False inside this file.

Viewing the log files will help you to understand how the algorithm works and see inside its ‘mind’. For example, here is a sample from the logger.mcts file.

Equally, from the logger.tourney file you can see the probabilities attached to each move during the evaluation phase:


Training over a couple of days produces the following chart of loss against mini-batch iteration number:

The top line is the error in the policy head (the cross entropy of the MCTS move probabilities, against the output from the neural network). The bottom line is the error in the value head (the mean squared error between the actual game value and the neural network’s prediction of the value). The middle line is an average of the two.

Clearly, the neural network is getting better at predicting the value of each game state and the likely next moves. To show how this results in stronger and stronger play, I ran a league between 17 players, ranging from the 1st iteration of the neural network, up to the 49th. Each pairing played twice, with both players having a chance to play first.
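For a sense of scale, the fixture list for a league like this can be generated in a couple of lines. The player ids below are placeholders rather than the actual network versions used.

```python
from itertools import permutations

players = list(range(17))  # placeholder ids for the 17 league entrants

# Each ordered pair is one game: (player moving first, player moving second).
fixtures = list(permutations(players, 2))

games_per_player = sum(1 for first, second in fixtures if 0 in (first, second))
```

With 17 entrants this gives 272 games in total, and every player appears in 32 of them, half of those moving first.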

Here are the final standings:

Clearly, the later versions of the neural network are superior to the earlier versions, winning most of their games. It also appears that the learning hasn’t yet saturated — with further training time, the players would continue to get stronger, learning more and more intricate strategies.

As an example, one clear strategy that the neural network has favoured over time is grabbing the centre column early. Observe the difference between the first version of the algorithm and say, the 30th version:

1st neural network version

30th neural network version

This is a good strategy, as many lines require the centre column: claiming it early ensures your opponent cannot take advantage of it. The neural network has learnt this without any human input.

Learning a different game

There is a file for a game called ‘Metasquares’ in the games folder. This involves placing X and O markers in a grid to try to form squares of different sizes. Larger squares score more points than smaller squares and the player with the most points when the grid is full wins.

If you switch the Connect4 file for the Metasquares file, the same algorithm will learn how to play Metasquares instead.


Hopefully you find this article useful — let me know in the comments below if you find any typos or have questions about anything in the codebase or article and I’ll get back to you as soon as possible.

If you would like to learn more about how our company, Applied Data Science, develops innovative data science solutions for businesses, feel free to get in touch through our website or directly through LinkedIn.

Applied Data Science is a London based consultancy that implements end-to-end data science solutions for businesses, delivering measurable value. If you’re looking to do more with your data, let’s talk.

Top Machine Learning Framework: 5 Machine Learning Frameworks of 2019

Machine Learning (ML) is one of the fastest-growing technologies today. There are many frameworks for building an ML app, so as a developer you may be unsure which one to choose. Here we have curated the top five machine learning frameworks.

Thanks to machine learning frameworks, mobile phones and tablets are becoming powerful enough to run software that can learn and react in real time. ML is a complex discipline, but implementing ML models is far less daunting than it used to be. A model improves its performance over time through interactions, experience and, most importantly, the acquisition of useful data relevant to its task.

ML is considered a subset of Artificial Intelligence (AI): the scientific study of the statistical models and algorithms that help a computing system accomplish designated tasks efficiently. As a mobile app developer choosing a machine learning framework, keep the following things in mind.

The framework should be performance-oriented
It should be quick to learn and to code in
It should support parallelization, so that computation can be distributed
It should include facilities for creating models and provide developer-friendly tools

Let’s look at the top five machine learning frameworks so that you can make the right choice for your next ML application development project. Before we dive into them, note the different types of ML frameworks that are available:

Mathematically oriented
Neural network-based
Linear algebra tools
Statistical tools

Now, let’s look at the ML frameworks themselves to help you select the right one for your ML application.

Don’t Miss Out on These 5 Machine Learning Frameworks of 2019
#1 TensorFlow
TensorFlow is an open-source software library for dataflow programming across a range of tasks. The framework is based on computational graphs, which are essentially networks of nodes. Each node represents a mathematical operation, from something simple to something as complex as multivariate analysis. It is often described as the best of the ML libraries because it supports regression, classification, neural networks and other complicated tasks and algorithms.

Learning TensorFlow’s Python API takes some additional effort. Working with the framework’s n-dimensional arrays becomes easier once you are comfortable with Python and its libraries.

The main benefit of this framework is flexibility. TensorFlow allows non-automatic migration to newer versions. It runs on GPUs, CPUs, servers, desktops and mobile devices, and it provides automatic differentiation and good performance. Large companies such as Airbus, Twitter and IBM have made innovative use of TensorFlow.

#2 Firebase ML Kit
Firebase ML Kit is a library that gives access to highly accurate, pre-trained deep models with minimal code. We at Space-O Technologies use this machine learning technology for image classification and object detection. The Firebase framework offers models both locally and on Google Cloud.

This is one of our ML tutorials to help you understand the Firebase framework. First, we collected photos of an empty glass, a half-full glass and a full glass of water, and fed them to the machine learning algorithms. This helped the machine search for and analyze the nature, behavior and patterns of the object placed in front of it.

The first target was recognizing an empty glass. The app analyzed the photo and searched for the correct answer, using the empty-glass images we had provided before the experiment.
The second target was a half-full glass. The core of a machine learning app is to gather data and organize it according to its analysis. It recognized the image accurately because of the sample images of the glass it had been given beforehand.
The last target was recognizing a full glass.
Note: For correct recognition, each label needs at least 100 images of the object.

#3 CAFFE (Convolutional Architecture for Fast Feature Embedding)
The CAFFE framework is one of the fastest ways to apply deep neural networks. It is best known for its Model Zoo, a collection of pre-trained ML models capable of performing a wide variety of tasks. Image classification, machine vision and recommender systems are some of the tasks this ML library handles easily.

The framework is written mainly in C++. It runs on a range of hardware and can switch between CPU and GPU with a single flag. It has well-organized MATLAB and Python interfaces.

CAFFE is mainly used in academic research projects and startup prototypes, and it suits both research experiments and industry deployment. It can process 60 million images per day with a single NVIDIA K40 GPU.

#4 Apache Spark
Apache Spark is a cluster-computing framework with APIs in languages such as Java, Scala, R and Python. Spark’s machine learning library, MLlib, is considered foundational to Spark’s success. Building MLlib on top of Spark makes it possible to meet distinct needs with a single tool instead of many disjointed ones.

The advantages of such an ML library are a gentler learning curve and less complex development and production environments, which ultimately result in a shorter time to deliver high-performing models. The key benefit of MLlib is that it allows data scientists to solve multiple data problems in addition to their machine learning problems.

Spark can handle graph computations (via GraphX), streaming (real-time calculations), and real-time interactive query processing with Spark SQL and DataFrames. Data professionals can focus on solving the data problems instead of learning and maintaining a different tool for each scenario.

#5 Scikit-Learn
Scikit-learn is said to be one of the greatest feats of the Python community. This machine learning framework efficiently handles data mining and supports multiple practical tasks. It is built on foundations like SciPy, NumPy and matplotlib. The framework is known for its supervised and unsupervised learning algorithms as well as cross-validation. Scikit-learn is largely written in Python, with some core algorithms in Cython for performance.

The framework can work on multiple tasks without compromising on speed. Notable users of this framework include Spotify, Evernote, AWeber and Inria.

With the help of machine learning frameworks, building ML-powered iOS and Android apps has become quite an easy process. With this emerging technology trend, a wide variety of data is available, computational processing has become cheaper and more powerful, and data storage has become affordable. So if you are an app developer, or have an idea for a machine learning app, you should definitely dive into this niche.

Still have any query or confusion regarding ML frameworks, machine learning app development guide, the difference between Artificial Intelligence and machine learning, ML algorithms from scratch, how this technology is helpful for your business? Just fill our contact us form. Our sales representatives will get back to you shortly and resolve your queries. The consultation is absolutely free of cost.

Author Bio: This blog is written with the help of Jigar Mistry, who has over 13 years of experience in the web and mobile app development industry. He has guided to develop over 200 mobile apps and has special expertise in different mobile app categories like Uber like apps, Health and Fitness apps, On-Demand apps and Machine Learning apps. So, we took his help to write this complete guide on machine learning technology and machine app development areas.

Introduction to Machine Learning with TensorFlow.js

Learn how to build and train neural networks using TensorFlow.js, the most popular machine learning framework for JavaScript.

This is a practical workshop where you'll learn "hands-on" by building several different applications from scratch using TensorFlow.js.

If you have ever been interested in Machine Learning, if you want to get a taste for what this exciting field has to offer, if you want to be able to talk to other Machine Learning/AI specialists in a language they understand, then this workshop is for you.

TensorFlow Vs PyTorch: Comparison of the Machine Learning Libraries

Libraries play an important role when developers decide to work in Machine Learning or Deep Learning research. In this article, we list 10 comparisons between two Machine Learning libraries, TensorFlow and PyTorch.

According to a survey based on a sample of 1,616 ML developers and data scientists, for every developer using PyTorch there are 3.4 developers using TensorFlow.

1 - Origin

PyTorch, which is based on Torch, was developed by Facebook, while TensorFlow, an open-source Machine Learning library developed by Google Brain, is based on the idea of data flow graphs for building models.

2 - Features

TensorFlow has some attractive features, such as TensorBoard, which is a great option for visualising a Machine Learning model, and TensorFlow Serving, a dedicated gRPC server used when deploying models in production. PyTorch has several distinguishing features too, such as dynamic computation graphs, native support for Python, and support for CUDA, which reduces running time and increases performance.

3 - Community

TensorFlow has been adopted by many researchers and practitioners in fields such as academia and business. It has a much bigger community than PyTorch, which means it is easier to find resources or solutions for TensorFlow. There is a vast number of tutorials, code samples and support channels for TensorFlow; PyTorch, being the newcomer, lacks these benefits.

4 - Visualisation

Visualisation plays a leading role when presenting any project in an organisation. TensorFlow has TensorBoard for visualising Machine Learning models, which helps while training the model and in spotting errors quickly. It gives a real-time representation of a model’s graphs, showing not only the graph structure but also accuracy curves in real time. PyTorch lacks this eye-catching feature.

5 - Defining Computational Graphs

In TensorFlow, defining a computational graph is a lengthy process, as you have to build and run the computations within sessions. You also have to use other constructs such as placeholders and variable scoping. PyTorch wins this point, as its dynamic computation graphs help in building graphs dynamically. Here, the graph is built at every point of execution and you can manipulate it at run-time.

6 - Debugging

Because PyTorch builds its computation dynamically, debugging is painless. You can use Python debugging tools such as pdb or ipdb: for instance, you can put “pdb.set_trace()” at any line of code, then step through further computations and pinpoint the cause of errors. For TensorFlow, you have to use the TensorFlow debugger tool, tfdbg, which lets you view the internal structure and states of running TensorFlow graphs during training and inference.

7 - Deployment

For now, deployment is much better supported in TensorFlow than in PyTorch. It has the advantage of TensorFlow Serving, a flexible, high-performance serving system for deploying Machine Learning models, designed for production environments. In PyTorch, however, you can use Flask, a micro-framework for Python, to deploy models.

8 - Documentation

Documentation for both frameworks is widely available, with examples and tutorials in abundance for both libraries. You could call it a tie.

9 - Serialisation

Serialisation is one of TensorFlow’s advantages for its users. You can save your entire graph as a protocol buffer and later load it in other supported languages; PyTorch lacks this feature.

10 - Device Management

By default, TensorFlow maps nearly all of the GPU memory of all GPUs visible to the process, which is a drawback. However, its well-chosen defaults mean it automatically assumes you want to run your code on the GPU, which results in reasonable device management. PyTorch, on the other hand, keeps track of the currently selected GPU, and all CUDA tensors you allocate are created on that device by default.