Nina Diana


Build your own AlphaZero AI using Python and Keras

Teach a machine to learn Connect4 strategy through self-play and deep learning

In this article I’ll attempt to cover three things:

  1. Two reasons why AlphaZero is a massive step forward for Artificial Intelligence
  2. How you can build a replica of the AlphaZero methodology to play the game Connect4
  3. How you can adapt the code to plug in other games

First, a quick note about a new platform, The Network, a place where data scientists can find paid contract projects with businesses!

AlphaGo → AlphaGo Zero → AlphaZero

In March 2016, DeepMind’s AlphaGo beat 18-time world champion Go player Lee Sedol 4–1 in a series watched by over 200 million people. A machine had learnt a super-human strategy for playing Go, a feat previously thought impossible or, at the very least, a decade away from being accomplished.

This in itself was a remarkable achievement. However, on 18th October 2017, DeepMind took a giant leap further.

The paper Mastering the Game of Go without Human Knowledge unveiled a new variant of the algorithm, AlphaGo Zero, that had defeated AlphaGo 100–0. Incredibly, it had done so by learning solely through self-play, starting ‘tabula rasa’ (blank slate) and gradually finding strategies that would beat previous incarnations of itself. No longer was a database of human expert games required to build a super-human AI.

A mere 48 days later, on 5th December 2017, DeepMind released another paper, ‘Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm’, showing how AlphaGo Zero could be adapted to beat the world-champion programs Stockfish and Elmo at chess and shogi. The entire learning process, from being shown the games for the first time to becoming the best computer program in the world, had taken under 24 hours.

With this, AlphaZero was born — the general algorithm for getting good at something, quickly, without any prior knowledge of human expert strategy.

There are two amazing things about this achievement:

It cannot be overstated how important this is. This means that the underlying methodology of AlphaGo Zero can be applied to ANY game with perfect information (the game state is fully known to both players at all times) because no prior expertise is required beyond the rules of the game.

This is how it was possible for DeepMind to publish the chess and shogi papers only 48 days after the original AlphaGo Zero paper. Quite literally, all that was needed was to change the input file that describes the mechanics of the game and to tweak the hyper-parameters relating to the neural network and Monte Carlo tree search.

If AlphaZero used super-complex algorithms that only a handful of people in the world understood, it would still be an incredible achievement. What makes it extraordinary is that a lot of the ideas in the paper are actually far less complex than those in previous versions. At its heart lies the following beautifully simple mantra for learning:
Doesn’t that sound a lot like how you learn to play games? When you play a bad move, it’s either because you misjudged the future value of resulting positions, or you misjudged the likelihood that your opponent would play a certain move, so didn’t think to explore that possibility. These are exactly the two aspects of gameplay that AlphaZero is trained to learn.

How to build your own AlphaZero

Firstly, check out the AlphaGo Zero cheat sheet for a high level understanding of how AlphaGo Zero works. It’s worth having that to refer to as we walk through each part of the code. There’s also a great article here that explains how AlphaZero works in more detail.

The code

Clone this Git repository, which contains the code I’ll be referencing.

To start the learning process, run the top two panels in the run.ipynb Jupyter notebook. Once it has built up enough game positions to fill its memory, the neural network will begin training. Through additional self-play and training, it will gradually get better at predicting the game value and next moves from any position, resulting in better decision making and smarter overall play.

We’ll now have a look at the code in more detail, and show some results that demonstrate the AI getting stronger over time.

N.B — This is my own understanding of how AlphaZero works based on the information available in the papers referenced above. If any of the below is incorrect, apologies and I’ll endeavour to correct it!


The game that our algorithm will learn to play is Connect4 (or Four In A Row). Not quite as complex as Go… but there are still 4,531,985,219,092 game positions in total.

The game rules are straightforward. Players take it in turns to enter a piece of their colour in the top of any available column. The first player to get four of their colour in a row, either vertically, horizontally or diagonally, wins. If the entire grid is filled without a four-in-a-row being created, the game is drawn.
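To make the winning condition concrete, here is a minimal sketch of a four-in-a-row check. The grid representation (a list of rows holding 0 for empty, 1 and -1 for the two players) and the function names are assumptions chosen for illustration, not the repository's implementation.

```python
ROWS, COLS = 6, 7  # standard Connect4 grid

def has_four_in_a_row(grid, player):
    """Return True if `player` owns four consecutive cells in any direction."""
    # Horizontal, vertical, and the two diagonal directions.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in directions:
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if not (0 <= end_r < ROWS and 0 <= end_c < COLS):
                    continue  # a line of four would leave the board
                if all(grid[r + i * dr][c + i * dc] == player for i in range(4)):
                    return True
    return False

def is_draw(grid):
    """The game is drawn when the grid is full and nobody has four in a row."""
    full = all(cell != 0 for row in grid for cell in row)
    return full and not has_four_in_a_row(grid, 1) and not has_four_in_a_row(grid, -1)
```

Scanning every cell in four directions is wasteful for a real engine, but it keeps the rule easy to read.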

Here’s a summary of the key files that make up the codebase:

The game file (game.py, the module that the Game class is imported from) contains the game rules for Connect4.

Each square is allocated a number from 0 to 41, as follows:

The file gives the logic behind moving from one game state to another, given a chosen action. For example, given the empty board and action 38, the takeAction method returns a new game state, with the starting player’s piece at the bottom of the centre column.
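As a rough illustration of that numbering, the sketch below assumes the squares are numbered row by row from the top-left corner, which is consistent with action 38 landing at the bottom of the centre column. The `take_action` helper and the flat list board are hypothetical simplifications, not the repository's takeAction method.

```python
ROWS, COLS = 6, 7  # 42 squares, numbered 0..41

def square_index(row, col):
    """Number squares row by row from the top-left (assumed layout)."""
    return row * COLS + col

def allowed_actions(board):
    """An empty square is playable if it is on the bottom row or rests on a filled square."""
    actions = []
    for idx in range(ROWS * COLS):
        below = idx + COLS
        if board[idx] == 0 and (below >= ROWS * COLS or board[below] != 0):
            actions.append(idx)
    return actions

def take_action(board, action, player):
    """Return a new board with the player's piece on the chosen square."""
    if action not in allowed_actions(board):
        raise ValueError(f"square {action} is not playable")
    new_board = list(board)  # copy, so the previous state is left untouched
    new_board[action] = player
    return new_board

empty = [0] * (ROWS * COLS)
after_first_move = take_action(empty, 38, 1)
for r in range(ROWS):
    print(after_first_move[r * COLS:(r + 1) * COLS])
```

Returning a fresh board rather than mutating the old one matters for tree search, where many hypothetical futures branch from the same position.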

You can replace the file with any game file that conforms to the same API and the algorithm will, in principle, learn strategy through self-play, based on the rules you have given it.


This contains the code that starts the learning process. It loads the game rules and then iterates through the main loop of the algorithm, which consists of three stages:

  1. Self-play
  2. Retraining the neural network
  3. Evaluating the neural network

There are two agents involved in this loop, the best_player and the current_player.

The best_player contains the best performing neural network and is used to generate the self play memories. The current_player then retrains its neural network on these memories and is then pitched against the best_player. If it wins, the neural network inside the best_player is switched for the neural network inside the current_player, and the loop starts again.
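The loop described above can be sketched roughly as follows. Everything here is a stand-in: `StubAgent`, `self_play`, `evaluate` and the promotion rule (more wins than losses) are assumptions made for illustration and do not reproduce the repository's code. A single `strength` number takes the place of a real neural network.

```python
import random

class StubAgent:
    """Stand-in for an agent; `strength` replaces a real neural network."""
    def __init__(self, name, strength):
        self.name = name
        self.strength = strength

    def retrain(self, memories):
        # Placeholder for fitting the network on self-play memories.
        self.strength += 0.01 * len(memories)

def self_play(agent, games):
    """Stage 1: the best player generates memories (here, dummy records)."""
    return [(agent.name, g) for g in range(games)]

def evaluate(challenger, champion, games, rng):
    """Stage 3: play a tournament; return (challenger wins, champion wins)."""
    p = challenger.strength / (challenger.strength + champion.strength)
    wins = sum(rng.random() < p for _ in range(games))
    return wins, games - wins

def training_loop(iterations, rng):
    best_player = StubAgent("best", 1.0)
    current_player = StubAgent("current", 1.0)
    for _ in range(iterations):
        memories = self_play(best_player, games=20)                      # stage 1
        current_player.retrain(memories)                                 # stage 2
        wins, losses = evaluate(current_player, best_player, 20, rng)    # stage 3
        if wins > losses:
            # Promote: the best player takes over the challenger's network.
            best_player.strength = current_player.strength
    return best_player, current_player
```

Calling `training_loop(5, random.Random(0))` runs five iterations with a seeded generator, so the run is repeatable.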

This contains the Agent class (a player in the game). Each player is initialised with its own neural network and Monte Carlo Search Tree.

The simulate method runs the Monte Carlo Tree Search process. Specifically, the agent moves to a leaf node of the tree, evaluates the node with its neural network and then backfills the value of the node up through the tree.

The act method repeats the simulation multiple times to understand which move from the current position is most favourable. It then returns the chosen action to the game, to enact the move.
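One common way to turn the simulations into a move, following the AlphaGo Zero paper, is to look at how often each move was visited. The sketch below assumes the search has produced a dictionary of visit counts; `choose_action` and its arguments are hypothetical names, and the repository's act method may differ in detail.

```python
import random

def choose_action(visit_counts, deterministic, rng=random):
    """Pick an action from MCTS visit counts (dict: action -> visits)."""
    total = sum(visit_counts.values())
    if total == 0:
        raise ValueError("no simulations have been run")
    # Normalised visit counts double as the move probabilities (pi).
    pi = {a: n / total for a, n in visit_counts.items()}
    if deterministic:
        # Competitive play: take the most-visited move.
        action = max(pi, key=pi.get)
    else:
        # Exploratory play: sample in proportion to the visit counts.
        actions = list(pi)
        action = rng.choices(actions, weights=[pi[a] for a in actions])[0]
    return action, pi
```

For example, `choose_action({38: 30, 37: 15, 39: 5}, deterministic=True)` selects action 38, the most-visited move.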

The replay method retrains the neural network, using memories from previous games.

This file contains the Residual_CNN class, which defines how to build an instance of the neural network.

It uses a condensed version of the neural network architecture in the AlphaGoZero paper — i.e. a convolutional layer, followed by many residual layers, then splitting into a value and policy head.

The depth and number of convolutional filters can be specified in the config file.

The Keras library is used to build the network, with a backend of Tensorflow.

To view individual convolutional filters and densely connected layers in the neural network, run the following inside the run.ipynb notebook:


This contains the Node, Edge and MCTS classes, that constitute a Monte Carlo Search Tree.

The MCTS class contains the moveToLeaf and backFill methods previously mentioned, and instances of the Edge class store the statistics about each potential move.
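To show how those pieces fit together, here is a minimal sketch of the two tree operations. It uses the Q + U selection rule from the AlphaGo Zero paper; the snake_case names, the `cpuct` constant and the simplified `Node` and `Edge` classes are assumptions for illustration rather than the repository's code. The `value` passed to `back_fill` is taken from the point of view of the player who made the last move on the path.

```python
import math

class Edge:
    """Statistics for one move: visit count N, total value W, mean value Q, prior P."""
    def __init__(self, prior, child):
        self.stats = {"N": 0, "W": 0.0, "Q": 0.0, "P": prior}
        self.child = child

class Node:
    def __init__(self, name):
        self.name = name
        self.edges = {}  # action -> Edge

    def is_leaf(self):
        return not self.edges

def move_to_leaf(root, cpuct=1.0):
    """Walk down the tree, at each node taking the edge that maximises Q + U."""
    node, path = root, []
    while not node.is_leaf():
        total_n = sum(e.stats["N"] for e in node.edges.values())

        def score(edge):
            # U favours moves with a high prior that have been visited rarely.
            u = cpuct * edge.stats["P"] * math.sqrt(total_n) / (1 + edge.stats["N"])
            return edge.stats["Q"] + u

        edge = max(node.edges.values(), key=score)
        path.append(edge)
        node = edge.child
    return node, path

def back_fill(path, value):
    """Propagate a leaf evaluation back up the path, flipping the sign each ply."""
    for edge in reversed(path):
        edge.stats["N"] += 1
        edge.stats["W"] += value
        edge.stats["Q"] = edge.stats["W"] / edge.stats["N"]
        value = -value  # a good position for one player is bad for the other
```

In a full search the leaf would be expanded with the network's policy output as priors before the next simulation; that step is omitted here.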

This is where you set the key parameters that influence the algorithm.

Adjusting these variables will affect the running time, neural network accuracy and overall success of the algorithm. The above parameters produce a high quality Connect4 player, but take a long time to do so. To speed the algorithm up, try the following parameters instead.

The funcs file contains the playMatches and playMatchesBetweenVersions functions that play matches between two agents.

To play against your creation, run the following code (it’s also in the run.ipynb notebook):

from game import Game
from funcs import playMatchesBetweenVersions
import loggers as lg

env = Game()
playMatchesBetweenVersions(
    env,
    1,   # the run version number where the computer player is located
    -1,  # the version number of the first player (-1 for human)
    12,  # the version number of the second player (-1 for human)
    10,  # how many games to play
    lg.logger_tourney,  # where to log the game to
    0,   # which player to go first - 0 for random
)

When you run the algorithm, all model and memory files are saved in the run folder, in the root directory.

To restart the algorithm from this checkpoint later, transfer the run folder to the run_archive folder, attaching a run number to the folder name. Then, enter the run number, model version number and memory version number into the file, corresponding to the location of the relevant files in the run_archive folder. Running the algorithm as usual will then start from this checkpoint.

An instance of the Memory class stores the memories of previous games, that the algorithm uses to retrain the neural network of the current_player.
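A memory of this kind can be sketched with a bounded queue. The class below is a hypothetical simplification: the names `ReplayMemory`, `add_position`, `finish_game` and `sample` are assumptions, and it stamps every position with the same final value, whereas a fuller version would flip the sign according to whose turn it was.

```python
from collections import deque
import random

class ReplayMemory:
    """Fixed-size store of (state, move probabilities, value) training examples."""
    def __init__(self, max_size):
        self.long_term = deque(maxlen=max_size)  # oldest entries drop off automatically
        self.current_game = []

    def add_position(self, state, move_probs):
        # The result is unknown until the game ends, so store None for now.
        self.current_game.append({"state": state, "pi": move_probs, "value": None})

    def finish_game(self, final_value):
        # Stamp every position from this game with the outcome, then archive it.
        for entry in self.current_game:
            entry["value"] = final_value
            self.long_term.append(entry)
        self.current_game = []

    def sample(self, batch_size, rng=random):
        """Draw a random mini-batch for retraining."""
        pool = list(self.long_term)
        return rng.sample(pool, min(batch_size, len(pool)))
```

Sampling at random from many games, rather than training on one game in order, keeps consecutive and highly correlated positions from dominating a mini-batch.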

This file contains a custom loss function, that masks predictions from illegal moves before passing to the cross entropy loss function.
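The idea of masking can be illustrated without any deep learning library. The sketch below is plain Python rather than the Keras/TensorFlow loss used in the codebase, and it rests on one loudly stated assumption: a move is treated as illegal exactly when its target probability is zero. The function name and the `mask_value` default are hypothetical.

```python
import math

def masked_cross_entropy(target_probs, logits, mask_value=-100.0):
    """Cross entropy between target move probabilities and softmax(logits),
    with the logits of illegal moves pushed to a large negative number first."""
    # Assumption: a move is illegal exactly when its target probability is 0.
    masked = [mask_value if t == 0 else z for t, z in zip(target_probs, logits)]
    # Numerically stable log-softmax.
    m = max(masked)
    log_norm = m + math.log(sum(math.exp(z - m) for z in masked))
    log_probs = [z - log_norm for z in masked]
    return -sum(t * lp for t, lp in zip(target_probs, log_probs))
```

With targets `[0.5, 0.5, 0]`, the loss is the same whatever logit the network assigns to the third move, so the network is not penalised for its opinion about moves that cannot be played.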

This file holds the locations of the run and run_archive folders.

Log files are saved to the log folder inside the run folder.

To turn on logging, set the values of the logger_disabled variables to False inside this file.

Viewing the log files will help you to understand how the algorithm works and see inside its ‘mind’. For example, here is a sample from the logger.mcts file.

Equally from the logger.tourney file, you can see the probabilities attached to each move, during the evaluation phase:


Training over a couple of days produces the following chart of loss against mini-batch iteration number:

The top line is the error in the policy head (the cross entropy of the MCTS move probabilities, against the output from the neural network). The bottom line is the error in the value head (the mean squared error between the actual game value and the neural network’s prediction of the value). The middle line is an average of the two.

Clearly, the neural network is getting better at predicting the value of each game state and the likely next moves. To show how this results in stronger and stronger play, I ran a league between 17 players, ranging from the 1st iteration of the neural network, up to the 49th. Each pairing played twice, with both players having a chance to play first.
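The league format is a double round-robin, which can be sketched as below. The player labels and the `league_fixtures` and `league_table` helpers are hypothetical; only the structure (17 players, every pairing played twice with each side going first once) comes from the description above.

```python
from itertools import permutations

def league_fixtures(players):
    """Every ordered pair (first_player, second_player): each pairing meets twice."""
    return list(permutations(players, 2))

def league_table(results):
    """results maps (first, second) -> winner, or None for a draw."""
    wins = {}
    for (first, second), winner in results.items():
        wins.setdefault(first, 0)
        wins.setdefault(second, 0)
        if winner is not None:
            wins[winner] += 1
    # Most wins first.
    return sorted(wins.items(), key=lambda item: item[1], reverse=True)

players = [f"v{i}" for i in range(1, 18)]  # 17 placeholder version labels
fixtures = league_fixtures(players)
print(len(fixtures))  # 17 * 16 ordered pairings
```

With 17 players this yields 17 × 16 = 272 games, and each player appears 16 times as first player and 16 times as second.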

Here are the final standings:

Clearly, the later versions of the neural network are superior to the earlier versions, winning most of their games. It also appears that the learning hasn’t yet saturated — with further training time, the players would continue to get stronger, learning more and more intricate strategies.

As an example, one clear strategy that the neural network has favoured over time is grabbing the centre column early. Observe the difference between the first version of the algorithm and say, the 30th version:

1st neural network version

30th neural network version

This is a good strategy as many lines require the centre column — claiming this early ensures your opponent cannot take advantage of this. This has been learnt by the neural network, without any human input.

Learning a different game

There is a file for a game called ‘Metasquares’ in the games folder. This involves placing X and O markers in a grid to try to form squares of different sizes. Larger squares score more points than smaller squares and the player with the most points when the grid is full wins.

If you switch the Connect4 file for the Metasquares file, the same algorithm will learn how to play Metasquares instead.


Hopefully you find this article useful — let me know in the comments below if you find any typos or have questions about anything in the codebase or article and I’ll get back to you as soon as possible.

If you would like to learn more about how our company, Applied Data Science develops innovative data science solutions for businesses, feel free to get in touch through our website or directly through LinkedIn.

Applied Data Science is a London based consultancy that implements end-to-end data science solutions for businesses, delivering measurable value. If you’re looking to do more with your data, let’s talk.


#machine-learning #tensorflow
