In this, the fourth part of our series on Multi-Armed Bandits, we’re going to take a look at the Upper Confidence Bound (UCB) algorithm that can be used to solve the bandit problem.
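The core idea of UCB is to pick the arm whose empirical mean plus an exploration bonus is largest. As a rough sketch (the function and simulation below are illustrative, not code from the article), the classic UCB1 rule looks like this:

```python
import math
import random

def ucb1(counts, values, t, c=2.0):
    """Pick the arm with the highest UCB1 score.

    counts[i]: times arm i was pulled; values[i]: its running mean reward.
    Hypothetical helper names, not taken from the article.
    """
    # Pull every arm once before applying the formula
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [values[i] + math.sqrt(c * math.log(t) / counts[i])
              for i in range(len(counts))]
    return scores.index(max(scores))

# Tiny simulation: two Bernoulli arms with success rates 0.3 and 0.7
probs = [0.3, 0.7]
counts, values = [0, 0], [0.0, 0.0]
random.seed(0)
for t in range(1, 1001):
    arm = ucb1(counts, values, t)
    reward = 1.0 if random.random() < probs[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
print(counts)  # the better arm should dominate the pull counts
```

The log term grows slowly, so arms that look worse still get revisited occasionally; that is what keeps the algorithm from locking onto an unlucky early estimate.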

Creating an API server with Flask, performing data analysis in Jupyter notebooks, or building a neural network with TensorFlow: all of these can be written in a few lines of code. Any performance-critical parts can be optimized thanks to CPython's C API. Python is a very effective weapon in anyone's arsenal.

Reinforcement Learning, and 9 examples of what you can do with it. Reinforcement Learning is a subfield of machine learning that enables an agent to learn from the consequences of its actions in a specific environment.

In this article, we will learn about taxonomies of Reinforcement Learning algorithms: not just one taxonomy, but several, each from a different point of view.

I will tell you about one of them: how I taught a Reinforcement Learning (RL) agent to play the famous puzzle game “2048”. This article deliberately contains no code, mathematics, state-of-the-art approaches, or the latest discoveries in the field, so readers who are already well acquainted with RL will not discover anything new.

Supervised Learning vs. Unsupervised Learning. This article introduces the tools and techniques developed to make sense of unstructured data and discover hidden patterns.

Getting to know the concept of the Markov Decision Process in Reinforcement Learning. We will discuss how to formulate RL problems. To do that, we need to understand the Markov Decision Process, better known as MDP.

An implementation of the Monte Carlo Tree Search (MCTS) algorithm that can handle decision making in the real-time game “HEX” within a limited amount of time.

A Gentle Overview of RL solutions, and how to categorize them. Important takeaways from the Bellman equation, in Plain English. This is the second article in my series on Reinforcement Learning (RL). Now that we understand what an RL Problem is, let’s look at the approaches used to solve it.
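For readers who want to see the equation behind that takeaway, the standard form of the Bellman equation for the state-value function under a policy (textbook notation, assumed here rather than quoted from the article) is:

```latex
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s',\, r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_\pi(s')\bigr]
```

In plain English: the value of a state is the reward you expect to collect right away, plus the discounted value of wherever you expect to land next, averaged over the policy's action choices and the environment's transitions.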

An intuitive explanation of what reinforcement learning is all about. First of all, I want to thank Jakarta Machine Learning and AWS for giving me the opportunity to join the AWS DeepRacer boot camp.

Infer your Reward by Observing an Expert. Implement your first Inverse Reinforcement Learning algorithm.

I used the knowledge and wisdom I gained from my work in Artificial Intelligence to understand and teach my two-year-old son more effectively and regain my sanity. I will also give a brief overview of relevant AI projects to help illustrate my analogy.

We’ve learned about some important terms and concepts in Reinforcement Learning (RL). We’ve also learned how RL works at a high level. Before we dive deeper into the theory behind RL, I want to explain more about RL through its SUPER cool application, AWS DeepRacer.

Reinforcement Learning Made Simple: Intro to Basic Concepts and Terminology. I’ll introduce many of the fundamental concepts and terminology of RL, so that we can build solutions using them in the following articles.

We’ll cover all the nuts and bolts of the Bandit problem, defining the terminology and basic equations that will be used in subsequent parts. Most of this is also directly applicable to reinforcement learning in general.
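One of the basic equations that typically anchors this terminology (standard in textbook treatments of bandits; assumed here, not quoted from the article) is the incremental update of an action-value estimate, which lets you maintain a running mean of rewards without storing the full history:

```latex
Q_{n+1} = Q_n + \frac{1}{n}\bigl(R_n - Q_n\bigr)
```

Here $Q_n$ is the estimate after $n-1$ pulls of an arm and $R_n$ is the $n$-th observed reward; replacing $\frac{1}{n}$ with a constant step size generalizes the same update to nonstationary problems, which is exactly the form that carries over to reinforcement learning.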

In this article, we will be implementing the Deep Deterministic Policy Gradient and Twin Delayed Deep Deterministic Policy Gradient methods with TensorFlow 2.x. We will not go deep into the theory and will cover only the essentials.
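One small building block shared by DDPG and TD3 is the soft (Polyak) update of the target networks. As a minimal sketch with NumPy arrays standing in for layer weights (the function name and shapes are illustrative, not from the article):

```python
import numpy as np

def soft_update(target, source, tau=0.005):
    """Polyak-average source weights into target:
    theta_target <- tau * theta_source + (1 - tau) * theta_target.

    `target` and `source` are lists of NumPy arrays, one per layer.
    Returns the updated target weights.
    """
    return [tau * s + (1.0 - tau) * t for t, s in zip(target, source)]

# Example: one fake 2x2 weight matrix per network
target = [np.zeros((2, 2))]
source = [np.ones((2, 2))]
target = soft_update(target, source, tau=0.1)
print(target[0][0, 0])  # 0.1
```

The small `tau` makes the target networks trail the learned networks slowly, which is what stabilizes the bootstrapped critic targets in both algorithms.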

Reinforcement Learning vs. Supervised Learning in Financial Markets. My opinion on why Reinforcement Learning is superior to Supervised Learning when it comes to financial markets.

Today, we’re going to start exploring new environments using a very powerful tool: Markov Decision Processes (MDPs). We will develop a strong foundation for describing environments, as well as establish some terminology that will allow us to go further into RL.
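To make the idea of "describing an environment" concrete, a small tabular MDP can be written down directly as a transition table. The encoding below (state names, actions, and numbers are all made up for illustration) follows a common convention: for each state and action, store a list of (probability, next state, reward) triples.

```python
# P[s][a] = list of (probability, next_state, reward) triples.
# A toy two-state MDP; the states, actions, and numbers are illustrative.
P = {
    "s0": {"stay": [(1.0, "s0", 0.0)],
           "go":   [(0.8, "s1", 1.0), (0.2, "s0", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 0.0)],
           "go":   [(1.0, "s0", 0.0)]},
}

def expected_reward(state, action):
    """Expected immediate reward for taking `action` in `state`."""
    return sum(p * r for p, _, r in P[state][action])

print(expected_reward("s0", "go"))  # 0.8
```

Everything an MDP needs is in that table: once the transition probabilities and rewards are explicit, algorithms like value iteration are just loops over it.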

Create a Python-like Environment and Agent with Kotlin. This article assumes that the readers are familiar with terms like state, action, episode, and reward in the context of reinforcement learning. Basic knowledge of Q-learning would be helpful.