Option Pricing Using Reinforcement Learning

This post demonstrates how to use reinforcement learning to price an American option. An option is a derivative contract that gives its owner the right, but not the obligation, to buy or sell an underlying asset at a predetermined strike price. Unlike its European-style counterpart, an American-style option may be exercised at any time up to and including expiry.

Pricing an American option is known to be an optimal control problem that can be formulated as a Markov Decision Process (MDP) in which the underlying price follows geometric Brownian motion ([1]). The Markovian state is a (price, time) tuple, and the control is a binary action that decides on each day whether or not to exercise the option.
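To make the MDP concrete, here is a minimal sketch of the state and action spaces in plain Python. It is not the environment built later in this post. It assumes a put option, risk-neutral GBM dynamics, equally spaced steps over one year, and automatic settlement at expiry; the class name `AmericanPutMDP` and all default parameter values are hypothetical.

```python
import math
import random


class AmericanPutMDP:
    """Toy MDP for an American put: state = (price, day), action = 0 (hold) or 1 (exercise)."""

    def __init__(self, s0=100.0, strike=100.0, r=0.05, sigma=0.2,
                 days=365, seed=None):
        # All defaults are illustrative assumptions, not values from the post.
        self.s0, self.strike, self.r, self.sigma = s0, strike, r, sigma
        self.days = days
        self.dt = 1.0 / days  # each step is 1/days of a one-year maturity
        self.rng = random.Random(seed)
        self.reset()

    def payoff(self):
        """Put payoff at the current price."""
        return max(self.strike - self.price, 0.0)

    def reset(self):
        """Start a new episode and return the initial (price, day) state."""
        self.price, self.day = self.s0, 0
        return (self.price, self.day)

    def step(self, action):
        """Apply an action and return (state, reward, done)."""
        if action == 1:
            # Exercise: collect the payoff and end the episode.
            return (self.price, self.day), self.payoff(), True
        # Hold: advance the price by one exact log-normal GBM step.
        z = self.rng.gauss(0.0, 1.0)
        drift = (self.r - 0.5 * self.sigma ** 2) * self.dt
        self.price *= math.exp(drift + self.sigma * math.sqrt(self.dt) * z)
        self.day += 1
        if self.day == self.days:
            # Assumption: the option is settled automatically at expiry.
            return (self.price, self.day), self.payoff(), True
        return (self.price, self.day), 0.0, False
```

An episode is rolled out by calling `reset()` once and then `step(action)` until `done` is true. Rewards in this sketch are undiscounted, so a learning agent would typically apply a per-step discount factor such as exp(-r·dt).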

The optimal stopping policy looks like the figure below, where the x-axis is time and the y-axis is the stock price. The red curve is commonly called the optimal exercise boundary. On each day, if the stock price falls in the exercise region, which lies above the boundary for a call or below the boundary for a put, it is optimal to exercise the option and receive the difference between the stock price and the strike price.

Optimal Exercise Boundary

One can picture the boundary as a discretized Q-table, illustrated by the dotted grid in the figure. Every day the agent, or trader, looks up the table and takes an action according to that day's price. The Q-table is monotonic: for a call, every cell above the boundary yields an exercise decision and every cell below yields a hold decision (the reverse holds for a put). Q-learning is therefore well suited to finding the optimal strategy defined by this boundary.
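The table lookup can be sketched as follows. This is an illustration only: the grid size, the price range, and the flat boundary at 85 used to fill the table are made-up values for a put, not a learned or computed boundary (a real boundary varies with time).

```python
import bisect

HOLD, EXERCISE = 0, 1

# Hypothetical grid: 5 days and price cells with edges at 80, 85, ..., 120.
N_DAYS = 5
PRICE_EDGES = [80 + 5 * i for i in range(9)]


def price_bucket(price):
    """Map a price to the index of its grid cell (0 = below the lowest edge)."""
    return bisect.bisect_right(PRICE_EDGES, price)


def build_toy_q_table(boundary=85.0):
    """Fill q[day][bucket] = [Q(hold), Q(exercise)] from a flat put boundary.

    The Q-values are placeholders that only encode the preferred action.
    """
    q = []
    for _ in range(N_DAYS):
        row = []
        for b in range(len(PRICE_EDGES) + 1):
            upper = PRICE_EDGES[b] if b < len(PRICE_EDGES) else float("inf")
            # For a put, cells lying entirely at or below the boundary prefer exercise.
            row.append([0.0, 1.0] if upper <= boundary else [1.0, 0.0])
        q.append(row)
    return q


def greedy_action(q, day, price):
    """Look up today's cell and pick the action with the larger Q-value."""
    q_hold, q_exercise = q[day][price_bucket(price)]
    return EXERCISE if q_exercise > q_hold else HOLD
```

With this toy table, `greedy_action(q, day, price)` returns `EXERCISE` for prices in cells below the boundary and `HOLD` otherwise, which is the monotonic structure described above. In Q-learning the cell values would be learned from simulated episodes rather than filled in by hand.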

The remainder of this post contains three sections. In the first, a baseline price is computed using classical models. In the second, an OpenAI Gym environment is constructed, similar to building an Atari game. In the third, an agent is trained with DQN (Deep Q-Network) to play American options, similar to training a computer to play Atari games. The full Python notebook is available on GitHub.

Section One — Baseline

There are many ways to price an American option, ranging from binomial trees to Longstaff-Schwartz Monte Carlo methods. Here I use the QuantLib package to price a one-year American put option.
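For readers without QuantLib at hand, the binomial tree approach mentioned above can be sketched in plain Python. This is not the QuantLib calculation used in this post; it is a standard Cox-Ross-Rubinstein tree with no dividends, and the function name `american_put_crr` and its default step count are assumptions chosen for illustration.

```python
import math


def american_put_crr(spot, strike, rate, vol, maturity, steps=500):
    """Price an American put on a Cox-Ross-Rubinstein binomial tree (no dividends)."""
    dt = maturity / steps
    up = math.exp(vol * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-rate * dt)
    p_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    # Option values at expiry; node j has had j up-moves and (steps - j) down-moves.
    values = [max(strike - spot * up ** j * down ** (steps - j), 0.0)
              for j in range(steps + 1)]

    # Roll back through the tree, comparing continuation with immediate exercise.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            continuation = disc * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
            exercise = strike - spot * up ** j * down ** (i - j)
            values[j] = max(continuation, exercise)
    return values[0]
```

For example, `american_put_crr(36.0, 40.0, 0.06, 0.2, 1.0)` prices a one-year put with hypothetical inputs of spot 36, strike 40, a 6% risk-free rate, and 20% volatility. Increasing `steps` refines the estimate at the cost of quadratic run time.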
