An A.I. Capital Management Research Article Series.
*Abstract: *Machine learning and artificial intelligence are becoming ubiquitous in quantitative trading. Utilizing deep learning models in a fund or trading firm’s day-to-day operations is no longer just a concept. Compared to the better-known and longer-established supervised and unsupervised learning algorithms, Reinforcement Learning (RL) may seem like the new kid on the block, but it has an astonishing track record, solving problem after problem in the game space (AlphaGo, OpenAI Five, etc.) and gradually making its way into the trading world, with a growing number of A.I. experts believing it to be the future of AGI (Artificial General Intelligence). This article is one of a series on how to develop and deploy Reinforcement Learning trading algorithms, as well as the advantages and challenges of Deep RL in trading environments.
*Author: *Marshall Chang is the founder and CIO of A.I. Capital Management, a quantitative trading firm that is built on Deep Reinforcement Learning’s end-to-end application to momentum and market neutral trading strategies. The company primarily trades the Foreign Exchange markets in mid-to-high frequencies.
How is Reinforcement Learning different from un/supervised learning?
This opening article discusses how reinforcement learning works in comparison with un/supervised learning. The goal is to explain RL conceptually, using layman’s terms and examples from trading. The target audience is practitioners and quant researchers with a good knowledge of machine learning, as well as traders without a computer science background who nonetheless understand the market, risk/reward, and the business of trading well. For practitioners who want to learn RL systematically, I recommend David Silver’s UCL course on YouTube, as well as the Sutton & Barto book: Reinforcement Learning: An Introduction.
Fundamentally, the task of machine learning is to map a pertinent relationship between two datasets using a function/model, which can be as simple as a linear regression with one variable or as complex as a deep neural network with millions of parameters. In the world of trading, we naturally want to find any generalizable relationship between an X dataset and a Y target, which is future price movement, no matter how near or far into the future.
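As a minimal sketch of this mapping idea, here is the simplest model mentioned above, a linear regression, fit to map lagged returns (X) to the next period’s return (Y). The data is synthetic and the feature choice is purely illustrative, not a real strategy:

```python
import numpy as np

# Synthetic price series (a random walk) standing in for market data.
rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(0, 1, 500)) + 100.0
returns = np.diff(prices) / prices[:-1]

# X: the previous `lags` returns; Y: the next return (future price movement).
lags = 3
X = np.column_stack([returns[i:len(returns) - lags + i] for i in range(lags)])
Y = returns[lags:]

# Fit Y ≈ X @ w + b by least squares: the static relationship the text describes.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
pred = A @ coef
```

Whether the model is this one-line regression or a deep network, the structure of the problem is the same: a fixed X, a fixed Y, and a function fit between them.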
For supervised and unsupervised learning approaches, the two datasets are prepared before we train the model; in other words, they are static. Hence, no matter how complicated a relationship the model finds, it is a static relationship in that it represents a preset dataset. Although we have significant knowledge of and experience with training and validating un/supervised deep models, such static relationships are rarely the case in the financial world. Not to mention that training models like neural networks is a global optimization process, meaning data from 10 years ago and from yesterday carry equal importance for the “A.I.” model across the time series, even though what really matters is next month’s performance.
From the get-go, RL is different from un/supervised learning because its model is trained on a dynamic dataset to find a dynamic policy, instead of on a static dataset to find a static relationship. To understand how this works, we need to understand how RL is designed as an agent-based problem in an environment. The model is represented by an agent who, by design, observes the environment’s states, interacts with the environment through actions, and receives feedback in the form of rewards and state transitions (where we end up if we take this action now).
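The agent–environment loop described above can be sketched in a few lines. The environment below is a hypothetical toy, not a real trading simulator: the state is just the current price, the actions are long/flat/short, and the reward is the one-step P&L. The always-long policy is a placeholder where a learned policy would go:

```python
# Toy sketch of the agent-environment loop: observe state, act, receive reward
# and next state. All names and the reward definition are illustrative assumptions.
class ToyMarketEnv:
    def __init__(self, prices):
        self.prices = prices
        self.t = 0

    def observe(self):
        return self.prices[self.t]            # state: current price

    def step(self, action):                   # action: +1 long, 0 flat, -1 short
        pnl = action * (self.prices[self.t + 1] - self.prices[self.t])
        self.t += 1
        done = self.t >= len(self.prices) - 1
        return pnl, self.observe(), done      # reward, next state, episode end

env = ToyMarketEnv([100.0, 101.0, 99.0, 102.0, 103.0])
total_reward = 0.0
done = False
while not done:
    action = 1                                # placeholder policy: always long
    reward, next_state, done = env.step(action)
    total_reward += reward
```

Note that, unlike the static supervised setup, the data the agent sees depends on where its own actions take it in the environment.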
[Figure: the agent–environment interaction, from AICM presentation slides]
The RL model’s training data X are the experiences encountered by the agent, in the form of [observation/state, action], while the target data Y are the resulting reward or punishment of such an action under the circumstances, in the form of [reward, next observation/state]. At a higher level, the agent is trained on its experiences to learn the set of actions that interacts with the environment so as to collect the most reward.
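To make the [state, action] → [reward, next state] form concrete, here is a sketch of a few such experience tuples being consumed by a tabular Q-learning update, one standard way RL turns experiences into a policy. The state and action labels are hypothetical stand-ins for real market features:

```python
from collections import defaultdict

# Experiences in the form described above: (state, action) -> (reward, next state).
# Labels and rewards are illustrative assumptions, not real trading data.
experiences = [
    # (state,       action,  reward, next_state)
    ("uptrend",     "long",   1.0,   "uptrend"),
    ("uptrend",     "short", -1.0,   "downtrend"),
    ("downtrend",   "short",  1.0,   "downtrend"),
]

alpha, gamma = 0.5, 0.9            # learning rate, discount factor
actions = ["long", "short", "flat"]
Q = defaultdict(float)             # Q[(state, action)] -> estimated future reward

# Standard Q-learning update: nudge Q(s, a) toward reward + discounted best next value.
for s, a, r, s_next in experiences:
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```

After training on many such tuples, the greedy policy "pick the action with the highest Q in the current state" is the learned set of actions the paragraph refers to.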