Machine learning could solve optimal execution problem
Reinforcement learning can be used to optimally execute order flows
Humans and models alike struggle to cope with too much information. As the number of factors involved in a decision grows, modelling the decision process and its various outcomes becomes unwieldy and time-consuming.
In recent years, machine learning has stepped in to solve that problem.
One area within finance that has consistently attracted a large amount of research from the buy side is market impact, or the effect of large orders on market price.
If a trade cannot be executed in one go, because of a lack of liquidity at the prevailing market price, it is broken into a series of smaller trades. But this exposes the trader to the risk of the market moving against them while those trades are being executed.
Solutions range from a simple limit on the time taken to execute the trade to limits on the price at which it is executed. When either limit is breached, the firm stops trading – but these two methods do not always result in the optimal execution that maximises wealth and minimises costs for the trader. More sophisticated institutions use dynamic programming to update the execution algorithm as market conditions change, which means using computationally intensive numerical techniques to find the optimal trading strategy.
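For illustration, a minimal sketch of such rule-based slicing. The function names, slice size, deadline and worst-price parameters below are all hypothetical, not from the article:

```python
import time

def execute_with_limits(total_qty, slice_qty, worst_price, deadline_s,
                        send_order, get_price):
    """Slice a parent buy order into child orders, stopping as soon as
    either the time limit or the price limit is breached."""
    remaining = total_qty
    start = time.time()
    while remaining > 0:
        if time.time() - start > deadline_s:   # time limit breached: stop
            break
        price = get_price()
        if price > worst_price:                # price limit breached: stop
            break
        filled = send_order(min(slice_qty, remaining), price)
        remaining -= filled
    return total_qty - remaining               # quantity actually executed
```

Whichever limit is hit first, the rule simply halts – it makes no attempt to adapt the schedule to market conditions, which is the gap dynamic programming, and in Ritter's paper reinforcement learning, aims to fill.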
In all these cases, the limitations arise from the fact the various dynamics need to be modelled. But what if there was no need for a model?
In a recent technical article, Machine learning for trading, Gordon Ritter, a senior portfolio manager at GSA Capital Partners in New York, applies a machine learning technique called reinforcement learning to simulate market impact and find an optimal trading strategy that maximises the value of the trade adjusted for its risk.
Common machine learning techniques include cluster analysis, which is used to identify hard-to-see similarities and patterns in complex data. In supervised learning techniques, which include Bayesian regression and random forests, the agent can learn from example data and associated target responses.
The agent is learning about the optimal strategy and the cost without actually building a model. In the training process, the agent tries all sorts of things and gets to observe the reward and basically correct his algorithm
Gordon Ritter, GSA Capital Partners
Another technique is reinforcement learning, which trains the machine, through a large number of simulations, to choose the best course of action in a particular environment. By the time the machine is ready to trade in real life, it already knows the optimal course of action from its training.
In this paper, Ritter applies reinforcement learning to trading by giving the agent the task of maximising the expected utility of the trade – that is, the value of the trade less all associated costs, and adjusted for the risk of the trade. “It allows you to learn the optimal strategy in a way that is fully cognisant of any kind of cost, so your own impact on the price is a really big source of cost for quant traders. But other kinds of costs, such as bid-offer spreads, commissions, borrowing costs – all those would get factored into the reward,” says Ritter.
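One way to encode such a risk-adjusted reward is a mean–variance form in the spirit of the paper: each period's reward is the change in portfolio value, net of all costs, less a penalty proportional to its square. The parameter name and cost handling below are illustrative assumptions:

```python
def step_reward(dv, kappa=1e-4):
    """Risk-adjusted reward for one period.

    dv: change in portfolio value over the period, already net of
        spread, commissions, borrowing costs and price impact.
    kappa: risk-aversion parameter; larger values punish volatile
        wealth paths more heavily.
    """
    return dv - 0.5 * kappa * dv * dv

# A step that nets 50 after costs scores 50 - 0.5 * 1e-4 * 50**2 = 49.875
print(step_reward(50.0))
```

Summing these rewards over an episode approximates the risk-adjusted value the agent is asked to maximise, so costly or risky trading shows up directly in the signal it is trained on.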
What has always restricted traditional optimal execution algorithms is the number of factors that can be used in the models – the larger that number, the harder the problem becomes to solve. That constraint does not exist with reinforcement learning, as the machine learns by trial and error, visiting different states of the world and figuring out the optimal path of execution on its own. “The agent is learning about the optimal strategy and the cost without actually building a model. In the training process, the agent tries all sorts of things and gets to observe the reward and basically correct his algorithm,” says Ritter.
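A minimal tabular Q-learning sketch of that trial-and-error loop – one standard value-based method of this kind. The environment functions and hyperparameters here are illustrative assumptions, with a market-impact simulator assumed to sit behind env_reset and env_step:

```python
import random
from collections import defaultdict

def q_learn(env_reset, env_step, actions, episodes=100_000,
            alpha=0.1, gamma=0.999, eps=0.1):
    """Learn Q(state, action) by trial and error: act, observe the
    reward, and nudge the estimate toward reward + discounted value
    of the best next action. No model of market dynamics is built."""
    Q = defaultdict(float)
    for _ in range(episodes):
        state, done = env_reset(), False
        while not done:
            # epsilon-greedy: mostly exploit current estimates,
            # occasionally explore a random action
            if random.random() < eps:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env_step(state, action)
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
    return Q
```

In an execution setting, the state might hold the current price and the remaining inventory, and the actions would be candidate trade sizes; once trained, the agent simply reads the best action for its current state out of Q.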
The author says millions of scenarios can be run during the training process in less than a second. The only caveat is that the approach in the paper applies to single-asset trading, such as a single stock. If multiple assets are involved, the training process would be slower, but once training is complete, the technique can be used to trade in real time.
Reaping the benefits
Many have been quick to try to reap the benefits of machine learning in the modelling of market impact. Firms such as Portware and JP Morgan already use supervised machine learning approaches for this purpose, and the latter is also testing the use of reinforcement learning to optimise its trading schedule.
Machine learning has also found its way into many other applications, such as model validation, the pitching of trade ideas and credit underwriting.
One common criticism of machine learning, especially from regulators, is that the way it works is not transparent, so when things go wrong it is difficult to pinpoint the source of the problem. A related concern is whether the machine learning technique is itself a model and hence should be backtested – but it is far from clear how that would be done.
For that reason, as machine learning applications are developed for trading activities that could affect markets, simultaneous strides must be made in improving the way such approaches are tested. That way, firms can combine the diligence and speed with which machines trawl through large datasets with the ability of humans to adapt and find solutions when things go wrong.