Buy-side quants of the year: Matthew Dixon and Igor Halperin

Risk Awards 2022: New machine learning tool tackles an age-old, old-age problem

Left: Matthew Dixon, Right: Igor Halperin
Nancy Wong/Alex Towle

For many years, financial institutions have been using machine learning and data to solve complex problems in derivatives hedging, investment and risk management. Yet of all the dilemmas they face, one of the most complex is also one of the most familiar: how should you invest for retirement, and how quickly should you spend your savings once you get there?

Matthew Dixon, assistant professor at the Illinois Institute of Technology in Chicago, says the quant finance industry is “very much at the infancy” when it comes to using machine learning to address challenges such as life cycle planning, tax optimisation, estate planning or perpetual annuities. However, in the past year he and Igor Halperin, a vice-president at Fidelity Investments’ AI Asset Management Center of Excellence, have shown how these complex mathematical tools can be used in retirement planning.

In a research paper published in Risk.net in July 2021, they set out how reinforcement learning and inverse reinforcement learning could be applied to goal-based wealth management. For their contribution to these largely unsolved challenges, Dixon and Halperin are Risk.net’s 2022 buy-side quants of the year.

Dixon and Halperin are among many in the industry who think the solutions to wealth management problems such as retirement planning are to be found in reinforcement learning. This branch of machine learning hit the headlines in 2016, when Google’s DeepMind used it to defeat the world’s top Go player in what was widely seen as a breakthrough for artificial intelligence. In reinforcement learning, an agent learns to maximise its cumulative reward over time, which makes the approach well suited to problems that require a sequence of decisions to reach a given objective. It is relatively new to finance, though it has already shown potential in the field.
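The mechanics are easiest to see in a toy example. The Python sketch below is purely illustrative – the environment, the actions and the numbers are invented, and it is not drawn from Dixon and Halperin’s paper. It shows the basic reinforcement-learning loop: an agent repeatedly acts, observes a noisy reward, and updates its value estimates, here with plain tabular Q-learning, the algorithm that G-learning later extends.

```python
# Illustrative sketch of the reinforcement-learning loop, not the authors' method.
import random
from collections import defaultdict

class ToyRetirementEnv:
    """Toy sequential-decision problem: each period choose to hold cash (0) or
    invest (1); investing pays more on average but is noisier."""
    def __init__(self, horizon=12):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        reward = random.gauss(0.05, 0.10) if action == 1 else random.gauss(0.01, 0.01)
        self.t += 1
        return self.t, reward, self.t >= self.horizon

def run_episode(env, q, actions=(0, 1), epsilon=0.1, alpha=0.1, gamma=0.99):
    """One episode of tabular Q-learning: act, observe the reward, update estimates."""
    state, done = env.reset(), False
    while not done:
        if random.random() < epsilon:                     # explore occasionally
            action = random.choice(actions)
        else:                                             # otherwise act greedily
            action = max(actions, key=lambda a: q[(state, a)])
        next_state, reward, done = env.step(action)
        best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state

q = defaultdict(float)
env = ToyRetirementEnv()
for _ in range(2_000):
    run_episode(env, q)
print({s: max((0, 1), key=lambda a: q[(s, a)]) for s in range(env.horizon)})
```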

Even with the most sophisticated mathematical tools, it is hard to work out an optimal approach to financial planning for retirement. The problem plays out over long, and sometimes uncertain, time horizons and there are multiple factors to consider: tax rules vary, individuals intend to retire at different ages, and so on.

In essence, we use ideas from statistical mechanics, and try to penalise the loss function with an information cost in order to suppress the effect of noise

Matthew Dixon, Illinois Institute of Technology

Gordon Ritter, founder and chief investment officer at Ritter Alpha and the recipient of the buy-side quant of the year award in 2019, says: “When you include variables like different tax regimes within different states and countries or time constraints, it’s really hard to handle the problem in a fully analytic way, so that is where reinforcement learning can shine.”

Dixon and Halperin set out to tackle the issue of retirement planning by applying “G-learning” – a probabilistic extension of a common reinforcement-learning algorithm known as Q-learning. G-learning is designed to work with “noisy” data, such as financial data. It is also stable because it is guaranteed to converge and produce a unique result. “In essence, we use ideas from statistical mechanics, and try to penalise the loss function with an information cost in order to suppress the effect of noise,” says Dixon.
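The information-cost idea can be illustrated with a few lines of code. In a KL-regularised scheme of this kind, the hard max over actions used in Q-learning is replaced by a “soft” max: the policy is penalised for deviating from a reference (prior) policy, which suppresses the effect of noisy value estimates. The sketch below runs that free-energy calculation on made-up numbers; it is a generic illustration of the entropy-regularisation principle, not the authors’ implementation.

```python
# Generic illustration of a KL-regularised ("soft") value update, not the paper's code.
import numpy as np

def free_energy(g_values, prior, beta):
    """Soft value of a state: (1 / beta) * log sum_a prior(a) * exp(beta * G(s, a)).
    As beta grows, this approaches the hard max used in Q-learning."""
    return np.log(prior @ np.exp(beta * g_values)) / beta

def tilted_policy(g_values, prior, beta):
    """Optimal policy under the information cost: the prior re-weighted by exp(beta * G)."""
    weights = prior * np.exp(beta * g_values)
    return weights / weights.sum()

# Illustrative numbers only: three candidate actions with noisy value estimates.
g_values = np.array([0.02, 0.05, 0.03])
prior = np.full(3, 1 / 3)                     # uniform reference policy

for beta in (1.0, 10.0, 100.0):               # small beta hugs the prior, large beta acts greedily
    print(beta, round(free_energy(g_values, prior, beta), 4), tilted_policy(g_values, prior, beta))
```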

Based on his and Halperin’s work, Fidelity Investments is developing two products: AI Alter Ego, an application for asset management; and AI Planner, an application for wealth management and retirement planning. The products are still at the research stage, but Fidelity has filed patents for both.

Dixon and Halperin began working together after being introduced by Kay Giesecke, a professor at Stanford University in California. While collaborating with Paul Bilokon on a book, Machine Learning in Finance, the pair decided to expand the reinforcement and inverse reinforcement learning research they had written for the book into a treatment of the wealth management problem.

Inverse reinforcement learning, as its name suggests, runs the computation in the opposite direction. Starting from the output of a strategy – that is, the allocation of instruments through time – it infers the underlying strategy, recovering the parameters needed to replicate it. The approach already has numerous applications in robotics and gaming.
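A stylised version of that inverse step is sketched below: given a history of observed choices, it recovers by maximum likelihood the preference weights under which a soft-max policy would most plausibly have produced them. The data, features and optimiser are all assumptions made for illustration – this is the general idea of fitting a policy’s parameters from its output, not the authors’ Girl algorithm.

```python
# Stylised inverse step: fit the preference weights behind observed choices.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic "observed" history: at each of 500 decision points a manager picks one
# of 3 allocations, each described by 2 features (all numbers are made up).
features = rng.normal(size=(500, 3, 2))
true_theta = np.array([1.0, -0.5])                          # hidden preference weights
logits = features @ true_theta
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
chosen = np.array([rng.choice(3, p=p) for p in probs])      # the observed choices

def neg_log_likelihood(theta):
    """How unlikely the observed choices are under a soft-max policy with weights theta."""
    logits = features @ theta
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(chosen)), chosen].mean()

theta_hat = minimize(neg_log_likelihood, x0=np.zeros(2)).x
print(theta_hat)   # should land close to the hidden weights [1.0, -0.5]
```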

(Not so) random walk

This year’s award winners had rather different career paths. Dixon started as a software engineer before moving to Lehman Brothers’ structured credit team and then taking his PhD at Imperial College London. He has since pursued a career in academia while freelancing as a data scientist in Silicon Valley and consulting for private equity firms. In 2015, after joining the Illinois Institute of Technology, he co-wrote what is regarded as the first deep learning paper in finance: a project, financed by Intel, on the backtesting of trading strategies using deep learning signals.

Halperin’s background is in physics. He switched to finance in 1999 after some introductory books captured his attention, but the real spark was when he encountered econophysics. “I came across the papers by Jean-Philippe Bouchaud, Eugene Stanley and other econophysicists,” he recalls. “I was inspired by them and decided that that was what I wanted to do.”

He spent more than a decade at JP Morgan in New York, where he developed parametric models for credit, commodities and portfolio risk optimisation. It was when he became convinced those models did not provide the right answers to the quant finance problems he was working on that he decided to dedicate his efforts to data-driven solutions. He believes reinforcement learning can solve most problems in finance, from wealth management to optimal execution and even option pricing.

It is a probabilistic approach. It provides not just the point estimates for the optimal allocation, but also the uncertainty around them. So, it gives information on how much one can trust the recommendations

Igor Halperin, Fidelity Investments

The retirement planning problem that he and Dixon worked on concerns investment decisions over long periods, with target dates in mind and with constrained, but presumably growing, amounts of investible capital.

Compared with other popular techniques – such as deep Q-learning, in which the Q-learner’s value function is approximated with neural networks – G-learning is very fast. In their paper, Dixon and Halperin show an example on a portfolio of 100 instruments: calibrating the G-learning algorithm takes about 30 seconds on a standard laptop. Leaving aside memory requirements, the time needed to compute the strategy grows roughly linearly with the size of the portfolio, so handling a higher-dimensional universe such as the S&P 500 will not take orders of magnitude longer than handling a 100-instrument portfolio.

Q-learning, by contrast, would struggle to match the results achieved with G-learning. It deals in discrete actions, making it less well suited to the continuous nature of financial problems; it requires many parameters to calibrate, making the exercise computationally expensive; and its output is a deterministic strategy that ignores noise and estimation uncertainty.

The two quants’ main innovation is generative inverse reinforcement learning, or Girl. The G-learner they proposed produces a portfolio allocation and a function that describes the optimal level of consumption to solve the wealth management problem.

“It is a probabilistic approach,” Halperin says. “It provides not just the point estimates for the optimal allocation, but also the uncertainty around them. So, it gives information on how much one can trust the recommendations.

“It learns from the collective intelligence of portfolio managers who pursue similar strategies and have similar benchmarks. One can then analyse this data together to hopefully improve on it.”
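In practice, a probabilistic recommendation of this kind might look like the short sketch below: a mean allocation per asset (the point estimate) plus a covariance that translates into confidence bands around it. The figures are invented for illustration and are not output from Fidelity’s tools.

```python
# Illustrative Gaussian policy output: point estimates plus uncertainty bands.
import numpy as np

rng = np.random.default_rng(1)

mean_allocation = np.array([0.55, 0.30, 0.15])    # point estimate per asset (made up)
covariance = np.diag([0.04, 0.02, 0.01]) ** 2     # how uncertain the policy is about each

samples = rng.multivariate_normal(mean_allocation, covariance, size=10_000)
lower, upper = np.percentile(samples, [5, 95], axis=0)

for asset, (m, lo, hi) in enumerate(zip(mean_allocation, lower, upper)):
    print(f"asset {asset}: recommend {m:.1%}, 90% band [{lo:.1%}, {hi:.1%}]")
```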

If combined, the two techniques could form the backbone of applications in robo-advisory. Inverse reinforcement learning could learn from the investment strategies of star professional investors, and reinforcement learning would replicate them for clients. As Halperin puts it, it’s like a student who observes and learns strategies from their teacher. In subsequent research, he has worked on enabling the algorithm to improve on what it has learned from its own “teacher”.

The approach is not designed to be fully autonomous, though. “It combines human and artificial intelligence,” Halperin says. “Portfolio managers do the stock picking. And the task of our tool, once it’s shown the investible universe, is to recommend the optimal size of the positions.”
