This paper is a good starting point for learners and practitioners in algorithmic trading, portfolio management, order execution, market making, and even reinforcement learning (RL) as they begin their deep dives into the interactions of those fields. Tables 3, 4, 5, and 6 summarize significant results published over the last two decades, including extensive information about RL algorithms, data sources, asset types, markets, and data frequencies.
RL with function approximation can take advantage of deep learning even when the environment is not differentiable. Financial markets are an ideal domain for testing whether RL agents can effectively automate such tasks, and the results can be extended to other areas. Table 3 lists many top-level choices of available RL algorithms and related publications. Although the tables (including Table 4, which summarizes RL in algorithmic trading) do not include performance information, Section 5.2 provides some; however, wording such as “great performance” would benefit from quantitative support. Section 5.3 (with Table 5) describes RL results in portfolio management, including the recent development of transformer-based RL (relation-aware transformers). Institutional traders will find Sections 5.4 (with Table 6) and 5.5 (with Table 7) interesting for their descriptions of order execution and market making with RL, including the limitations of these methods in practice. Section 6 offers ideas for future research and addresses the need for baselines so that performance can be compared.
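To make concrete why RL sidesteps differentiability, the following is a minimal sketch (not from the paper; the Bernoulli policy, step-function reward, and hyperparameters are illustrative) of a REINFORCE-style policy-gradient update. The gradient estimate uses only the log-probability of the sampled action, so the reward itself is never differentiated:

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(action):
    # Non-differentiable, black-box reward: +1 for action 1, -1 otherwise.
    return 1.0 if action == 1 else -1.0

theta = 0.0   # logit of P(action = 1) under a Bernoulli policy
lr = 0.1
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-theta))          # sigmoid(theta)
    action = int(rng.random() < p)            # sample from the policy
    grad_logp = action - p                    # d/d theta of log pi(action)
    theta += lr * reward(action) * grad_logp  # REINFORCE update

p_final = 1.0 / (1.0 + np.exp(-theta))
# The policy concentrates on the higher-reward action (p_final near 1).
```

The same score-function trick is what lets deep RL agents train through market simulators whose dynamics and rewards (fills, slippage, transaction costs) are not differentiable.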
Section 1 claims that RL is an emerging field in machine learning (ML). This may be true for deep-learning-based RL; however, Sutton and Barto’s seminal book on RL was published back in 1998 [1], and before that, RL grew under the umbrella of adaptive control. Section 2.2 has a minor typographical error in its definition of a short position: “buys” and “sells” are interchanged (a short seller first sells a borrowed asset and later buys it back).