We should not be surprised that artificial intelligence (AI) generally outperforms stock analysts in earnings forecasts, or that rule-based strategies provide better financial advice on average than a personal banker. Even before the recent advances in generative AI, systematic investing had proven benefits. Such techniques may not find the rare top stocks or market turning points that bring above-average returns, but their value is well established.
AI developments, however, show that we can go beyond rule-based recommendations. Macroeconomics, accounting, and statistics are the three pillars of investment, and large language models (LLMs) achieve top scores in advanced exams on these subjects. We also know that LLMs can summarize far more context and collective wisdom than a human can, which can be very helpful for macroeconomic strategies. So, if AI can help with financial decisions, why do analysts and portfolio managers find it so difficult to accept this change?
Some insights can be found in the work of data scientist César Hidalgo on how people judge machines. When we use a program, we focus on the tool's performance. Every prediction error the program makes erodes our financial expert's confidence in it. In most cases, it does not matter whether the algorithm is, on average, better than the human: our financial advisor will still fall back on their own intuition and experience.
Hidalgo's research shows that we judge human advice differently. We look beyond performance and consider the intentions of the person advising us. When we engage with a private banker or entrust our money to a fund manager, we assume an alignment with our goals, especially if the contract includes performance-based fees. When we incorporate these intentions into our mental equation, we are more tolerant of poor returns.
Human advice can therefore fail more often and still be considered valuable, especially if there is a story that explains the outcome. In Hidalgo's words, we expect rationality from machines and humanity from humans.
We also resist absorbing information that contradicts our experience. In experiments with radiologists using AI, it was unclear how they incorporated the algorithm's views into their predictions. The work took longer, and the effectiveness of the combined diagnostics was questionable.
If that is true for radiologists, it is likely to be even harder for anyone working in the financial markets. Macro strategy may be the most challenging area in which to integrate AI. First, the market, like the weather, is not stationary: it will never react in exactly the same way to, for instance, inflation or employment data, let alone to a possible return of Donald Trump to the White House. Second, every strategist has strong prior beliefs, or an "identity" as a perennial optimist or pessimist, that influence their judgment. It is very hard to escape the narratives that clients expect from them.
Finally, we long for control. There is a radical difference between a model we build in a spreadsheet from available data and something like ChatGPT. We decide on the form and components of the former, based on our experience and intuition, but not of the latter. In most cases, we do not even know how the LLM arrived at a specific answer. It is therefore understandable that our financial advisor feels uncomfortable using a prediction that is not their own.
There are ways to ease this discomfort. We should allow people to adjust certain parameters of the model, so that professionals can accept the AI's recommendations as if they were their own. Ideally, the model improves when the expert adds context it cannot access, such as the client's private circumstances or other hard-to-quantify factors and constraints. Alternatively, we could accept some loss of performance if the human touch leads more people to act on the insights. This could be a reasonable compromise in areas such as wealth management.
We must also strive to make AI more understandable. This is a legitimate expectation as the demands for scrutiny and compliance grow. Some of the leading models now use "chain of thought" reasoning, in which the model lays out intermediate steps before giving its answer. This way, we not only see performance improvements but also get a line of reasoning that experts can inspect and challenge. No one wants to look like a dumb robot just repeating the advice of a black box. Trust and judgment are crucial features in a customer relationship. In the end, we expect people to remain human.