Using adaptive machine learning systems to detect, predict, and prevent fraud in a hyper-connected banking environment
Synopsis
Fraud is a leading and serious problem in today’s financial system. Fraudulent behavior can significantly harm financial institutions and is difficult to distinguish from legitimate transactions. The growing volume of available data makes it possible to build a range of models for detecting fraud in banking. Various approaches have been proposed, and several factors have drawn increasing attention to fraud detection in financial institutions. Recent findings indicate that newly opened accounts are the main entry point for such activity, since they enable borrowing and withdrawal behaviors that harm the banking system (Balasubramaniam et al., 2024; De Luzi, 2024; Jones & Tyson, 2025). Although bank fraud detection has been studied extensively, significant gaps remain. Once a pattern for a certain kind of fraud has been established, some banking institutions keep applying the same pattern for a long time, and its detection rate declines. Another problem is class imbalance: fraudulent actions are few compared with the huge number of legitimate transactions. For a bank or other financial system that needs a tool to detect fraud automatically, a static solution is not optimal: it works over short periods but not over long timescales, and so leads to suboptimal decisions. Moreover, fraudsters behave rationally and move beyond the fraud patterns the bank already knows, which makes effective static solutions difficult to build. An ideal system would dynamically learn the behavioral space of the fraudsters: a transparent, self-adaptive model that can fit any potential pattern, achieves high predictive performance on the available data, and clearly exposes the discrepancies that signal fraud.
The problem appears to be time-evolving in two respects: the data change over time, and the predictive solution must evolve accordingly. This calls for a solution that can quickly handle abrupt shifts of the data away from their assumed distribution.
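As an illustration of what reacting quickly to an abrupt distribution shift can look like, the sketch below uses the Page-Hinkley test, a standard change-detection technique chosen here for illustration rather than a method taken from this work. It assumes the monitored stream is a per-transaction signal such as model error; the class name, the helper first_drift, and the values of delta and threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""
    delta: float = 0.005     # tolerated drift magnitude (assumed value)
    threshold: float = 5.0   # alarm threshold, often called lambda (assumed value)
    n: int = 0               # number of observations seen so far
    mean: float = 0.0        # running mean of the stream
    cum: float = 0.0         # cumulative deviation from the running mean
    cum_min: float = 0.0     # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Consume one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def first_drift(stream, detector):
    """Return the index of the first alarm in the stream, or None."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Hypothetical error signal: stable at 0.1, then an abrupt jump to 0.9.
    errors = [0.1] * 200 + [0.9] * 200
    print(first_drift(errors, PageHinkley()))
```

In an adaptive pipeline, an alarm of this kind would typically trigger retraining or re-weighting of the fraud model on recent data. A lower threshold reacts faster but raises more false alarms.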