One Step Ahead

A weakly-supervised approach to training robust machine learning models for transaction monitoring

Abstract

In recent years, financial fraud has grown substantially, as the advent of electronic financial services has opened many doors for fraudsters. Consequently, the fraud detection industry has grown significantly in scale, but it moves slowly compared to the ever-changing nature of fraudulent behavior. As the monetary losses associated with financial fraud continue to grow, so does the need for efficient automated decision-making systems. Simple decision rules are often still the industry standard, but they show decent results only in the short term, because smart fraudsters can easily reverse-engineer them. Supervised learning systems have shown promising results as automated fraud detectors, but they are plagued by challenges that are particularly prevalent in this domain. Severe class imbalance, with fraudulent transactions forming a small minority, and fraudsters continually adopting new schemes make training robust and generally applicable machine learning models an arduous task. This work introduces a novel machine learning pipeline that uses carefully selected synthetic samples of the minority class to augment the training dataset of the supervised model. Synthetic samples representing fraudulent transactions are filtered with a novel technique that quantifies their expected performance as adversarial examples, using both data-driven and human-expert-driven methods. By providing the supervised model with high-quality synthetic adversarial examples, we aim to improve its generalizability to previously unseen fraudulent behavior and, in turn, its robustness to the volatile nature of financial fraud. Our results show that weakly-supervised models trained on our augmented datasets detect 7% more fraudulent transactions than a baseline model trained on the standard dataset, at the cost of a 1% increase in false positives. Our calculations further show that applying this system could reduce the monetary losses incurred through financial fraud by one sixth.