Fraud detection is a cornerstone of modern e-commerce, yet it is also one of the least publicized domains in machine learning. That’s for a good reason: it’s an adversarial domain, where fraudsters constantly invent new ways to bypass existing models, and model developers constantly invent new ways to catch them.
The goal of a fraud detection system is to block fraudulent transactions, such as those placed by fake accounts using stolen credit cards, without adding friction to the shopping experience of genuine customers. False negatives (fraudulent transactions that mistakenly passed through the system) result in monetary loss, also known as ‘bad debt’, due to chargebacks initiated by the actual credit card owners, while false positives (genuine transactions that were blocked) result in poor customer experience and churn.
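To make the two error types concrete, here is a minimal Python sketch that tallies them over a handful of made-up orders. Everything in it is an assumption for illustration: the `(is_fraud, was_blocked, amount)` tuple layout, the `Outcome` and `tally` names, and the order amounts are hypothetical, and bad debt is simplified to the face value of the approved fraudulent orders (chargeback fees and other costs are ignored).

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    false_negatives: int  # fraudulent orders that were approved
    false_positives: int  # genuine orders that were blocked
    bad_debt: float       # total amount of approved fraudulent orders


def tally(orders):
    """Count both error types over (is_fraud, was_blocked, amount) tuples.

    The tuple layout is a hypothetical choice made for this sketch.
    """
    fn = fp = 0
    bad_debt = 0.0
    for is_fraud, was_blocked, amount in orders:
        if is_fraud and not was_blocked:
            # Fraud slipped through: counts as a false negative and adds to bad debt.
            fn += 1
            bad_debt += amount
        elif not is_fraud and was_blocked:
            # A genuine customer was turned away: a false positive.
            fp += 1
    return Outcome(fn, fp, bad_debt)


# Made-up sample data, not real transactions.
orders = [
    (False, False, 40.0),   # genuine, approved (correct)
    (True, True, 250.0),    # fraud, blocked (correct)
    (True, False, 120.0),   # fraud, approved (false negative)
    (False, True, 15.0),    # genuine, blocked (false positive)
    (False, False, 60.0),   # genuine, approved (correct)
]

result = tally(orders)
print(result)
```

The sketch keeps the two error types separate on purpose: one is measured in money lost, the other in customers inconvenienced, so folding them into a single accuracy figure would hide the trade-off described above.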
Consider that a modern e-commerce provider may process on the order of tens of millions of orders per day, and that fraud rates are at the sub-percent level, and you start to see why this is a challenging domain. It’s the ultimate needle-in-a-haystack problem, where the haystacks are overwhelmingly large and…