Detecting Payment Card Fraud with Machine Learning. H2O Driverless AI + Kaggle Dataset

Payment card fraud affects everyone. Almost 30 billion dollars were lost worldwide to card fraud and identity theft in 2019 alone. Although financial institutions are locked in an escalating arms race against cybercriminals and scammers, losses still have to be accounted for. Consumers end up paying for money lost to fraud out of pocket, in the form of vendor and transaction fees, while corporations and governments spend billions more investigating and handling fraud cases.

Modern fraud prevention is expensive. Digital ID checks cost around $2 per document, companies spend millions on KYC and AML, and still the number of fraudulent transactions is growing. To counteract fraud, banks have been relying on passive measures based on past breaches or fraud behaviour history, and only some have invested in proactive or predictive fraud prevention.

Card-based payment systems worldwide generated gross fraud losses of $28.65 billion in 2019, amounting to 6.8¢ for every $100 of total volume. Source: Nilson Report
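
The two figures in the caption also imply a total card volume, which the caption does not state. The arithmetic below is a quick sketch of that derivation; the implied volume is my own calculation from the two quoted numbers, not a figure taken from the Nilson Report.

```python
# Figures quoted in the caption above.
gross_fraud_losses_usd = 28.65e9   # $28.65 billion in gross fraud losses (2019)
loss_cents_per_100_usd = 6.8       # 6.8 cents lost per $100 of volume

# Convert "cents per $100" into a plain fraction of volume:
# 6.8 cents = $0.068, and $0.068 / $100 = 0.00068.
loss_rate = loss_cents_per_100_usd / 100 / 100

# Implied total card volume (derived here, not quoted by the source).
implied_total_volume_usd = gross_fraud_losses_usd / loss_rate

print(f"Loss rate: {loss_rate:.3%}")
print(f"Implied total volume: ${implied_total_volume_usd / 1e12:.1f} trillion")
```

In other words, the losses look small as a percentage only because the underlying volume is in the tens of trillions of dollars.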

To understand what financial institutions can do to improve their fraud prevention efforts, we need to examine the current protection mechanisms. Payment cards that hold a set of credentials or cardholder data act as keys to a customer’s bank account and enable two types of transactions: card-present and card-not-present.

Payment card fraud basics

Card-present is when a payment card is physically used to make purchases or withdraw money from ATMs by entering a PIN. For decades, scammers have been using cameras, sensors, ATM skimmers, and other devices to make copies of cards and extract PINs.

In one case, a waiter was discovered using a portable magnetic stripe reader in his shoe to copy customers’ cards while walking to the register. In another scam, criminals used NFC readers to steal small amounts from people’s cards on the subway. By coming close to pockets and bags, they were able to charge cards without people noticing.

Phone confirmations for larger amounts and RFID-blocking wallets can partially counter card-present fraud. Card-not-present transactions are more complex because they happen remotely: the cardholder does not present the card to a merchant in person. The CVV code on the back of the card is most often used to confirm that the person paying has physical access to the payment card, but 2FA methods such as SMS OTPs (3-D Secure) and in-app authentication are becoming more widespread.

Scammers can intercept OTPs, consumer sessions, cardholder data (PAN, EXP, NAME, CVV), and even steal app credentials. The Lazarus Group from North Korea is notorious for applying military-grade cyber expertise to steal money, using man-in-the-middle software and cloned credit cards to withdraw cash from ATMs.

Banking: old vs. new

PSD2, the revised European Payment Services Directive that covers the whole of the EU and was brought into law in 2018, aimed to fix the lack of an open banking regulatory environment, improve security, and protect customers, among other goals. Before banks started to adopt open APIs, companies had to go through hell to integrate with banks using ancient file exchange systems.

PSD2 standardized how payment and financial institutions interact with each other and with third-party providers. The directive enabled AISPs (Account Information Service Providers) to access information from multiple financial institutions with a customer’s permission. AISP services, for example, can aggregate data from different accounts in different banks and show it to a consumer in one place or application.
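
To make the AISP idea more concrete, here is a minimal sketch of what "aggregate data from different accounts" can look like once the data has been fetched. Everything in it is hypothetical: the Account structure, the bank names, the IBAN placeholders, and the consent flag are illustrations of mine, not part of PSD2 or of any real bank API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Hypothetical account record an AISP might hold after fetching data."""
    bank: str
    iban: str
    currency: str
    balance: float
    consent_granted: bool  # the customer's permission to read this account


def aggregate_balances(accounts):
    """Sum balances per currency, using only accounts the customer agreed to share."""
    totals = defaultdict(float)
    for account in accounts:
        if account.consent_granted:
            totals[account.currency] += account.balance
    return dict(totals)


accounts = [
    Account("Bank A", "DE00-EXAMPLE-1", "EUR", 1200.50, True),
    Account("Bank B", "FR00-EXAMPLE-2", "EUR", 300.00, True),
    Account("Bank C", "GB00-EXAMPLE-3", "GBP", 950.00, False),  # no consent: skipped
]

print(aggregate_balances(accounts))
```

The point of the consent check is that aggregation only ever touches accounts the customer has explicitly opened up to the provider.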

PISPs (Payment Initiation Service Providers) can go a step further and make payments on behalf of consumers. For example, a PISP can automatically pay the utility, internet, and service bills that a consumer receives. Although AISPs and PISPs are still in the early stages of development and adoption, similar initiatives are already being implemented worldwide.

In 5-10 years, open API initiatives will reach their potential and unlock digital banking’s benefits. Truly interactive banking experiences are great, but these changes open the industry to completely new attack vectors that need to be anticipated and prepared for.

Spotting fraudulent transactions using AI & ML

In one of the great weekly newsletters from deeplearning.ai, Andrew Ng, a leading AI expert, mentioned that financial anti-fraud systems broke because consumers changed their behaviour with the global pandemic’s arrival. Models used to predict consumer behaviour, supply, and demand had to be retrained to account for new patterns and spikes.

Let’s assume that we are a financial institution and that a customer’s payment card was compromised during the pandemic. What can we do to spot fraudulent transactions early on? We can take a dataset, label as fraudulent every transaction confirmed by a chargeback or other documented problem, and analyze it to find correlations.
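
As a rough sketch of that idea, the snippet below labels transactions from a list of chargebacks and then measures how strongly each feature correlates with the fraud label. The transactions, the "night" flag, and the chargeback IDs are all made up for illustration, and the Pearson correlation is just one simple way to look for a relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


# Hypothetical transactions; "night" is 1 if the payment happened at night.
transactions = [
    {"id": "t1", "amount": 12.0, "night": 0},
    {"id": "t2", "amount": 40.0, "night": 0},
    {"id": "t3", "amount": 950.0, "night": 1},
    {"id": "t4", "amount": 8.5, "night": 0},
    {"id": "t5", "amount": 61.0, "night": 0},
    {"id": "t6", "amount": 720.0, "night": 1},
]

# Transactions confirmed as fraud through a chargeback (made-up IDs).
chargebacks = {"t3", "t6"}
labels = [1 if t["id"] in chargebacks else 0 for t in transactions]

correlations = {
    feature: pearson([t[feature] for t in transactions], labels)
    for feature in ("amount", "night")
}
for feature, r in correlations.items():
    print(f"{feature}: {r:.3f}")
```

On real data, correlations like these are only a starting point: they show which fields deserve a closer look, not which ones cause fraud.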

For most areas, obtaining a comprehensive dataset is not a problem. However, privacy laws protect banking and transaction data from being disclosed. GDPR in the EU gives customers the right to be forgotten, and Big Tech companies are already being sued for billions for breaching privacy laws. In terms of machine learning, if a consumer asks for their data to be deleted, does the request also apply to the results of calculations based on that data? How far the law reaches will be discussed for years to come.

Raw data

As a result, there are very few datasets with real customer data in the public domain. I used a relatively large 150 MB dataset from Kaggle with hundreds of thousands of anonymized transactions from European credit card users recorded in 2013. Locating useful information in a raw dataset is a very resource-intensive task that usually requires multiple data scientists and analysts.

I tried to approach the situation as a technology executive with a heavy managerial workload who can’t spare a couple of weeks to clean the dataset, spot anomalies, and balance it. I decided to test a relatively new AutoML approach that could take on all of the routine and repetitive tasks that come with in-depth data analysis and extract insights from raw data.

There are many AutoML solutions to choose from today. Giants such as AWS SageMaker, Google AutoML, AutoAI with IBM Watson Studio, Microsoft Azure ML, and Oracle AutoML are complemented by smaller but no less interesting tools: DataRobot, Auto-WEKA, AutoML-Freiburg-Hannover, and H2O Driverless AI.

I chose the last of these, H2O Driverless AI, simply because I could run all experiments on a local server or even a laptop instead of relying on the cloud. Whenever financial data is involved, most regulators restrict its movement to prevent data transfers outside the country or into the cloud. Another important point is that cloud-based solutions often limit the amount of data they can process. Some products restrict tables to one million rows and add other restrictions to encourage users to purchase expensive enterprise-level licenses.

Even though H2O is a commercial project, there’s also a free version that does not have a handy GUI, but that should not be a problem for skilled hands. I used H2O Driverless AI with an educational license because I am in the process of getting a Ph.D.

The dataset itself was a CSV file with only a few readable variables: time, amount, and class – whether a transaction was fraudulent or not. The rest were anonymized to protect the privacy of consumers. This makes it more interesting as we can observe how the system will behave with many unknown variables.
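
Before handing a file like this to any tool, it helps to check how unbalanced it is. The sketch below reads a tiny inline sample that mimics the layout described above (Time, anonymized V columns, Amount, Class) and counts the classes. The rows and values are made up for illustration and the sample is cut down to two V columns; with the real file you would open it from disk instead of using the inline string.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample in the same shape as the dataset described above.
# Class is 1 for a fraudulent transaction and 0 for a legitimate one.
sample_csv = """Time,V1,V2,Amount,Class
0,-1.36,-0.07,149.62,0
1,1.19,0.27,2.69,0
406,-2.31,1.95,0.00,1
472,0.97,-0.18,59.90,0
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Count how many rows fall into each class.
class_counts = Counter(int(row["Class"]) for row in rows)
fraud_share = class_counts[1] / len(rows)

# Columns that are neither Time, Amount, nor Class are the anonymized ones.
anonymized_columns = [c for c in rows[0] if c not in ("Time", "Amount", "Class")]

print(f"Rows: {len(rows)}, fraud: {class_counts[1]}, share: {fraud_share:.2%}")
print(f"Anonymized columns: {anonymized_columns}")
```

In a real fraud dataset the fraud share is far lower than in this toy sample, which is exactly why the balance check matters.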

Raw credit card fraud detection dataset. Source: Kaggle

Setting up AutoML in H2O Driverless AI

I ran the experiments on an IBM System X 3300 M Server with 12 cores, 32 GB RAM, and Ubuntu Linux 18.04 LTS. It’s an old workhorse without a GPU, but it provides a clearer picture of the performance. After importing the dataset into H2O, the system automatically analyzed the type and structure of the data and suggested the best preliminary models, classifiers, and analysis tools based on what’s inside the dataset. In my case, the dataset was highly unbalanced, so H2O recommended the Log Loss scorer.
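
Why does the scorer matter on an unbalanced dataset? The sketch below is my own illustration, not H2O's implementation: it compares plain accuracy with log loss for two made-up sets of predictions on 100 transactions, of which only 2 are fraudulent. One "model" treats everything as legitimate, the other actually recognizes the fraud.

```python
from math import log


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true labels given predicted fraud probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * log(p) + (1 - y) * log(1 - p))
    return total / len(y_true)


def accuracy(y_true, p_pred, threshold=0.5):
    """Share of transactions classified correctly at the given threshold."""
    hits = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) == (y == 1))
    return hits / len(y_true)


# 98 legitimate transactions and 2 fraudulent ones (made-up numbers).
y_true = [0] * 98 + [1] * 2

always_legit = [0.01] * 100          # never suspects anything
informed = [0.01] * 98 + [0.9] * 2   # gives the two frauds a high score

for name, preds in (("always legit", always_legit), ("informed", informed)):
    print(f"{name}: accuracy={accuracy(y_true, preds):.2f}, "
          f"log loss={log_loss(y_true, preds):.4f}")
```

The useless model still reaches 98% accuracy, while its log loss is several times worse than that of the informed model. That gap is the reason a probability-based scorer is a sensible default when fraud is rare.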

Immediately after importing the dataset, H2O showed the problematic and unbalanced areas. After I confirmed a wide variety of settings, the system began to analyze the data. The GUI showed preliminary results during the process, which could be explored and changed before the full analysis was completed. Overall, it took about five days to process the data on my setup.

After completing the experiment, H2O offered a choice of models ready to be deployed on the cloud, servers, or data centres. This enables almost seamless continuous delivery or delivery after pressing a single button. Both options are very beneficial because updating such systems is a complex process that requires specialist skills.

Then, I chose to interpret the model. In other words, what did H2O find in this model while processing it by itself? In another day and a half, the system returned results that showed influence, dependence, importance, or weight of different variables in the dataset.

H2O demonstrated the importance of variable V14, which we need to examine further. The rest of the results consisted of other synthesized cluster features. Using these results, we can go through each feature separately and analyze whether it’s essential or not.
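
H2O has its own interpretation machinery, which I am not reproducing here. To show the general idea of measuring a variable's importance, the sketch below uses a different, generic technique called permutation importance: shuffle one column, and see how much the model's accuracy drops. The toy model, the data, and the role of V5 as a noise column are all made up for illustration.

```python
import random


def toy_model(row):
    """Hypothetical stand-in for a trained model: flags strongly negative V14."""
    return 1 if row["V14"] < -5 else 0


def accuracy(rows, labels):
    """Share of rows where the toy model agrees with the label."""
    return sum(toy_model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, feature, repeats=10, seed=0):
    """Average accuracy drop after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled_values = [r[feature] for r in rows]
        rng.shuffle(shuffled_values)
        # Rebuild the rows with only this one column shuffled.
        shuffled_rows = [{**r, feature: v} for r, v in zip(rows, shuffled_values)]
        drops.append(baseline - accuracy(shuffled_rows, labels))
    return sum(drops) / repeats


# Made-up data: 4 frauds with a very negative V14, 16 legitimate rows.
labels = [1] * 4 + [0] * 16
rows = [{"V14": -8.0 if y else 0.5, "V5": float(i % 3)} for i, y in enumerate(labels)]

for feature in ("V14", "V5"):
    print(feature, round(permutation_importance(rows, labels, feature), 3))
```

A column the model depends on loses accuracy when shuffled, while a column the model ignores shows no drop at all. The same logic applies to synthesized features: each one can be tested separately.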

We can see how the system taught itself, made decisions, approached maximum results, and where it was incorrect. H2O shows different patterns and possible interpretations of different values and how the system sees relationships between other variables and results in the dataset.

To check our hypotheses, we ran two additional experiments in which we excluded the fields that came up in the first experiment, to see whether we would get the same results without them. Overall, the new experiments were successful. Sometimes there were differences in the variables’ influence, and in other cases H2O synthesized new features.

AutoML results

In the end, the true positive rate of the prediction was 0.9733, which is an excellent result. We found that variables V12, V14, and V17 stood out, showing a possible relationship between time, amount, and some merchant attributes, but because the variables are anonymized, we won’t know for sure. H2O then visualized the interpretation in a form that can be used by an executive like me, a data scientist, and other relevant people.
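
For readers who want to see what that rate measures, here is a short sketch. The true positive rate is the share of actual frauds that the model caught: TP / (TP + FN). The counts below are made up so that the ratio lands near the figure quoted above; they are not the experiment's actual confusion matrix.

```python
def true_positive_rate(y_true, y_pred):
    """Share of actual frauds (label 1) that were predicted as fraud."""
    true_positives = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    false_negatives = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if true_positives + false_negatives == 0:
        raise ValueError("no fraudulent transactions in y_true")
    return true_positives / (true_positives + false_negatives)


# Made-up example: 75 frauds, 73 of them caught, plus 1000 legitimate
# transactions that the model correctly left alone.
y_true = [1] * 75 + [0] * 1000
y_pred = [1] * 73 + [0] * 2 + [0] * 1000

tpr = true_positive_rate(y_true, y_pred)
print(f"True positive rate: {tpr:.4f}")
```

Note that this rate says nothing about false alarms on legitimate transactions, so it should be read together with other metrics.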

There are various visualization tools available that can show how variables correlate, how they group in heat maps, and where outliers lie.

Perhaps one of the best parts is that H2O automatically generates a Word .doc report with all of its findings and the lifecycle of the analysis, which I can print out and read at any time. It shows everything the system did: the methods used, how long each step took, how effective it was, the shifts it detected, and the importance of each variable. It would have taken me at least a full week to document the same process manually.

A key takeaway is that, before delegating complex tasks to teams of developers, engineers, and data scientists, it’s worth exploring and demonstrating the capabilities of existing tools and software first. H2O can deliver incredible results without typing a single letter in the command line.

I can turn the model H2O created into a Java or Python application that exposes a set of APIs. When it is fed raw transaction variables, the system shows whether a transaction is fraudulent and how sure it is of that decision. I can then decide whether to allow the transaction to be processed or stop it immediately.
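
The last step, turning the model's confidence into an allow-or-stop decision, can be sketched in a few lines. This is not H2O's scoring API: the Decision structure, the decide function, and the 0.9 threshold are hypothetical, and the fraud probability is assumed to come from whatever scoring application the model was exported to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str               # "allow" or "block"
    fraud_probability: float  # the model's confidence that this is fraud


def decide(fraud_probability: float, block_threshold: float = 0.9) -> Decision:
    """Map a fraud probability to an action using a single (assumed) threshold."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    action = "block" if fraud_probability >= block_threshold else "allow"
    return Decision(action=action, fraud_probability=fraud_probability)


# Made-up scores for two incoming transactions.
for score in (0.12, 0.97):
    print(decide(score))
```

The threshold is a business choice rather than a technical one: lowering it stops more fraud but also blocks more legitimate customers.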

These tools can be used by top management, CTOs, and even marketing departments to generate valuable insights into business operations:

When’s the best time to push notifications about a new product?
Is it when consumers make the most transactions or the opposite of that?
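
Even without AutoML, a first answer to a question like this can come from simply counting transactions per hour. The sketch below assumes that the time variable holds the number of seconds elapsed since the first transaction in the dataset, so the "hours" are relative to that starting point rather than clock time. The timestamps are made up.

```python
from collections import Counter


def transactions_per_hour(elapsed_seconds):
    """Count transactions per hour of the day, relative to the first transaction."""
    return Counter(int(t // 3600) % 24 for t in elapsed_seconds)


# Made-up elapsed times in seconds; 90000 s falls into the second day.
times = [0, 120, 3599, 3600, 7300, 90000]

counts = transactions_per_hour(times)
busiest_hour, busiest_count = counts.most_common(1)[0]

for hour in sorted(counts):
    print(f"hour {hour:02d}: {counts[hour]} transactions")
print(f"Busiest hour: {busiest_hour} ({busiest_count} transactions)")
```

Whether the busiest or the quietest hour is the better moment for a notification is then a question for an experiment, not for the histogram alone.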

AutoML can help find answers and improve business decision making through data analysis.

The solution is not a magic bullet against all fraudulent transactions or the only right method to roll out such a model. The experiment has provided me with enough information about what’s inside AutoML, what I can work with, and what else I can explore. However, it is reassuring that a research team that spent six months working on the same dataset reached the same conclusions as I did in around a week.

New attack vectors

Payment card fraud is limited by card expiry dates, limits, and security notifications. The method explained above can help find and stop fraudulent transactions made by perpetrators, but what if customers unwittingly transfer money by themselves?

In 2019, the executive of a UK-based energy firm thought he was speaking on the phone with his boss, the CEO of the firm’s German parent company, who asked him to send €220,000 to a Hungarian supplier. The caller said the request was urgent, directing the executive to pay within an hour, which he did. Instead of his boss, the executive spoke to a voice recording generated by artificial intelligence-based software that successfully impersonated the CEO.

The live facial biometric data that many digital-only banks rely on to authenticate their customers is not fraud-proof either. Cybercriminals have found a way to build 3D face models from recorded videos and use them to log in by generating head tilts and turns on demand.

Social engineering plays a significant role in modern fraud cases. The man behind the Instagram account www.instagram.com/hushpuppi, which had 2.5 million followers, flaunted his opulent lifestyle and told people they could earn as much as he did by sending him money. He was arrested after stealing over 400 million from individuals and businesses worldwide.

There are many examples of money flippers on social media who promise to turn your $100 into $1000, $500 into $5000, and so on. Suffice it to say that people don’t get their investments back. If the recipient is not blacklisted, has a business, and receives money regularly, training a system to detect this type of fraud is challenging, if not impossible, for now.

Payment card and identity fraud are closely tied to criminal activities that aim to launder money and conceal identities. Modern compliance and anti-money laundering (AML) investigations check social media accounts for suspicious posts and activities. To get around these checks, criminals buy inexpensive accounts created and maintained for a few years to develop a plausible online identity.

People who want to take on a different identity can buy a passport and a new identity with social media accounts, diplomas, and other documents for relatively little money. On one side, it’s easier to obtain a new identity than before. On the other, regulators and service providers are tightening security and making it more difficult to evade their checks.

The minimum cost of a brand new identity. Source: Safety detectives

In an attempt to balance convenience and security, security is losing. Customers don’t like long passwords and additional verification methods. They frankly don’t care if their information leaks because “they have nothing to hide and don’t have that much money anyway.” AutoML has the potential to slow down the advancement of financial fraud, and we’re about to find out for how long.

About the author

Pavlo Sidelov is the CTO at SDK.finance with 15+ years of experience in FinTech. He is a patented inventor, accomplished IT architect, journalist, and the author of the book The World Of Digital Payments. Pavlo is currently working on a PhD in economics and banking.
