Financial services customers are operating with greater speed and agility in the payments space than ever before, driven by the growing volume of card-not-present transactions, accelerated adoption of contactless payments, and new financing options at the point of sale, among other key trends.
As transactions become digital, opportunities for financial crime increase and fraudulent payments create long-term risks. First, fraudulent transactions impact payment companies’ bottom line in a sector experiencing slimming margins. Second, regulations intended to protect consumers are shifting ownership of reducing fraud to the payment service providers.
Regulations such as PSD2 in Europe mandate transaction risk analysis (TRA), to further reduce fraud on remote payments where the customer is not physically present. Therefore, financial institutions (FIs) can take a proactive approach to payments authentication to combat fraudulent activity and mitigate these ongoing risks.
As the payments landscape evolves and fraudsters improve their methods, the way to keep one step ahead is to analyse all available data, both historical and real time, and apply machine learning tools to distinguish legitimate transactions from illegitimate ones. Modern fraud-prevention solutions must include dynamic rules and be real time, self-improving, easy to maintain, and scalable. Whether to comply with new regulation, protect consumers, or preserve margins, FIs are looking for better ways to identify and prevent payment fraud in an ever-evolving digital payment landscape.
Finextra Research speaks to Mark Smith, worldwide head of business and market development for payments, Amazon Web Services (AWS) and R. Whitney Anderson, CEO and co-founder of Fraud.net. Here they discuss payment trends and strategies for leveraging machine learning, data lakes, and analytics to help FIs provide a frictionless customer experience, while also preventing illegitimate transactions and protecting consumers—as well as their own bottom line.
How to cut costs by scaling fraud prevention
Preventing fraud is important. However, being overly conservative also has consequences for consumers, merchants, and issuing banks—including false positives where transactions are flagged as fraudulent and declined, when they shouldn’t be. The solution requires data analytics at scale and machine learning, but FIs must also strike a balance concerning friction in the customer experience. Smith highlights that across the payments ecosystem, “What we’re hearing from customers is that transaction risk analysis is a step in the right direction, as long as it doesn’t cause any delays in the shopping experience.”
Fraudulent purchases are being prevented, even before checkout, via the use of historical and transactional data in real time. “Payments processors and banks can use the low-risk TRA PSD2 exemption to avoid authenticating customers,” says Smith. “But the ability to authenticate with behavioural biometrics ensures a better and secure customer experience, compared to the addition of a one-time password or knowledge-based assessment, which are components of Strong Customer Authentication (SCA) and PSD2.”
Working from real-time and historical data points, machine learning algorithms can detect transactions that are highly likely to be fraudulent and prevent them from being approved, while simultaneously reducing the number of false positives. Smith points out, “Migrating on-premises databases to the cloud lets machine-learning models and data analysis help deliver the same performance as legacy systems, at half the cost.”
“FIs are using machine learning to reduce the number of false positives and decrease friction,” he notes. “It’s refreshing to see companies starting to find the balance between protecting the customer and their bottom line, with a good customer experience and acknowledgement that false positives cost money, impact customer engagement, and—subsequently—impact loyalty.”
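The balance Smith describes, between blocking fraud and avoiding false positives, can be illustrated with a minimal sketch. It assumes a model that outputs a risk score between 0 and 1 for each transaction; the scores, labels, and thresholds below are hypothetical values chosen for illustration and do not come from AWS or Fraud.net.

```python
from dataclasses import dataclass


@dataclass
class ThresholdResult:
    threshold: float
    fraud_caught: int      # fraudulent transactions that were declined
    false_positives: int   # legitimate transactions that were declined
    missed_fraud: int      # fraudulent transactions that were approved


def evaluate_threshold(scores, labels, threshold):
    """Count outcomes when every transaction scoring >= threshold is declined."""
    caught = false_positives = missed = 0
    for score, is_fraud in zip(scores, labels):
        declined = score >= threshold
        if declined and is_fraud:
            caught += 1
        elif declined:
            false_positives += 1
        elif is_fraud:
            missed += 1
    return ThresholdResult(threshold, caught, false_positives, missed)


# Hypothetical risk scores (0 = safe, 1 = risky) and known outcomes.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [False, False, False, False, True, False, False, True, True, True]

for threshold in (0.3, 0.5, 0.7):
    print(evaluate_threshold(scores, labels, threshold))
```

On this toy data, the lowest threshold declines every fraudulent transaction but also declines three legitimate ones, while the highest threshold declines no legitimate customers but lets one fraudulent payment through. Choosing the threshold is therefore a business decision as much as a modelling one.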
How to fight fraud with scalable and flexible infrastructure
Fraud.net, the leading crowdsourced fraud prevention platform that runs on AWS, leverages a machine learning model’s ability to help drive more accurate predictions, saving time and money. Anderson explains that machine learning can help reduce complexity, make sense of emerging fraud patterns and correlations, and answer questions that would have taken humans too long to answer.
Machine learning can also counter increasingly varied and evolving forms of fraud. FIs are now acknowledging the need to build and train more targeted, precise machine learning models that reduce financial risk by detecting and preventing fraud while maintaining fast response times.
Data lakes can support identity authentication with passive biometrics and protect customers from fraud. When running a data lake, institutions activate a broad suite of machine learning tools and optimised payment fraud detection algorithms.
This helps FIs provide sufficient protection, while reducing false alerts. Smith explains that amid recent shifts in the economy, the banks, processors, and large payment networks that have been ramping up fraud prevention have done so by each building a data lake and utilising structured and unstructured data, to support behavioural biometrics or machine learning in making quick and accurate fraud decisions.
As Smith explores, such advanced measures also create a good customer experience: “Here, we can compare companies that are still doing batch-transaction fraud analysis and using straight fraud scores to remediate fraudulent transactions with the likes of Fraud.net that are using real-time data, constantly updating their machine learning algorithms and models to stay ahead of fraudsters—and ultimately protect their customers.
“The cost of false positives, driven by customer service, customer trust erosion, and chargebacks, is significant. The technology companies that do this well are using machine learning on the cloud and are able to spin up distributed training environments on demand with services like Amazon SageMaker, which allows FIs to retrain their fraud detection models with the most recent data and react to the changing environment or changing customer behaviours. Distributed training increases efficiency and speed of getting models into production.”
Anderson notes that the initial challenge for FIs is siloed data in different parts of the bank that do not communicate or collaborate with one another. “In every engagement, there’s a necessary data-discovery and data-orchestration stage where, first, we merge our proprietary data on one billion identities, understand transactional history, and understand behaviours leading up to a transaction.” He elaborates, “While it is sometimes a months-long data organisational effort, what it does is tees up a good machine-learned model.”
As explored in AWS’ recent whitepaper Machine Learning: Best Practices in Financial Services, with cloud and machine learning, traditional data points can be combined with non-traditional data points, like biometrics. This not only accurately authorises the purchase, but also authenticates and confirms the identity of the user behind it, distinguishing legitimate transactions from illegitimate ones.
How to perform data analytics at scale
Data lakes of massive scale can also support data storage and the processing of billions of events per day, which is significant—as reacting quickly to new information and leveraging timely insights is of paramount importance.
Anderson believes that “the bank’s interaction with a customer starts upon arrival on the web application or the webpage. So, measuring and tracking behaviours from that very first page load all the way through to their last transaction, account change request, or post-transaction financial event is critically important.”
He adds: “Banks, currently using ageing systems, have a difficult time both capturing the data across a customer’s life cycle and then merging it to make more intelligent, informed, and accurate decisions.” He continues, “The cloud is the only obvious way of merging all that siloed data, making sense of it, and giving the bank’s customers greater safety against fraud and, ultimately, much better user experiences.”
Managed cloud services can be used to build, train, and deploy fraud prevention models to production, so that the detection of potentially fraudulent activity can be automated once existing patterns and correlations have been identified.
How to provision and operationalise ML workloads
Regarding fraudulent payments creating long-term risks, Anderson and Smith agree that fraud evolves at the same rate as the technologies used to combat it, and that some fraudsters outpace the technology, so companies are taking action to ensure that illegitimate payments are mitigated. “Machine learning fraud prevention is being made more accessible by managed services. And cloud providers like AWS are supporting FIs from an engineering and business perspective,” states Smith.
“It’s about doing our best to constantly stay ahead of fraudsters and bad actors by using data in a more sophisticated way, with machine learning, and continually updating your models with the best and most recent data,” Smith states. Furthermore, as the number of remote transactions increases and new vectors open up that are attractive to fraudsters, it is evident that fraud prevention is an ongoing and uphill battle.
Figure: A typical machine learning lifecycle (Source: Machine Learning: Best Practices in Financial Services)
To address the ongoing risks of fraudulent payments, business problems must be framed as machine learning problems, with defined inputs and outputs. After gathering success metrics, key performance indicators (KPIs), and information on any regulations that may dictate the validity of the model, organisations must collect input data from various sources.
Anderson says that when an FI is handling remote customers, all they have is data, and as fraud also takes “hundreds of different forms and pathways and fraudsters exhibit many different types of behaviours,” banks must be rigorous in capturing as much data as they can, at every point of the customer lifecycle. Then a human must label which records are fraudulent. Only after this does it become a data science exercise, in which those specific behaviours can be prevented with a machine learning model.
This is why it is extremely important to be aware of the type of fraud that needs to be mitigated. Without labelled data, FIs are limited to unsupervised models. Supervised learning, which requires labels, is the most effective way of detecting certain types of fraud and is what allows specific behaviours to be stopped.
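The difference between the two approaches can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the method used by Fraud.net or AWS: the unsupervised check flags statistically unusual amounts without any labels, while the supervised step learns a cut-off on a single feature from labelled examples. The feature names (`amounts`, `failed_logins`) and all values are hypothetical.

```python
from statistics import mean, pstdev


def unsupervised_outliers(amounts, z_cutoff=2.0):
    """No labels needed: flag amounts far from the mean (simple z-score rule)."""
    mu, sigma = mean(amounts), pstdev(amounts)
    if sigma == 0:
        return [False] * len(amounts)
    return [abs(a - mu) / sigma > z_cutoff for a in amounts]


def fit_stump(values, labels):
    """Labels required: pick the cut-off on one feature that misclassifies
    the fewest labelled examples (a one-level decision tree)."""
    best_cut, best_errors = None, len(values) + 1
    for cut in sorted(set(values)):
        errors = sum((v >= cut) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut


# Hypothetical data: transaction amounts, failed logins before each
# transaction, and human-assigned fraud labels.
amounts = [20, 25, 30, 22, 28, 24, 26, 500]
failed_logins = [0, 1, 0, 3, 4, 0, 5, 1]
labels = [False, False, False, True, True, False, True, False]

print(unsupervised_outliers(amounts))    # flags only the unusually large amount
print(fit_stump(failed_logins, labels))  # learned cut-off on failed logins
```

In this toy data the unsupervised rule flags the one large payment, which the labels say was legitimate, and misses the three fraudulent transactions with ordinary amounts. The labelled examples reveal a behaviour (repeated failed logins) that an amount-based outlier rule cannot see.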
The AWS whitepaper highlights that this data collection role is “often performed by engineering teams familiar with big data tools for data ingestion, extraction, transformation, and loading (ETL),” so the data is versioned and the lineage of the data can be tracked for auditing, as Smith noted earlier.
Anderson explores this further, stating, “An extremely important first step in creating a good machine learning model is, upon ingestion, creating new features and shedding irrelevant ones that do not correlate to fraud.”
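As a rough illustration of the step Anderson describes, the sketch below derives one new feature at ingestion and then keeps only the features whose correlation with the fraud label exceeds a cut-off. The field names, the sample records, and the 0.3 cut-off are assumptions made for this example; real pipelines typically use far richer features and selection methods.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def add_features(tx):
    """Create a new feature: the amount relative to the customer's average."""
    enriched = dict(tx)
    enriched["amount_ratio"] = tx["amount"] / tx["avg_amount"]
    return enriched


def select_features(rows, candidates, label="is_fraud", min_abs_corr=0.3):
    """Keep candidate features whose absolute correlation with the label
    meets the cut-off; shed the rest."""
    ys = [row[label] for row in rows]
    kept = {}
    for name in candidates:
        corr = pearson([row[name] for row in rows], ys)
        if abs(corr) >= min_abs_corr:
            kept[name] = round(corr, 3)
    return kept


# Hypothetical labelled transactions.
raw = [
    {"amount": 40.0, "avg_amount": 40.0, "day_of_month": 3, "is_fraud": 0},
    {"amount": 55.0, "avg_amount": 50.0, "day_of_month": 27, "is_fraud": 0},
    {"amount": 27.0, "avg_amount": 30.0, "day_of_month": 15, "is_fraud": 0},
    {"amount": 400.0, "avg_amount": 50.0, "day_of_month": 4, "is_fraud": 1},
    {"amount": 300.0, "avg_amount": 40.0, "day_of_month": 26, "is_fraud": 1},
    {"amount": 210.0, "avg_amount": 30.0, "day_of_month": 15, "is_fraud": 1},
]

rows = [add_features(tx) for tx in raw]
selected = select_features(rows, ["amount_ratio", "day_of_month"])
print(selected)  # amount_ratio is kept; day_of_month is shed
```

Here the derived `amount_ratio` correlates strongly with the label and is kept, while `day_of_month` carries no signal in this sample and is dropped.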
Measuring model performance against the identified KPIs, tuning hyperparameters, and incorporating labelled training data can help the model learn more and minimise false positives. However, before deployment, trained models must be tested for performance on historical data, typically with a human in the loop to ensure the model is compliant and meets business goals.
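A pre-deployment check of this kind might look like the following sketch. It assumes a scored set of historical transactions with known outcomes, routes mid-range scores to a human reviewer, and compares the automatic declines against KPI targets. The score bands, the sample history, and the target values are hypothetical and would be set by each institution.

```python
def route(score, approve_below=0.3, decline_at=0.8):
    """Approve low scores, decline high scores, send the rest to a human."""
    if score < approve_below:
        return "approve"
    if score >= decline_at:
        return "decline"
    return "review"


def backtest(history, kpi_targets):
    """history: list of (score, is_fraud) pairs from past transactions.
    Returns the metrics for automatic declines and whether the KPIs are met."""
    decisions = [(route(score), is_fraud) for score, is_fraud in history]
    declined = [is_fraud for decision, is_fraud in decisions if decision == "decline"]
    total_fraud = sum(is_fraud for _, is_fraud in history)
    total_legit = len(history) - total_fraud
    true_positives = sum(declined)
    false_positives = len(declined) - true_positives
    metrics = {
        "precision": true_positives / len(declined) if declined else 0.0,
        "auto_decline_recall": true_positives / total_fraud if total_fraud else 0.0,
        "false_positive_rate": false_positives / total_legit if total_legit else 0.0,
        "review_share": sum(d == "review" for d, _ in decisions) / len(history),
    }
    passed = (
        metrics["precision"] >= kpi_targets["min_precision"]
        and metrics["false_positive_rate"] <= kpi_targets["max_false_positive_rate"]
    )
    return metrics, passed


# Hypothetical historical scores and outcomes.
history = [
    (0.05, False), (0.10, False), (0.20, False), (0.35, False), (0.40, True),
    (0.55, False), (0.60, False), (0.85, True), (0.90, True), (0.95, False),
]
targets = {"min_precision": 0.6, "max_false_positive_rate": 0.2}

metrics, passed = backtest(history, targets)
print(metrics, passed)
```

The `review_share` figure indicates how much of the volume would land with human reviewers, which is worth tracking alongside the accuracy metrics because manual review carries its own cost.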
How FIs can create secure machine learning environments on the cloud
FIs must consider best practices to create a secure machine learning environment on the cloud and, at the same time, support the machine learning model’s ability to understand and drive more accurate predictions in a compliant, well-governed, and secure manner.
Smith summarises four common considerations for FIs as they set up a secure machine learning environment:
1. Compute and network isolation: A well-governed and secure machine learning workflow begins with establishing a private and isolated compute and network environment to control both ingress and egress of data into and out of the environment, as well as control access to only the resources required.
2. Authentication and authorisation: After an isolated and private network environment is created, the next step is to ensure that only authorised users can access the appropriate services. This brings us to authorisation and authentication, where companies and customers can grant explicit permissions to specify the principal (the ‘who’), as well as the actions or API calls and the resources.
3. Data encryption: Since machine learning environments can contain sensitive data and intellectual property, it’s recommended that data encryption is enabled at rest and in transit.
4. Auditability: For a well-governed, secure machine learning environment, having a robust and transparent audit trail that logs all access and changes to data and models, such as changes to configuration or hyperparameters, is paramount.
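The auditability consideration can be made concrete with a small sketch. The example below is an illustrative, in-memory audit trail in which each entry includes a hash of the previous one, so that later tampering with a logged change can be detected. The `AuditLog` class, its field names, and the sample entries are hypothetical; in practice an FI would more likely rely on its cloud provider's managed logging and access controls than on hand-written code.

```python
import hashlib
import json


def _digest(body):
    """Stable SHA-256 hash of an entry's content."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """Append-only log in which every entry is chained to the previous one."""

    GENESIS = "0" * 64
    FIELDS = ("actor", "action", "details", "prev_hash")

    def __init__(self):
        self.entries = []

    def record(self, actor, action, details):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {"actor": actor, "action": action,
                "details": details, "prev_hash": prev_hash}
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Recompute every hash; return False if any entry was altered."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            body = {key: entry[key] for key in self.FIELDS}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.record("data-scientist-1", "update_hyperparameters", {"learning_rate": 0.1})
log.record("ml-engineer-2", "deploy_model", {"model_version": "v2"})
intact_before = log.verify()

# Simulate someone quietly editing a logged hyperparameter change.
log.entries[0]["details"]["learning_rate"] = 0.5
intact_after = log.verify()
print(intact_before, intact_after)
```

Before the simulated edit the log verifies; afterwards it does not, because the stored hash no longer matches the altered entry.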
Anderson notes that “very often, banks and other large organisations expect machine learning to provide magic bullets and solve whatever problem they’re having. It’s important to be very clear about what problem you’re trying to solve before you embark on your machine learning building and testing.” He concludes, “If banks haven’t taken any measures in the past three to four months to harden their controls and implement fraud prevention programs, now’s the time, and the cloud is the only obvious means of executing on new strategies when it comes to fraud prevention.”