---
title: "ML model building: Here's how Ravelin builds machine learning models"
date: 2020-02-24T13:53:00+00:00
author: Jessica Allen
canonical_url: "https://www.ravelin.com/blog/how-do-we-build-a-machine-learning-model"
section: Blog
---

# ML model building: Here's how Ravelin builds machine learning models

A machine learning model can make predictions. And Ravelin builds models that predict fraud, so how do we do it?

![ML model building: Here's how Ravelin builds machine learning models](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/_blogSmall/71633/448-Build-machine-learning-model-blog-images-885x505-01_2025-10-21-161520_sqdm.webp)

Today, we're going to examine closely how Ravelin custom-builds AI-native fraud detection for online companies who want to accept more payments with confidence and serve more customers safely.

Let’s imagine we’re building a [machine learning](https://www.ravelin.com/insights/machine-learning-for-fraud-detection) model to detect fraud for a food delivery business. Our fictional business is called **DeliverDinner.**

When **DeliverDinner** joins Ravelin as a new client, they start to send live transaction traffic to our API.

![](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/448-send-data-to-API.png)

Every time a customer registers, adds an item to their basket, or does anything on the DeliverDinner website, a JSON request is sent to the Ravelin API. This means we store [a lot of data](https://www.ravelin.com/privacy-notice) about DeliverDinner customers and everything they’ve ever done in their account. We bundle these into customer profiles.
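To picture what "bundling into customer profiles" could look like, here is a minimal Python sketch. The event fields (`customer_id`, `type`, `timestamp`) and the `build_profiles` helper are hypothetical and chosen for illustration only; they are not Ravelin's API schema.

```python
from collections import defaultdict

# Hypothetical events. Field names are illustrative, not Ravelin's API schema.
events = [
    {"customer_id": "c1", "type": "register", "timestamp": 1},
    {"customer_id": "c2", "type": "register", "timestamp": 3},
    {"customer_id": "c1", "type": "order", "timestamp": 4},
    {"customer_id": "c1", "type": "add_to_basket", "timestamp": 2},
]


def build_profiles(events):
    """Group events by customer, keeping each customer's events in time order."""
    profiles = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        profiles[event["customer_id"]].append(event)
    return dict(profiles)


profiles = build_profiles(events)
for customer_id, history in profiles.items():
    print(customer_id, [e["type"] for e in history])
```

Each profile is simply the ordered history of one customer, which is the raw material for the labels and features described next.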

**To use this data for machine learning, we need to do three things:**

1. Label the customers as fraudulent/not fraudulent
2. Describe the customers in computer language
3. Train the model

### Step 1: Assign labels

![](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/448-Build-machine-learning-model-blog-images-885x505-03_2025-10-21-163012_tpxq.png)

We look at any customer who has had a fraudulent chargeback or who has been manually reviewed as fraudulent by the merchant and label them as fraud. But we also label good customers as such. The goal is to have a wealth of examples of what a good customer looks like for this company, and also what a fraudulent customer looks like – be they an opportunist first party, professional cybercriminal or anywhere in between.
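As a rough sketch, the labelling rule above can be written as a small Python function. The profile keys `fraud_chargeback` and `reviewed_as_fraud` are hypothetical names, and treating every other customer as genuine is a simplifying assumption made for this example.

```python
def label_customer(profile):
    """Return 1 for fraud, 0 for genuine, based on hypothetical profile flags."""
    if profile.get("fraud_chargeback") or profile.get("reviewed_as_fraud"):
        return 1
    # Assumption for this sketch: anyone without a fraud signal counts as genuine.
    return 0


customers = [
    {"id": "c1", "fraud_chargeback": True},
    {"id": "c2", "reviewed_as_fraud": True},
    {"id": "c3"},
]

labels = {c["id"]: label_customer(c) for c in customers}
print(labels)
```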

### Step 2: Create features

Creating features is equivalent to describing each customer in a way that the computer can understand. We are [using maths and data for fraud prevention](https://www.ravelin.com/blog/protecting-customer-journey-using-maths): in other words, we are using them to describe the characteristics of a customer which indicate whether they are likely to be fraudulent or genuine.

This is based on the same aspects that a fraud analyst would look at to make the decision.

#### Some very simple examples of features which could be good indicators of fraud are:

- **Order rate**: Fraudsters order at a much more rapid pace. We quantify this as the number of orders per week.
- **Email**: Fraudsters might have a dodgy-looking email – for instance, we may quantify this as the percentage of digits in the email address.
- **Delivery location**: It could be somewhere typically genuine/unlikely to be fraud like a penthouse apartment, or it could be somewhere fraudulent that implies a "drop location", such as a park. We quantify this as the location fraud rate %.
- **Card velocity**: The number of different cards used or attempted to be used by a customer within a reasonable amount of time can also be a fraud signal.
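The four example features above can each be reduced to a number. The sketch below shows one way to do that in Python; the function names, input shapes and the seven-day window are assumptions made for illustration, not Ravelin's feature definitions.

```python
def order_rate(order_count, weeks_active):
    """Orders per week; accounts younger than a week count as one week."""
    return order_count / max(weeks_active, 1)


def email_digit_percentage(email):
    """Percentage of characters in the email address that are digits."""
    if not email:
        return 0.0
    return 100 * sum(ch.isdigit() for ch in email) / len(email)


def location_fraud_rate(fraud_orders, total_orders):
    """Percentage of past orders to this location that turned out fraudulent."""
    return 100 * fraud_orders / total_orders if total_orders else 0.0


def card_velocity(card_events, window_start, window_end):
    """Distinct cards used or attempted within the time window (inclusive)."""
    return len({card for ts, card in card_events if window_start <= ts <= window_end})


# Hypothetical customer: 6 orders in 2 weeks, 3 card attempts in the first 7 days.
feature_vector = [
    order_rate(6, 2),
    email_digit_percentage("abc123@x.com"),
    location_fraud_rate(2, 40),
    card_velocity([(1, "A"), (2, "B"), (3, "A"), (50, "C")], 0, 7),
]
print(feature_vector)  # [3.0, 25.0, 5.0, 2]
```

The resulting list of numbers is the kind of description a model can work with.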

There are, of course, several much more elaborate features – yet the idea remains the same: Each supports the overall calculation of how likely a customer is to be a fraudster or abuser.

All features are created as a number, as the model can’t absorb raw text. We build up our features and categorize them into groups. We call these groups megafamilies – and we surface them on the Dashboard as well, to help our merchants know which aspects of a customer's presence are unusual and might indicate fraud.

### Step 3: Train the model

![](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/448-Build-machine-learning-model-blog-images-885x505-04_2025-10-21-163108_kaar.png)

Next, we need to feed the algorithm the data so that it can learn how to solve the problem. At this stage, we feed in the training data.

The training data is a bunch of DeliverDinner data about customers, described in terms of their features and labels to let the algorithm know if they are a fraudster or a genuine customer. This helps the model learn how to tell the difference between genuine and fraudulent customers.

Within DeliverDinner’s dataset, this might show, for instance, that genuine customers tend to order around once a week, tend to use the same card each time, and often have the same billing and delivery address. Fraudsters, by contrast, might order several times a week, use lots of different cards, have cards that failed registration, and have billing and delivery addresses that often don’t match.

The algorithm takes these examples at face value and learns how to check whether a given customer's behaviour and characteristics look more like those in the genuine customer pile or the fraudulent customer pile.

When we show the model a new customer it hasn’t seen before, it compares it with the genuine and fraudulent customers it has seen before and produces a fraud score – a recommendation. Most of the features used to calculate this recommendation are unique to DeliverDinner. However, also taken into account are [consortium features](https://www.ravelin.com/blog/nuanced-consortium-data-at-ravelin), which look at the characteristics of fraudsters across Ravelin's 340+ merchants.

This score represents how likely the new customer is to be fraudulent. On the Ravelin Dashboard, DeliverDinner's fraud analysts can see in detail which families of features contributed to a recommendation, including subfamilies. Below, for instance, the Consortium megafamily contributed 36 of the 98 points, with 33 of those points coming from the Email consortium contributor.

![data scoring](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/consortium-data-scoring-example.png)

## Allow, review or prevent – and how do you decide this?

For the majority of DeliverDinner's customers, the fraud score will be quite low, as there are many more genuine customers than fraudsters.

- When the score is low, Ravelin recommends allowing the customer and the transaction to go through.
- If it’s a medium score, we recommend a Review of the transaction. For example, sending the customer a [3D Secure challenge](https://www.ravelin.com/solutions/3ds-product-3d-secure-server-sdks) to authenticate.
- If the score is very high we’d recommend blocking the customer from making the transaction.
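The three bands above amount to two cut-off points on the score. Here is a minimal Python sketch; the threshold values of 40 and 80 are placeholders chosen for the example, not recommended settings.

```python
def recommend(score, review_threshold=40, prevent_threshold=80):
    """Map a 0-100 fraud score to an action using two hypothetical thresholds."""
    if score >= prevent_threshold:
        return "PREVENT"
    if score >= review_threshold:
        return "REVIEW"
    return "ALLOW"


for score in (10, 55, 98):
    print(score, recommend(score))
```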

Setting the right limits for Allow/Review/Prevent thresholds depends on precision/recall, as is customary in machine learning.

**Precision asks:** Of all the prevented customers, what proportion were fraudsters?

**Recall asks:** Of all the fraudsters, what proportion did we prevent?
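Both questions come down to simple ratios over three counts. The Python sketch below uses made-up numbers: 10 customers prevented, of whom 8 were fraudsters, and 12 fraudsters who got through.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: share of prevented customers who were fraudsters.
    Recall: share of all fraudsters who were prevented."""
    prevented = true_positives + false_positives
    fraudsters = true_positives + false_negatives
    precision = true_positives / prevented if prevented else 0.0
    recall = true_positives / fraudsters if fraudsters else 0.0
    return precision, recall


precision, recall = precision_recall(
    true_positives=8,    # fraudsters we prevented
    false_positives=2,   # genuine customers we prevented by mistake
    false_negatives=12,  # fraudsters we let through
)
print(precision, recall)  # 0.8 0.4
```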

![](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/448-Build-machine-learning-model-blog-images-885x505-05-v3_2025-10-21-163146_rtaj.png)

If your prevent threshold is at 95, you’re blocking a very small percentage of customers. You’d have very high precision – you’re only blocking a few customers that you’re fairly sure are fraudsters. This means you'll have a very low [false-positive rate](https://www.ravelin.com/blog/reduce-false-positives-fraud). However, recall is likely to be low as there are likely to be fraudsters with scores under 95 which you’re not blocking.

Let's look at the opposite situation. If you have a prevent threshold of 5, you’re preventing a huge amount of your traffic and so you’re likely to have very poor precision – and probably end up with lots of false positives. You will have high recall, because you’re going to block most if not all of the fraudsters.

Of course, those are exaggerated numbers – in reality, most fraud managers would not block everyone with a score of over 5, nor would they allow through everyone less than 95.
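To see the trade-off in numbers, the Python sketch below measures precision and recall at prevent thresholds of 5, 50 and 95 on a tiny made-up set of scored customers. The scores and fraud flags are invented for the example.

```python
# Made-up (fraud_score, is_fraudster) pairs for eleven customers.
scored_customers = [
    (2, False), (4, False), (8, False), (15, False), (22, False), (30, False),
    (45, True), (60, False), (75, True), (88, True), (97, True),
]


def precision_recall_at(threshold, scored):
    """Precision and recall if everyone scoring at or above threshold is prevented."""
    tp = sum(1 for score, fraud in scored if score >= threshold and fraud)
    fp = sum(1 for score, fraud in scored if score >= threshold and not fraud)
    fn = sum(1 for score, fraud in scored if score < threshold and fraud)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


results = {t: precision_recall_at(t, scored_customers) for t in (5, 50, 95)}
for threshold, (precision, recall) in results.items():
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, a threshold of 95 gives perfect precision but catches only one of the four fraudsters, while a threshold of 5 catches all four at the cost of blocking five genuine customers. The middle threshold sits between the two extremes.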

But there's a balancing act between the two. **Where you set your thresholds depends on your individual business priorities.** It’s easy to tweak these depending on your risk appetite, your current goals, and whether you are more concerned about chargebacks or false positives.

Sometimes, fraud managers think about fraud detection in terms of "accuracy". Yet, because [AI-native fraud protection](https://www.ravelin.com/blog/ai-fraud-detection-with-ml-and-nlp) such as Ravelin's is based on sophisticated machine learning algorithms, understanding precision and recall and setting risk thresholds are key to assessing the efficiency and success of ML models, and to making sure they are always improving.

Ravelin builds custom fraud prevention for each of our merchants – which involves several models for each merchant, always improving and ensuring we continue to provide the best possible results.


## Author

![Jessica Allen](https://storage.googleapis.com/ravelin-website-assets-production/assets/images/_avatarSmall/3491/Screen-Shot-2019-08-13-at-16.12.52.webp)

Jessica Allen, Head of Content (Ravelin alumna)

Jessica previously served as Head of Content at Ravelin.

[More from this author](https://www.ravelin.com/author/jessica-allen)
