A consideration for any large online business is whether to build a fraud solution internally or buy from a solution provider. We have an opinion but let’s be clear from the outset: we’re biased. We've spent over five years developing tools and techniques that represent the best in class in fraud prevention. We are extremely proud of that.
But we deal with clients that are large and technically competent enough to consider building their own solution. We also work with clients who run a mixture of their own technology and some of ours (the hybrid model).
So, what do we discuss with our friends at these companies when they ask which direction is right? What are the considerations when making this decision?
Is fraud detection a core competency for your business?
Building a good fraud detection system is neither cheap nor easy, so it had better be important. For comparison, very few companies build their own payments processing system even though collecting revenue is core to any online business. So why do some businesses consider building their own fraud detection?
In some cases it is because the effort required is underestimated. It is very easy to build a basic fraud detection system that degrades rapidly.
To build, maintain and support a system is a significant undertaking. So we go back to the core question - why build in-house? Ask yourself:
- Does your business hinge on being able to accurately predict risk?
- Is it a natural extension of existing systems and skills you already have?
- Is the nature of your business or its risk so unique that you have no choice but to build internally?
- Is there a regulatory reason that compels you to do it in-house?
- Is it a competitive advantage in your market to do it in-house?
It would be reasonable to expect the answer to at least one of these questions to be yes before going further. So what other considerations are there?
Do you have sufficient data to create efficient fraud detection models?
We will assume for a moment that any business of scale is going to use machine learning at the core of its fraud detection strategy. Working on this assumption, how much data is enough to feel confident to begin?
At Ravelin we believe (and have proven) that the most predictive data is a merchant’s own. That is why we build bespoke models for each of our individual clients. Any merchant of substantial size (e.g. more than 5 million transactions a year) should have enough data to build some pretty great models.
What they will never have, though, is access to data sets beyond their own.
This is important because the ability to test and tune models in a variety of environments is a key defence against overfitting. It is better to start from a general model and then adapt it to a specific dataset. A single merchant will only ever have the specific model, which can give good but never optimal results.
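As a rough illustration of the overfitting point, here is a minimal Python sketch that compares how much fraud a score threshold catches on the dataset it was tuned on with how much it catches on datasets it has never seen. The function names, the threshold and the toy scores are all assumptions made for this example. They do not describe how Ravelin validates its models.

```python
from typing import Dict, List, Tuple

# Each item is (model_score, is_fraud). The scores are illustrative only.
Scored = List[Tuple[float, bool]]


def fraud_recall(data: Scored, threshold: float) -> float:
    """Share of fraudulent transactions scoring at or above the threshold."""
    fraud_scores = [score for score, is_fraud in data if is_fraud]
    if not fraud_scores:
        return 0.0
    caught = sum(1 for score in fraud_scores if score >= threshold)
    return caught / len(fraud_scores)


def recall_gap(tuning_set: Scored,
               other_sets: Dict[str, Scored],
               threshold: float) -> Dict[str, float]:
    """Recall on the tuning data minus recall on each unseen dataset.

    A large positive gap suggests the threshold (or the model behind the
    scores) fits the tuning data better than it generalises.
    """
    base = fraud_recall(tuning_set, threshold)
    return {name: base - fraud_recall(data, threshold)
            for name, data in other_sets.items()}


# Hypothetical data: one merchant's own history and one unseen dataset.
own_data = [(0.9, True), (0.8, True), (0.1, False)]
unseen = {"other_dataset": [(0.9, True), (0.4, True), (0.2, False)]}
gaps = recall_gap(own_data, unseen, threshold=0.7)
```

With a single dataset there is nothing to put in `other_sets`, which is the limitation described above.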
Matching domain with technical expertise
Any large merchant will have a team familiar with the fraud that the merchant faces every day. Usually in that team some will have brought with them learnings from other companies. Actual fraud expertise is not usually an issue.
What is harder to do is to translate that expertise into data science and do so consistently. Data science teams in e-commerce businesses are generalists, working on pricing algorithms for one project and fraud detection the next. This has definite benefits. Prime amongst them is the ability to keep data science talent engaged as they get to work on a range of problems. What it costs is the data team's consistent application to the problem of fraud.
As Ravelin has matured, one skill we've perfected is the ability to turn fraud insights into millions of tested and validated features and model inputs, at a scale that would be very difficult to reach in-house. This is the result of an investigations and client liaison team working in lock step with a data science team that is permanently focused on the issue of fraud. Sounds straightforward. The secret is motivating a team in the long term if they are only working on a single set of merchant data.
Beyond Version 1.0
It’s easy and fun to ship a prototype, whether in software or data science. What’s much, much harder is making it resilient, reliable, scalable, fast, and secure.
We go into great detail in this blog post about our data science best practices: hard-won knowledge from five years in the trenches. We hope it's useful, but we know it is already out of date compared with where we are now, less than five months after it was written.
When push comes to shove in many organisations, it's highly tempting to see the fraud project as "finished" and to move the key staff off to other priorities. It is just as tempting to let the team's preferred machine learning approach shape the fraud detection approach. For instance, having in-house neural net expertise might predetermine that approach for fraud detection. A key learning of ours is that any single technique quickly hits the limits of its usefulness. The skill is in mixing techniques, and in having the range of expertise to do so.
In the meantime, fraud and fraudsters respond and change; the world moves forwards but your in-house solution does not.
A major consideration for any buy vs build evaluation is how confident you can be of guaranteed budget and resourcing for highly sought-after data scientists in perpetuity.
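To make the point about mixing techniques concrete, here is a minimal Python sketch that blends three kinds of signal into a single risk score: a machine learning probability, a count of triggered expert rules, and a network signal. The `Signals` fields, the weights and the cap on rule hits are hypothetical values chosen for illustration. They are not Ravelin's method, and a real system would tune them against data.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    model_probability: float    # output of an ML model, between 0 and 1 (assumed)
    rule_hits: int              # number of expert-written rules triggered (assumed)
    linked_fraud_accounts: int  # known-fraud accounts linked to this customer (assumed)


def combined_risk(signals: Signals, max_rules: int = 5) -> float:
    """Weighted blend of three signals, returned as a score between 0 and 1.

    The weights (0.6 / 0.2 / 0.2) are arbitrary placeholders.
    """
    # Cap the rule count so one noisy customer cannot dominate the score.
    rule_score = min(signals.rule_hits, max_rules) / max_rules
    # Treat any link to a known-fraud account as a full network signal.
    network_score = 1.0 if signals.linked_fraud_accounts > 0 else 0.0
    score = (0.6 * signals.model_probability
             + 0.2 * rule_score
             + 0.2 * network_score)
    return round(score, 4)


# Example: a middling model score, every rule triggered, one risky link.
example = combined_risk(Signals(model_probability=0.5,
                                rule_hits=5,
                                linked_fraud_accounts=1))
```

The design choice worth noting is that no single input can push the score to 1.0 on its own, so a blind spot in one technique can be covered by another.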
Innovations in fraud
We have focused so far on the most common fraud detection scenario - payment card fraud predicted using machine learning. But this is only part of the fraud picture. Our clients all use Ravelin for at least one additional service on top of that. We describe them briefly here, with links to more information:
- Network Analysis: the instant creation of graph networks showing the relationships between entities in a database. This is vital for investigations analysis. It also boosts ML predictive capabilities by analysing networks.
- Account Takeover Defence: A combination of security checks, data analysis and detection that looks to secure the accounts against the constant breaching efforts of fraudsters.
- Marketplace Fraud: 360-degree analysis of the fraud threat for an online marketplace. From the supplier to the courier to the customer - each element is a potential risk and different techniques are required for this complicated picture.
- Authentication and Acceptance: Increasingly, success in payment is related to how many good payments you can get accepted without friction; not just stopping bad payments. Regulation and legislation are rapidly changing this landscape and the investment to stay on top is daunting.
- Shared datasets: A useful fraud check between similar businesses is to see if certain identifiers have been flagged as fraudulent by other merchants. This could be an email, phone, IP address, or payment method. This is only possible through anonymously shared data via a third party.
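The Network Analysis item in the list above can be sketched in a few lines of Python. This is a minimal example, assuming customers are linked whenever they share an identifier such as a card or a device, and using a union-find structure to group them into connected networks. The record format and the identifier strings are made up for illustration. A production graph system would be far richer.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def linked_groups(records: List[Tuple[str, str]]) -> List[Set[str]]:
    """Group customers connected through shared identifiers.

    records holds (customer_id, identifier) pairs, for example
    ("c1", "card:1"). Two customers land in the same group if a chain of
    shared identifiers connects them.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner: Dict[str, str] = {}  # identifier -> first customer seen with it
    for customer, identifier in records:
        find(customer)  # register the customer even if nothing links to it
        if identifier in first_owner:
            union(first_owner[identifier], customer)
        else:
            first_owner[identifier] = customer

    groups: Dict[str, Set[str]] = defaultdict(set)
    for customer in list(parent):
        groups[find(customer)].add(customer)
    return sorted(groups.values(), key=lambda group: sorted(group))


# Hypothetical records: c1 and c2 share a card, c2 and c3 share a device.
networks = linked_groups([
    ("c1", "card:1"), ("c2", "card:1"),
    ("c2", "device:9"), ("c3", "device:9"),
    ("c4", "card:7"),
])
```

Here c1 and c3 never share an identifier directly, yet they end up in the same network through c2. That indirect link is the kind of relationship that is hard to spot when looking at one account at a time.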
As the nature of fraud attacks evolves, the list of techniques and technologies required to defeat them is endless. This is the core conundrum in the buy vs build decision. It is not a one-time decision. It is an ongoing and significant investment. This is true whichever way you choose, of course. The real decision is which option is likely to result in the best outcome for your business. We’d be happy to have this discussion any time.