How to Benchmark Fraud Performance and Find Hidden Gaps

Fraud teams live and die by their numbers. But which numbers actually matter, and how do you know if they’re telling the full story?

That question sat at the center of Sift’s Blueprint session, How to Benchmark Fraud Performance and Find Hidden Gaps, featuring Jerry Hoff, Founder and CEO of AppSec Training, in conversation with Maria Benjamin, a Trust and Safety Architect at Sift with deep experience at Venmo and Step.

The discussion focused on a challenge that’s deceptively simple on the surface: how do you benchmark fraud performance in a way that actually drives better decisions? As fraud programs grow more complex, spanning payments, account security, and product experience, raw numbers without context can be just as dangerous as no numbers at all.

Below are the biggest takeaways from the conversation.

Benchmarking is about context, not just counting

Benchmarking isn’t simply tracking metrics. It’s about understanding what your numbers mean in relation to your business, your industry, and your own patterns over time.

The goal is to take raw numbers like acceptance rates, chargeback rates, and transaction volume and put context around them. When do customers spend the most? What does a normal week look like versus a holiday? What’s actually happening when a metric moves?

Without that context, you can’t identify your blind spots or understand the tradeoffs you’re making. As Maria put it: “If you have no chargebacks, are you leaving money on the table? Have you made your product really unusable for people?” A 0.4% chargeback rate might be excellent or alarming, depending entirely on your industry and stage.

Most teams track something, but often the wrong thing

Nearly every fraud organization measures some kind of performance metric. The problem is that teams frequently end up chasing moving targets. They solve one fraud problem, declare success, and then discover a different issue is brewing in a metric they weren’t watching closely enough. 

The session’s live poll confirmed this dynamic: 46% of attendees said chargeback rate was their primary KPI, with block rate coming in second. Account takeover rate and false positive rate received almost no votes. That is a telling gap, since executives frequently cite false positives (legitimate customers being incorrectly blocked) as one of their top concerns.

The takeaway: if your fraud team is optimizing for chargebacks alone, you may be missing exactly what leadership cares about most.

Your North Star should match your stage

One of the most useful frameworks from the session was the idea that the right benchmarking target depends on where your business is in its growth journey, and that target should evolve over time.

For early-stage companies, the North Star is often user growth and reducing friction. At that stage, some level of fraud loss may be acceptable. The bigger risk is over-engineering controls before you have product-market fit. As Maria explained: “Getting customers is not my problem once you have product-market fit. That’s when everything else becomes the problem.” The goal in the early days is to get people through the door, learn their behavior, and build trust over time, not to build the most aggressive fraud system on day one.

For mid-market and enterprise organizations, chargeback rate typically becomes the dominant KPI because it’s tangible, reportable to finance, and squarely within the fraud team’s control. But even then, it’s an imperfect metric. Not all chargebacks are fraud. A dispute from a customer who received bad food isn’t a fraud signal, and chargeback rate alone says nothing about the revenue you’re leaving on the table by blocking legitimate users.

Friction placement matters as much as friction amount

The session surfaced a key insight about where to apply friction in the user journey: put checks at the point where a customer is most motivated to complete them, not at the beginning before they’ve experienced your product.

Jerry shared a personal example. A VPS provider asked for passport verification before he’d even spun up his first server. He walked away and signed up with a smaller competitor instead. The more established company may never have known they lost a potentially high-value long-term account over a single friction point at the wrong moment.

Friction is also most effective where customers are most motivated to clear it. “Think about where you’re putting the friction, not just what you’re introducing,” Maria said. Someone who has already deposited funds into an account is far more willing to verify their identity before withdrawing than a brand-new user who hasn’t experienced the product yet. The point where customers are most invested is where friction is both more acceptable and more effective.

Chargeback benchmarking models each have tradeoffs

When it comes to measuring chargeback rate, there’s no single universal standard, and the model you choose shapes what your data can actually tell you.

Three common approaches are:

The Visa model: This month’s chargebacks divided by this month’s transactions. Fast and easy, but less accurate because chargebacks often arrive weeks after the original transaction.

The Mastercard model: This month’s chargebacks divided by last month’s transactions. Slightly better at reflecting real timing, and more commonly used in practice.

The data science model: chargebacks tied back to the specific month the original transaction occurred. The most accurate for understanding cause and effect, but it requires waiting 90 to 180 days for the data to fully settle.
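
To make the difference concrete, here is a minimal Python sketch of the three counting conventions, labeled by their denominators (same month, prior month, and transaction cohort) rather than by network. The function names, the 'YYYY-MM' month strings, and all of the numbers are hypothetical, chosen only for illustration.

```python
from collections import Counter


def previous_month(month: str) -> str:
    """Return the month before a 'YYYY-MM' string."""
    year, mon = map(int, month.split("-"))
    return f"{year - 1}-12" if mon == 1 else f"{year}-{mon - 1:02d}"


def chargeback_rates(transaction_months, chargebacks, month):
    """Chargeback rate for `month` under three counting conventions.

    transaction_months: one 'YYYY-MM' string per transaction.
    chargebacks: one (received_month, original_transaction_month)
        pair per chargeback.
    """
    txns = Counter(transaction_months)
    received = Counter(rec for rec, _ in chargebacks)
    by_origin = Counter(orig for _, orig in chargebacks)

    def ratio(numerator, denominator):
        # None signals "no transactions to divide by".
        return numerator / denominator if denominator else None

    return {
        # Chargebacks received this month / transactions this month.
        "same_month": ratio(received[month], txns[month]),
        # Chargebacks received this month / transactions last month.
        "prior_month": ratio(received[month], txns[previous_month(month)]),
        # Chargebacks traced back to this month's transactions.
        "cohort": ratio(by_origin[month], txns[month]),
    }


# Hypothetical data: volume doubles from February to March.
transaction_months = ["2024-02"] * 1000 + ["2024-03"] * 2000
chargebacks = (
    [("2024-03", "2024-02")] * 6      # February purchases disputed in March
    + [("2024-03", "2024-03")] * 2    # March purchases disputed in March
    + [("2024-04", "2024-03")] * 10   # March purchases disputed in April
)

rates = chargeback_rates(transaction_months, chargebacks, "2024-03")
for name, rate in rates.items():
    print(f"{name}: {rate:.2%}")
```

With these made-up numbers, the same March activity reports as 0.40%, 0.80%, or 0.60% depending on the convention. The cohort figure is only final once late-arriving chargebacks have been counted, which is why it lags the other two.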

The poll showed that nearly half of attendees preferred the data science model. While it’s the most precise, it’s the hardest to use in operational reporting because of how long it takes to validate.

The right model depends on what you need: immediate directional feedback or accurate long-term measurement. Most mature teams need both.

Know your “Super Bowl”: industry seasonality changes everything

Benchmarking doesn’t happen in a vacuum. Every industry has its own version of peak activity, and fraud patterns shift right alongside it.

For iGaming platforms, the Super Bowl is literally the Super Bowl. For food delivery companies, it might be Mother’s Day or a Cinco de Mayo Taco Tuesday. For tax-season fintech platforms, it’s the days after refunds hit.

If you benchmark fraud performance without accounting for these seasonal surges, your numbers will mislead you. A spike in transactions that looks like a fraud attack might actually be a wave of legitimate customers who just received a government payment or discovered your platform through a promotion. Knowing your own “Super Bowl” is what separates teams that investigate intelligently from those that block real customers by accident.
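
One way to put this into practice is to compare a spike against two baselines: recent ordinary weeks, and the same event in earlier years. The Python sketch below is an illustration only. The function name, the 1.5x threshold, and the volumes are assumptions, not figures from the session.

```python
from statistics import mean


def spike_report(current_volume, trailing_volumes, seasonal_volumes,
                 threshold=1.5):
    """Compare current volume against two baselines.

    trailing_volumes: volumes from recent ordinary weeks.
    seasonal_volumes: volumes from the same peak event in earlier years.
    threshold: ratio above which volume is treated as unusual
        (an arbitrary illustrative value).
    """
    trailing_ratio = current_volume / mean(trailing_volumes)
    seasonal_ratio = current_volume / mean(seasonal_volumes)
    return {
        "trailing_ratio": trailing_ratio,
        "seasonal_ratio": seasonal_ratio,
        "unusual_vs_trailing": trailing_ratio > threshold,
        "unusual_vs_seasonal": seasonal_ratio > threshold,
    }


# Hypothetical peak week for a food delivery platform.
report = spike_report(
    current_volume=30_000,
    trailing_volumes=[10_000, 11_000, 9_000, 10_000],  # ordinary weeks
    seasonal_volumes=[28_000, 32_000],                 # same event, prior years
)
print(report)
```

In this example the week is three times the trailing average, which looks alarming on its own, but it is right in line with the seasonal baseline. A surge that exceeds both baselines is the one worth investigating first.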

The goal isn’t zero fraud. It’s maximizing revenue.

Perhaps the most important mindset shift is reframing what fraud prevention is actually for.

Zero chargebacks is not the goal. It can’t be. Even the most sophisticated fraud systems will occasionally let through a determined bad actor, and someone willing to lie will eventually find a way through. Chasing zero means building a system so restrictive that you’re turning away good customers at a rate that quietly erodes your bottom line. As Maria put it: “When you’re implementing a fraud solution, it’s not to actually get rid of all fraud. It’s to increase the revenue at the bottom line.”

A strong fraud program is one that reduces losses while protecting your ability to approve legitimate transactions, retain customers, and scale. When fraud teams internalize that framing and measure accordingly, they stop being the team that says no to everything, and start being a function that makes the business more competitive.
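
The tradeoff can be illustrated with a toy calculation. The Python sketch below assumes a hypothetical risk score between 0 and 1 for each order and treats the full order amount as either revenue gained or fraud loss. A real model would use margins, chargeback fees, and customer lifetime value, and every number here is invented.

```python
def net_revenue(orders, block_threshold):
    """Net revenue when orders scoring at or above the threshold are blocked.

    orders: list of (risk_score, amount, is_fraud) tuples.
    Simplifying assumption: an approved good order earns its amount,
    and an approved fraudulent order loses its amount.
    """
    net = 0.0
    for score, amount, is_fraud in orders:
        if score >= block_threshold:
            continue  # Blocked: no revenue and no loss.
        net += -amount if is_fraud else amount
    return net


# Hypothetical orders: (risk_score, amount, is_fraud)
orders = [
    (0.10, 100, False),
    (0.20, 80, False),
    (0.55, 120, False),  # legitimate, but looks risky
    (0.60, 50, True),
    (0.70, 200, False),  # legitimate, but looks risky
    (0.90, 150, True),
]

strict = net_revenue(orders, block_threshold=0.5)     # zero fraud approved
balanced = net_revenue(orders, block_threshold=0.8)   # one fraud approved
permissive = net_revenue(orders, block_threshold=1.01)  # nothing blocked

print(strict, balanced, permissive)
```

Under these assumptions the zero-fraud policy nets 180, the balanced policy nets 450 despite letting one fraudulent order through, and approving everything nets 300. The zero-fraud policy is the least profitable of the three.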

Building a benchmarking program that lasts

Effective fraud benchmarking requires knowing which metrics matter for your specific business, understanding where your benchmarks come from (internal baselines, industry standards, or both), and tracking performance over time with enough context to distinguish signal from noise.

Tools like Sift’s Fraud Industry Benchmarking Resource and industry resources from the Merchant Risk Council can help fraud teams understand what “good” actually looks like in their category. But no benchmark will tell you what to do. That still requires fraud leaders who understand their product, their customers, and their own data well enough to make smart tradeoff decisions.

That’s what separates teams that are just watching metrics from teams that are actually using them to get better.

Check out more from The Blueprint webinar series.
