In retail media, a big challenge is understanding whether ads actually create new sales, or whether they just take credit for sales that would have happened anyway. Many strategies still rely on "last click" reporting, which gives all the credit to the final ad someone clicks before buying. This often makes results look better than they truly are, especially for brand search ads. To avoid this, we focus on incrementality: would this sale still have happened if we had not shown the ad? We usually use two strategies for this:

1. Full Customer Journey

Instead of only looking at the final click, we analyze the full customer journey. Using Amazon Marketing Cloud (AMC), we can connect ad impressions (who saw which ads) with purchase data (who actually bought). This allows us to understand what happened before the purchase, not just the final interaction. We then compare two groups of shoppers:

- Group A: Shoppers who first saw an upper-funnel ad (such as Sponsored Display or Streaming TV), later searched for the brand, and then bought the product
- Group B: Shoppers who only searched for the brand (and then bought) but did not see those earlier ads

If Group A converts at a higher rate than Group B, it indicates that the earlier ad helped create demand. This approach gives us a much more realistic view of performance than standard ROAS metrics (a sketch of the comparison follows this answer).

2. Using "New-to-Brand" as a Practical Signal

Both Amazon and Walmart provide New-to-Brand (NTB) metrics, which show whether a customer is buying from a brand for the first time. While NTB is not a perfect measure of incrementality, it is a very useful indicator. For example, a campaign may show strong ROAS, but if 90% of its sales come from existing customers, it is likely low-incremental. Campaigns that drive a healthy share of new customers are generally much more incremental and valuable for long-term growth.

Hope this helps! Cheers, Moritz
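To make the Group A versus Group B comparison concrete, here is a minimal Python sketch of the rate comparison, assuming the impression and purchase paths have already been joined in AMC and exported per shopper; the group counts are hypothetical placeholders.

```python
# Minimal sketch of the Group A vs. Group B comparison described above.
# Assumes shopper paths were already joined in AMC and exported;
# all counts below are hypothetical placeholders.

# Group A: saw an upper-funnel ad, then searched the brand, then bought
group_a = {"shoppers": 12_000, "purchasers": 660}
# Group B: searched the brand with no prior upper-funnel exposure
group_b = {"shoppers": 45_000, "purchasers": 1_800}

rate_a = group_a["purchasers"] / group_a["shoppers"]  # 5.5%
rate_b = group_b["purchasers"] / group_b["shoppers"]  # 4.0%

# Relative lift suggests how much demand the upper-funnel ad created.
relative_lift = (rate_a - rate_b) / rate_b
print(f"Group A conversion: {rate_a:.1%}")
print(f"Group B conversion: {rate_b:.1%}")
print(f"Relative lift from upper-funnel exposure: {relative_lift:+.1%}")
```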
Geo-holdout testing is the one way to prove incrementality in retail media: it's the hard-and-fast approach for seeing the true value of my marketing budget. Instead of looking at who clicked an ad, I split the country into two groups of cities. In the test markets I show the ads, and in the control markets I show zero ads. By comparing sales between these two groups, I calculate the sales that occurred only because of the ads. A real-world example is a skincare campaign we ran. In a four-week test, we matched cities with similar populations and shopping habits, ran ads in the test cities, and kept the control cities organic-only. The ads not only performed well but also created a 22% lift in sales, and the actual return on investment worked out to 2.3x.
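Here is a minimal sketch of the lift math behind a test like this, assuming each city group is indexed to its own pre-period baseline to control for size differences; every figure, including the ad spend, is a hypothetical placeholder.

```python
# Sketch of geo-holdout lift, normalizing by each group's pre-period
# baseline. All sales and spend figures are hypothetical.

pre_test, pre_control = 500_000.0, 480_000.0        # 4 weeks before ads
during_test, during_control = 640_000.0, 504_000.0  # 4-week test window

# Index each group to its own baseline to control for size differences.
test_index = during_test / pre_test           # 1.28
control_index = during_control / pre_control  # 1.05

lift = test_index / control_index - 1         # ~22% lift vs. control
incremental_sales = during_test - pre_test * control_index

ad_spend = 50_000.0  # hypothetical media cost in test cities
print(f"Sales lift vs. control: {lift:.0%}")
print(f"Incremental sales: ${incremental_sales:,.0f}")
print(f"Incremental ROAS: {incremental_sales / ad_spend:.1f}x")
```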
The gold standard I used for proving incrementality in retail media was the geo-based holdout test. I chose this approach because it avoided halo-sale over-counting by measuring total business lift at a regional level rather than relying on platform-reported attribution. Instead of tracking individual users, I divided the market into matched geographic regions, served retail media ads in selected cities, and completely suppressed ads in comparable holdout regions. Over a four- to six-week period, I measured total sales across both groups; the difference between them represented true incrementality. The key metric I focused on was incremental return on ad spend (iROAS). Unlike standard ROAS, it isolated revenue that occurred only because of ad exposure. I also closely monitored new-to-brand acquisition as a leading indicator of incrementality. In practice, this method revealed when high platform ROAS masked over-investment. By analyzing total regional sales, including halo effects, I made more accurate budget decisions and optimized media efficiency.
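As an illustration of how iROAS differs from standard ROAS, here is a short sketch; all inputs are hypothetical, and it assumes the exposed and holdout region groups were matched to comparable baselines.

```python
# Sketch contrasting platform-reported ROAS with incremental ROAS
# (iROAS). All inputs are hypothetical; assumes exposed and holdout
# region groups were matched to comparable sales baselines.

ad_spend = 100_000.0
attributed_revenue = 450_000.0  # what the platform reports as ad-driven

# From the geo holdout: total sales in exposed vs. matched holdout regions.
exposed_sales = 2_100_000.0
holdout_sales = 1_950_000.0
incremental_revenue = exposed_sales - holdout_sales

platform_roas = attributed_revenue / ad_spend  # 4.5x
iroas = incremental_revenue / ad_spend         # 1.5x

print(f"Platform ROAS: {platform_roas:.1f}x")
print(f"Incremental ROAS: {iroas:.1f}x")
print(f"Share of attributed revenue that was truly incremental: "
      f"{incremental_revenue / attributed_revenue:.0%}")
```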
The only way I've been able to prove true incrementality in retail media is by stepping outside platform attribution entirely and designing clean holdout tests. Last-click and even "new-to-brand" metrics inside Amazon or Walmart are useful directional signals, but they almost always over-count halo sales when brand demand is already strong. The clearest read we got was from a geo-based holdout test on Amazon. We paused Sponsored Products and Sponsored Display in a set of matched regions for three weeks while keeping pricing, inventory, promotions, and organic content identical. Instead of looking at ad-attributed sales, we measured total revenue lift and category share versus the control regions where ads stayed on. The result was eye-opening: reported ROAS looked great, but only about two-thirds of the sales were truly incremental. That insight helped us reallocate budget toward conquesting and upper-funnel placements that showed higher incremental lift. My advice is to measure blended outcomes like total sales and new-to-brand growth, not just what the platform claims you "drove."
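A sketch of the "pause and measure the gap" arithmetic behind a read like this: the share of attributed sales that was truly incremental is the observed revenue drop in the paused regions divided by what the platform claimed the ads drove. All figures are hypothetical.

```python
# Sketch of the ads-paused holdout read described above. Ads stay ON
# in control regions and are paused in test regions; all figures are
# hypothetical.

# Pre-period weekly revenue (ads on everywhere), used as baselines.
pre_paused, pre_adson = 300_000.0, 310_000.0
# Test-period weekly revenue after pausing ads in the paused regions.
during_paused, during_adson = 272_000.0, 312_000.0

# Difference-in-differences: what the paused regions would have done,
# minus what they actually did, is the revenue the ads were creating.
expected_paused = pre_paused * (during_adson / pre_adson)
lost_revenue = expected_paused - during_paused

# Compare the true loss with what platform attribution claimed.
attributed_revenue = 45_000.0  # weekly ad-attributed revenue, paused regions
incremental_share = lost_revenue / attributed_revenue  # ~two-thirds

print(f"Revenue actually lost by pausing: ${lost_revenue:,.0f}/week")
print(f"Share of attributed sales that were incremental: {incremental_share:.0%}")
```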
To prove incrementality in retail media without over-counting halo sales, we rely on controlled holdout testing rather than platform-attributed ROAS alone. At Edi Gourmet Spice, the clearest signal came from pausing ads on a tightly defined subset of high-intent keywords or ASINs while keeping everything else constant, then comparing organic + paid sales lift against the exposed group. This approach helped us isolate true incremental demand versus customers who would have converted anyway. We also track blended metrics (total sales velocity and conversion rate at the SKU level) rather than ad-attributed sales in isolation. The key insight is that incrementality shows up in net new lift, not in platform attribution dashboards, which tend to over-credit ads for existing demand.
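Here is a minimal sketch of that blended, SKU-level comparison, assuming average daily organic and paid units have been pulled for both the held-out and still-exposed keyword/ASIN groups; all numbers are hypothetical.

```python
# Sketch of SKU-level blended (organic + paid) velocity comparison
# between held-out and still-exposed keyword/ASIN groups.
# All numbers are hypothetical placeholders.

# Average daily units per SKU: (organic, paid) before and during the holdout.
exposed = {"pre": (40, 22), "during": (41, 23)}  # ads kept on
heldout = {"pre": (38, 20), "during": (54, 0)}   # ads paused

def blended(units):
    # Total daily velocity, organic + paid.
    return sum(units)

# If organic absorbs most of the paused paid volume, those paid sales
# were largely non-incremental: shoppers would have converted anyway.
heldout_change = blended(heldout["during"]) / blended(heldout["pre"]) - 1
exposed_change = blended(exposed["during"]) / blended(exposed["pre"]) - 1

print(f"Held-out SKUs, blended velocity change: {heldout_change:+.0%}")
print(f"Exposed SKUs, blended velocity change: {exposed_change:+.0%}")
```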
The only way to understand whether retail media spend is adding new dollars rather than simply harvesting demand you would have captured anyway is to create a clean counterfactual. On Amazon, Walmart Connect, and Instacart we do this by building a holdout or "ghost ad" test. Instead of looking at total attributed sales, which can include halo purchases that would have happened organically, we instruct the platform to withhold ads from a random portion of eligible impressions and treat that group as a control. Amazon and Instacart let you set up an "unexposed" audience where bids are still submitted but the ad is never shown; those ghost bids ensure the control shoppers look exactly like the treatment group in terms of targeting and intent. We run the test for at least two full purchase cycles and measure the difference in product-level revenue and new-to-brand orders between the exposed and unexposed groups. If the treatment group generates 30% more units but only 5% more new-to-brand customers than the control, we know most of the lift is cannibalizing existing demand.

One metric that has been particularly helpful is incremental ROAS using new-to-brand sales as the numerator. Because Amazon and Instacart report whether a shopper is buying your brand for the first time in 12 months, you can calculate how much of your spend is driving true customer acquisition.

In one test for a beverage brand, we turned off all sponsored product ads for 20% of ZIP codes for four weeks while maintaining organic rankings and other marketing. We saw that overall sales only dipped by 4% in the holdout, but new-to-brand orders were 20% lower. That told us the ads were mostly driving trial. We then increased bids for discovery keywords and reduced spend on branded terms, which improved incremental ROAS by 17%.

The key is to design experiments that mirror the bidding and audience logic of the platform, keep them running long enough to iron out day-of-week noise, and focus on metrics like new-to-brand sales or geographic lift rather than vanity attribution totals.
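To illustrate the exposed-versus-unexposed read and the new-to-brand numerator, here is a hedged sketch; the group counts, spend, and order value are hypothetical, and it assumes the two groups are equal-sized or already scaled to match.

```python
# Sketch of a ghost-bid (exposed vs. unexposed) comparison plus a
# new-to-brand (NTB) incremental ROAS read. Hypothetical data; assumes
# the two groups are equal-sized or already scaled to match.

exposed   = {"units": 13_000, "ntb_orders": 2_100}
unexposed = {"units": 10_000, "ntb_orders": 2_000}

unit_lift = exposed["units"] / unexposed["units"] - 1            # +30%
ntb_lift  = exposed["ntb_orders"] / unexposed["ntb_orders"] - 1  # +5%

# A large unit lift paired with a small NTB lift means most of the
# "extra" sales are cannibalized existing demand, not acquisition.
print(f"Unit lift: {unit_lift:+.0%} | NTB lift: {ntb_lift:+.0%}")

# NTB-based incremental ROAS: value only the net-new customers.
ad_spend = 60_000.0
avg_order_value = 30.0  # hypothetical
incremental_ntb_orders = exposed["ntb_orders"] - unexposed["ntb_orders"]
print(f"NTB iROAS: {incremental_ntb_orders * avg_order_value / ad_spend:.2f}x")
```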
We prove incrementality by running SKU-level geo holdouts with matched control regions, not by trusting platform-reported ROAS. The cleanest read came from pausing ads on a small set of ZIP codes while keeping price, promos, and distribution identical. Example: on Amazon, we held out 10 percent of regions for a single hero SKU for two weeks. Reported ROAS stayed strong everywhere, but only exposed regions showed a true 14 percent lift in unit sales versus control. We also tracked new-to-brand rate, which confirmed the lift wasn't just cannibalized organic demand. That combination filtered out halo and over-attribution quickly.

Albert Richer, Founder, WhatAreTheBest.com
Being the Founder and Managing Consultant at spectup, what I've observed while working with consumer brands is that proving incrementality in retail media networks is rarely straightforward, because sales that appear tied to ads often include halo effects from other channels. The key is creating a counterfactual: a way to observe what would have happened without your spend.

One test design that consistently gives a clear read is a geo or store-based holdout experiment. By pausing campaigns in selected regions while leaving others active, you can see true lift versus baseline sales. I remember running this for a growth-stage brand advertising on Amazon and Walmart simultaneously. We held out 10 percent of stores and monitored both paid and total category sales. Instead of just looking at reported ROAS, we focused on household penetration and new-to-brand purchases over four weeks. The holdout regions showed nearly flat category sales, while active regions had a noticeable lift in incremental buyers. This immediately flagged that some "sales" attributed to Amazon were actually coming from Walmart or in-store activity.

A metric I found particularly revealing was incremental penetration rate: the percentage of new buyers directly attributable to the campaign versus total category buyers. It separates true new-customer acquisition from repeat purchases that would have happened anyway, which is where most halo over-counting occurs. At spectup, we treat this as a foundation for budget allocation, because it shows not just efficiency but real business impact.

The insight from that experiment changed our client's strategy: they shifted more spend toward campaigns that genuinely drove incremental households rather than chasing volume that inflated apparent returns. In my experience, the only way to be confident in retail media incrementality is to remove spend somewhere measurable, watch carefully, and measure against the real behavior you care about, not just the last-click report.
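As a rough sketch of the incremental penetration rate calculation described above (buyer counts are hypothetical, and the holdout is assumed comparable in size to the active regions):

```python
# Sketch of incremental penetration rate: the share of category buyers
# who are net-new to the brand because of the campaign. Hypothetical
# data; assumes holdout regions are comparable to active regions.

active  = {"category_buyers": 200_000, "new_brand_buyers": 9_000}
holdout = {"category_buyers": 198_000, "new_brand_buyers": 6_800}

# Baseline new-buyer rate from the holdout, applied to active regions,
# tells us how many new buyers would have appeared without ads.
baseline_rate = holdout["new_brand_buyers"] / holdout["category_buyers"]
expected_new = baseline_rate * active["category_buyers"]

incremental_new_buyers = active["new_brand_buyers"] - expected_new
incremental_penetration = incremental_new_buyers / active["category_buyers"]

print(f"Incremental new buyers: {incremental_new_buyers:,.0f}")
print(f"Incremental penetration rate: {incremental_penetration:.2%}")
```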
To prove incrementality in retail media networks, I rely on geo-split or audience-holdout tests instead of platform attribution alone. One clean method is suppressing ads for a matched control group while tracking baseline sales. The clearest metric is lift versus holdout, not ROAS. In one test, Amazon ads showed strong platform ROAS but only modest incremental lift. That insight helped reallocate spend to channels driving net new sales instead of halo overlap.
The simplest way to prove incrementality in retail media is to use a true control group. Without a control, it is very easy to over-count sales that would have happened anyway.

The clearest test design is a geo split test. You turn ads off in a small set of similar markets and keep them running everywhere else. Nothing else changes: same pricing, same promos, same inventory. After the test, you compare sales between the two groups.

The most useful metric in this setup is incremental revenue, not ROAS. ROAS often looks great because it credits ads for sales that were already going to happen. Incremental revenue shows what the ads actually added. For example, if markets with ads generate $120,000 in sales and markets without ads generate $100,000, the real lift is $20,000. That number is what matters.

The key takeaway is simple: if you want real answers instead of inflated numbers, always test against a control and focus on lift, not attribution.
I'll be direct: the cleanest incrementality read I've seen comes from geo-holdout testing with a 4-6 week pre-period baseline, but you need to slice your control and test markets by fulfillment velocity, not just demographics. Here's why that matters, from what we've observed working with brands across Fulfill.com's network.

Most brands make a critical mistake when testing retail media incrementality. They match test and control markets by population or purchase history, but they ignore fulfillment speed differences. We worked with a supplement brand testing Walmart Connect ads who initially showed a 340% ROAS. Looked incredible. But when we dug into their fulfillment data, their test markets had 15% faster delivery times than control markets due to warehouse proximity. That speed difference was driving 30-40% of their attributed lift, not the ads. They were massively over-counting incrementality.

The test design that gave us the clearest read uses geographic splits with fulfillment-matched controls. You need markets where delivery speed, warehouse proximity, and inventory availability are nearly identical between test and control groups. Then run your retail media campaigns only in test markets for 6-8 weeks. The key metric is incremental sales per impression normalized by fulfillment capability. We call it velocity-adjusted incrementality.

Here's a specific example: a home goods brand tested Amazon DSP with 20 matched metro pairs. Standard demographic matching showed 280% incremental ROAS. When we re-matched those same metros by Prime delivery speed and FBA warehouse distance, true incrementality dropped to 165% ROAS. Still profitable, but nowhere near their initial read. The halo effect they thought was ad-driven was actually fulfillment-driven. Customers in faster-delivery markets naturally convert higher regardless of ad exposure.

The other critical piece is purchase sequence tracking. We've found that 25-35% of attributed retail media sales are actually halo sales from customers who would have bought anyway but clicked an ad during their research phase. To isolate this, track new-to-brand versus repeat purchaser conversion rates separately in your holdout test. If your repeat purchaser lift is significantly higher than new-to-brand lift, you're likely over-counting halo.

The bottom line from our data: most brands overestimate retail media incrementality by 40-60% because they ignore fulfillment variables and halo contamination.
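To show what fulfillment-matched pairing might look like in practice, here is a simplified sketch; the metro data, nearest-neighbor pairing rule, and lift math are illustrative assumptions, not Fulfill.com's actual methodology.

```python
# Simplified sketch of fulfillment-matched geo pairing: pair test and
# control metros by average delivery days before computing lift, so
# fulfillment speed doesn't masquerade as ad effect. Illustrative only.

# (metro, avg_delivery_days, weekly_sales) -- hypothetical data
test_metros    = [("A", 1.2, 120_000), ("B", 2.1, 95_000), ("C", 3.0, 70_000)]
control_metros = [("X", 1.3, 104_000), ("Y", 2.0, 90_000), ("Z", 3.1, 66_000)]

# Greedily pair each test metro with the control metro closest in
# delivery speed, then compute lift only within matched pairs.
controls = list(control_metros)
lifts = []
for name, days, sales in test_metros:
    match = min(controls, key=lambda c: abs(c[1] - days))
    controls.remove(match)
    lift = sales / match[2] - 1
    lifts.append(lift)
    print(f"{name} vs {match[0]} (delivery gap {abs(match[1] - days):.1f} "
          f"days): lift {lift:+.1%}")

print(f"Velocity-adjusted average lift: {sum(lifts) / len(lifts):+.1%}")
```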