Flawed Methodology in Advertising Fraud Study?

I was traveling last week and hadn’t yet had time to respond to the data from this study (you need to submit your info to get it, or see a copy uploaded at http://bit.ly/adxpose; here is the press release) put out by Radar Research and Mpire. Mpire runs the AdXpose service, which gives companies information about where their ads are running and how many users are engaging with them (mousing over, spending time on the page, etc.). I like their service and (full disclosure) we were one of the companies that beta-tested the product and are now paying users, using it as one of several tools we use to check on traffic quality.

Another disclaimer: I used to work with the analysts at Radar Research (we were colleagues at Jupiter Research), think they are good, and would not hesitate to recommend them. In this case, however, even though they did a nice job of putting together the data they had, I think the underlying data and the methodology of the study have some problems that are worth discussing.

My company is also being pretty proactive on a number of fronts in helping track down and combat impression and click fraud in the exchange environment. Because it is an interconnected marketplace and we work with lots of ad networks, publishers, and advertisers, we have shared data with any networks where we have found issues, even when those issues don’t affect a revenue relationship between the two of us. In some cases they have been able to share data back that helped us confirm that certain cases touching several networks come down to a very small number of bad actors. Sometimes pieces like this can blow the issues that do exist out of proportion in terms of scale, and that is the problem I have with this study.

The problems mostly come from the way I infer the traffic was run in the campaigns (infer, since the full details are not provided; I have an email in to the Mpire team and the Radar folks to get more information and will update here if there is anything new to add).

My eye was caught by the sensational headline that “more than half of the impressions delivered and 95% of clicks came from suspected fraudulent sources”. Here is why that is not true (or at least, not representative):

CPC- and CPA-targeted campaigns – attracting the worst traffic: This is the biggest flaw, and a fatal one. At first we had assumed that this study had been done by running exchange campaigns that were CPM in nature, which would give the broadest overview of the traffic among the various sources. If we assume an average CPM of $0.50 for the traffic, the roughly 20 million impressions in the study would cost about $10,000 for a test. Apparently that cost was a bit too much. While the methodology in the report is a little hard to pin down, we did notice where it stated (bold added by me):

Mpire also conducted a test directly on a top ten ad network, with the goal of proving the power of referrer and fraud data to unlock the hidden value of horizontal networks. The impressions were bought on a CPM-basis, rather than on a CPC or CPA basis. The results were revealing. While a large percentage (50%) of the impressions were never within view of a user, the click/impression fraud volume, while substantial, was significantly lower than on the exchange based buys.

So this seems to imply that the exchange buys mentioned above it in the report were NOT done on a CPM basis, but on a CPC or CPA basis. Why is this important? First, while they don’t mention the Right Media exchange as the source, it is probably fair to believe that it is. The optimization system that RMX uses helps campaigns seek out the sites and areas that will best meet their goals. If you create campaigns with CPC targets, you are essentially telling the system to seek out the sites that have the highest clickthrough rates. If you are looking for but not getting conversions, the system will guess (usually correctly) that sites with higher CTRs have a higher chance of generating conversions for you. On the publisher side, if a CPC campaign is getting a high CTR on a publisher’s inventory, that campaign will have a higher effective CPM and will thus win a larger share of that publisher’s traffic. On the other hand, a CPM or dynamic (ceiling bid) CPM campaign may be optimized towards a CPA but will not reward sites excessively based on CTRs if those clicks fail to deliver conversions.
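To make the incentive concrete, here is a minimal sketch in Python of the effective-CPM math that drives this behavior. The CTRs and bid amounts are made up for illustration; nothing here comes from the report or from RMX itself.

```python
# Why CPC-priced campaigns gravitate toward high-CTR inventory: the seller
# ranks campaigns by effective CPM (eCPM), and for a CPC campaign
# eCPM = CTR * CPC * 1000, so inflated CTRs translate directly into wins.
# All numbers below are assumptions for illustration only.

def ecpm_cpc(ctr: float, cpc: float) -> float:
    """Effective CPM of a CPC-priced campaign on a given site."""
    return ctr * cpc * 1000

def ecpm_cpm(cpm: float) -> float:
    """A flat CPM campaign pays the same regardless of CTR."""
    return cpm

site_ctrs = {"normal_site": 0.002, "click_magnet": 0.03}  # hypothetical CTRs
cpc_bid = 0.25   # $ per click (assumed)
cpm_bid = 0.50   # $ per thousand impressions (assumed)

for name, ctr in site_ctrs.items():
    print(f"{name}: CPC campaign eCPM = ${ecpm_cpc(ctr, cpc_bid):.2f}, "
          f"CPM campaign eCPM = ${ecpm_cpm(cpm_bid):.2f}")

# normal_site:  CPC campaign eCPM ≈ $0.50 -- same as the flat CPM buy
# click_magnet: CPC campaign eCPM ≈ $7.50 -- 15x more valuable to that site,
# so an optimizer chasing clicks will shovel delivery toward it.
```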

The message here: your campaign setup may lead you to create a “click magnet” with no countervailing balance from seeking conversions. Especially if your campaign is small and tuned to clicks only (see next point), you will end up not seeking out conversions and exacerbating the effect of any bad sites in the mix, since the system will seek those out and “optimize” based on them! Smart buyers in exchanges understand these effects and steer clear of campaign setups that will seek out these issues. Certainly, you shouldn’t create an experiment and call it representative of the marketplace when your experiment by design will seek out the very thing you are looking to measure – unless you set up the campaign in a completely typical manner (which would probably be dynamic CPM with a CPA target, at a large enough size to actually get some conversions). A better but less sensational headline might be “Beware of setting up CPC-targeted campaigns, because they find lower CPCs which may not be 100% valid clicks,” but jeez, that’s way too clunky.
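To see how quickly this compounds, here is a toy Python sketch of a click-goal optimizer allocating impressions across three sites. The CTRs are invented; the point is just the direction of the effect.

```python
# Toy model of the "click magnet" feedback loop: a click-goal optimizer
# allocates delivery in proportion to observed CTR, so the one bad site ends
# up receiving most of the impressions and nearly all of the clicks.
# All CTRs are hypothetical.

site_ctrs = {
    "content_a":    0.0015,
    "content_b":    0.0020,
    "click_magnet": 0.0300,   # inflated CTR with no real users behind it
}

total_ctr = sum(site_ctrs.values())
# Crude click-goal optimizer: split 100,000 impressions proportional to CTR.
impressions = {name: round(100_000 * ctr / total_ctr) for name, ctr in site_ctrs.items()}
clicks = {name: impressions[name] * ctr for name, ctr in site_ctrs.items()}

bad_click_share = clicks["click_magnet"] / sum(clicks.values())
print(impressions)                                         # ~90% of delivery goes to the bad site
print(f"clicks from the bad site: {bad_click_share:.0%}")  # ~99%
# A campaign built this way will "find" fraudulent clicks almost by definition,
# which is why measuring it and calling the result representative overstates the problem.
```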

Finally on this point, note that in creating a CPC- or CPA-targeted campaign you are perforce likely getting the worst, bottom-of-the-barrel traffic, including very high-frequency impressions (20+ impressions in a user session). Anything that could be sold at a higher CPM rate has been, and whatever is left is the stuff the network/exchange has deemed to have fairly low value. This is the other half of the self-fulfilling prophecy of seeking out the worst possible traffic imaginable and then calling it bad. It’s like gluing together the leftover wood scraps that have fallen on the furniture-maker’s floor and then getting upset when the thing you made looks like it was made out of leftover wood scraps.

Campaign sizes matter, unless you’re not tracking conversions anyway: The exact split per campaign is not in the report, but 20 million impressions over 53 different advertisers is not that much per campaign/advertiser – about 377,000 impressions each, on average. Depending on how the campaigns were set up, there may not have been enough data for individual advertiser campaigns to optimize on conversions. In fact, I’d go out on a limb and say there is little if any discussion of conversions in the report, and it doesn’t really appear that these were even being tracked for this test – setting up 53 conversion pixels is time-consuming, and even tougher if you are running affiliate campaigns, which would most often be the case in a test of traffic quality where you don’t want to offend an advertiser. That is another potential bias that makes the conditions unlikely to reflect a real-world campaign at all.
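For a rough sense of scale, here is a back-of-the-envelope calculation in Python. The CTR and conversion rates are assumptions I am plugging in, not figures from the report.

```python
# Rough check on whether 20 million impressions split across 53 advertisers
# gives each campaign enough conversion data to optimize against.
# The CTR and conversion rate below are assumed, not taken from the report.

total_impressions = 20_000_000
advertisers = 53
assumed_ctr = 0.001         # 0.1% of impressions clicked (hypothetical)
assumed_conv_rate = 0.01    # 1% of clicks convert (hypothetical)

imps_per_advertiser = total_impressions / advertisers
clicks_per_advertiser = imps_per_advertiser * assumed_ctr
conversions_per_advertiser = clicks_per_advertiser * assumed_conv_rate

print(f"impressions per advertiser: {imps_per_advertiser:,.0f}")        # ~377,000
print(f"clicks per advertiser:      {clicks_per_advertiser:,.0f}")      # ~377
print(f"conversions per advertiser: {conversions_per_advertiser:.1f}")  # ~3.8
# A handful of conversions per campaign is far too little signal to optimize
# toward a CPA goal, which supports the guess that conversions weren't really
# being tracked or used in this test.
```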

(Begging the question just a bit) What counts as fraud?: Many media buyers treat high clickthrough rates with little or no conversions as indicative of fraud. We have, however, seen sites that are able to sustain high clickthrough rates that are not fraud. Recently we investigated a site we had flagged for a 3.3% clickthrough rate; it turned out they were smartly using interstitials and ad overlays on top of content, and actually had a lowish but okay conversion rate on conversion campaigns. A few good sites we have worked with routinely have 2%+ CTRs and generate $6+ CPMs running CPA campaigns, based on the actual conversions they deliver. It might be a good idea here to normalize the data (e.g., use a median rather than an average) so that it is not overly skewed by sites with excessively high CTRs. I directionally agree that high CTRs can be indicative of fraud, but given this and the above points, wrapping a headline around 95% of clicks coming from suspected fraudulent sources is a bit disingenuous.
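Here is a tiny Python sketch of why that matters, using invented per-site CTRs: a single outlier site drags the average far above the typical site, while the median stays put.

```python
# Mean vs. median CTR across a handful of sites (all numbers invented).
# One inflated-CTR site skews the mean badly; the median reflects the typical site.
from statistics import mean, median

site_ctrs = [0.0010, 0.0012, 0.0015, 0.0018, 0.0020, 0.0022, 0.0330]  # last one is the outlier

print(f"mean CTR:   {mean(site_ctrs):.4f}")    # ~0.0061, dragged up by one site
print(f"median CTR: {median(site_ctrs):.4f}")  # 0.0018, close to the typical site
# Summarizing with the mean makes the whole buy look click-heavy when a single
# site accounts for most of the skew; a median (or trimmed mean) is more robust.
```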

So, a lot of flaws (there are more, but I don’t have more time). AdXpose (Mpire) should be applauded for working on these issues. That’s a reason we support the product, and we are happy to offer it to our clients as well if they want to use it to check the media buys we do for them or any other buys they are doing. But to claim that this still-somewhat-murkily-set-up study offers a representative blueprint of the state of the online ad network or exchange business and the prevalence of click fraud or impression fraud is ludicrous, and may upon further study actually hurt the efforts many of us are making to improve the industry. We should NOT, however, let this knock us off the target of stopping fraud and making improvements for advertisers – that is our entire mission at CPM Advisors, after all. But over-sensationalizing something with a questionable methodology is not the way to go either.