Why Your A/B Tests Fail

The Hidden Power of Testing Hygiene

Welcome to the second edition of the Conversion Ledger.

Had a sales call this week with a brand that was several frustrating months into running A/B tests on their site. They’d tried swapping hero images, rewriting headlines, and streamlining checkout flows.

Their ideas for testing were solid. Their designs? Spot on.

However, they were not happy with the results. It had been a mixed bag of inconclusive data, and the few “wins” didn’t seem to translate to real revenue.

You can imagine they were frustrated, right? A sizeable investment with little or no return, at a time when it’s not enough for something to simply have a return; it needs a 5x return or better.

A ton of brands think testing doesn’t work because they’ve tried it, either in-house or with an agency that “also does CRO.” But nothing sticks.

The problem isn’t the ideas or the designs. The issue is testing hygiene.

Testing hygiene is the foundation of successful experimentation. Without it, even the best ideas will fall apart. Let’s dive into the four essential steps to ensure your A/B tests deliver results that move the needle.

1. Define Success Before You Begin

The most common mistake in A/B testing? Launching without a clear hypothesis or a meaningful metric. Without these, you’re just guessing, and guessing doesn’t pay the bills.

Start every test by asking:

  • What problem are we solving?

  • What do we believe will happen if we solve it?

  • What metric will prove we’re right (or wrong)?

Your metric should align with both the test change and your business goals.

For example, if you’re testing a new buy box layout on your PDP, your primary metric should be add-to-cart rate, not visits to the cart. The closer your metric ties to the change, the better. This is where many teams get tripped up: choosing metrics that sound good but don’t matter.

2. Plan for Every Outcome

Another testing pitfall is the lack of a game plan. What happens when the test ends? Ask yourself:

  • If the test wins: What will you implement?

  • If it loses: What’s your fallback?

  • If it’s inconclusive: How will you move forward?

Create an action plan before you launch the test. This removes the temptation to spin the results to fit a narrative. It also prevents “decision paralysis,” which can delay progress by weeks or months.

Good CRO teams know what they’ll do with every possible outcome before the test even starts.

3. Avoid the Mystery Win: Underpowered Tests

So many “winning” A/B tests are really failures because they’re underpowered. If your test doesn’t have enough traffic or data to detect meaningful changes, you’re setting yourself up for disappointment when the “win” doesn’t show up in revenue as expected.

To avoid this:

  • Know your baseline traffic and conversion rates. → If you’re not sure, tag the element you’re testing with an event and gather data for a couple of weeks.

  • Don’t test tiny changes on low-traffic pages. → Focus on impactful changes and high-traffic areas first.

  • Calculate your sample size. → Use an online calculator to determine how much traffic you need for statistical significance (or script the math yourself; see the sketch below).

Testing requires patience. It’s better to wait a week or two to gather enough data than to rush into an underpowered test that leaves you guessing.
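
If you’d rather sanity-check the math than trust a black-box calculator, the standard two-proportion sample-size formula is short enough to script. Here’s a minimal sketch; the 95% confidence and 80% power constants and the example numbers are illustrative assumptions, not a Surefoot standard.

```typescript
// Rough per-variant sample size for a two-proportion test.
// Constants assume a two-sided alpha of 0.05 (z = 1.96) and 80% power (z = 0.84).
function sampleSizePerVariant(
  baselineRate: number, // e.g. 0.03 for a 3% add-to-cart rate
  relativeLift: number  // minimum detectable effect, e.g. 0.10 for a +10% lift
): number {
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeLift);
  const zAlpha = 1.96; // 95% confidence, two-sided
  const zBeta = 0.84;  // 80% power
  const pBar = (p1 + p2) / 2;
  const numerator =
    zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
    zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
  return Math.ceil((numerator ** 2) / ((p2 - p1) ** 2));
}

// Example: a 3% baseline with a +10% minimum detectable lift needs
// roughly 53,000 visitors per variant.
console.log(sampleSizePerVariant(0.03, 0.10));
```

The takeaway: small lifts on low-traffic pages demand sample sizes most stores can’t reach in a reasonable window, which is exactly why the bullets above matter.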

4. Set Stopping Conditions

How do you know when a test is done? Without clear stopping conditions, you risk cutting your test short (or running it way too long). Both lead to bad decisions.

Before you start, decide:

  • How much traffic do you need?

  • What’s your baseline conversion rate?

  • What’s your confidence threshold (e.g., 95%)?

These rules ensure you’re drawing conclusions from reliable data, not gut feelings or premature results. Discipline here is non-negotiable.
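
To make the confidence threshold concrete, here’s what the check looks like as a plain two-proportion z-test, run only once you’ve hit the traffic target you committed to up front. The visitor and conversion counts below are made-up placeholders.

```typescript
// Two-proportion z-test, evaluated only at the pre-registered stopping point.
function zScore(
  controlConversions: number, controlVisitors: number,
  variantConversions: number, variantVisitors: number
): number {
  const p1 = controlConversions / controlVisitors;
  const p2 = variantConversions / variantVisitors;
  const pooled =
    (controlConversions + variantConversions) /
    (controlVisitors + variantVisitors);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / controlVisitors + 1 / variantVisitors)
  );
  return (p2 - p1) / standardError;
}

// |z| >= 1.96 corresponds to 95% confidence (two-sided).
const z = zScore(300, 10_000, 345, 10_000);
console.log(
  Math.abs(z) >= 1.96 ? "Call the result." : "Stick to the plan: inconclusive."
);
```

If |z| clears your threshold, act on the result; if it doesn’t, log the test as inconclusive rather than letting it run “just a little longer.”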

Here’s our favorite tool for doing steps 3 and 4.

The Bottom Line

Bad testing hygiene doesn’t just waste time and money—it erodes trust in the testing process. But the good news? It’s entirely avoidable. With a little planning, patience, and focus, you can turn your A/B tests into a powerful driver of revenue and growth.

Let’s recap:

  1. Define a clear hypothesis and meaningful metric.

  2. Plan your actions for every possible test outcome.

  3. Avoid underpowered tests by knowing your numbers.

  4. Set stopping conditions to ensure reliable results.

When testing is done right, it moves the needle in meaningful ways.

If you’re tired of spinning your wheels with testing that doesn’t deliver, let’s chat. My team at Surefoot specializes in helping e-commerce brands achieve 9–25x ROI through rigorous, results-driven CRO.

Looking forward,

Brian

Test Learnings:

Store: Lovely Skin

➡️ Additional Desktop Transactions: 1.9%

➡️ Increase In Revenue Per Visitor: 2.3%

Here’s what we changed:

We took the green “In Stock” message already on their PDPs and changed its color to “on sale red” anytime the number of items in stock dropped below 20. We also changed the message to show the number left in stock: “Only # left in stock!”

We continued this same treatment on the cart to further drive a sense of urgency.
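
For anyone curious what a treatment like this looks like in the browser, here’s a rough front-end sketch. The data attributes, threshold constant, and red value are assumptions for illustration, not Lovely Skin’s actual code.

```typescript
// Hypothetical sketch of the low-stock urgency treatment described above.
// Selectors, data attributes, and the color value are illustrative assumptions.
const LOW_STOCK_THRESHOLD = 20;

function updateStockMessage(el: HTMLElement, unitsInStock: number): void {
  if (unitsInStock < LOW_STOCK_THRESHOLD) {
    el.textContent = `Only ${unitsInStock} left in stock!`;
    el.style.color = "#cc0000"; // swap in the brand's actual "on sale red"
  } else {
    el.textContent = "In Stock";
    el.style.color = "green";
  }
}

// Apply the same treatment to stock badges on the PDP and in the cart,
// reading the count from an assumed data-units-in-stock attribute.
document
  .querySelectorAll<HTMLElement>("[data-stock-message]")
  .forEach((el) => {
    const units = Number(el.dataset.unitsInStock ?? "0");
    updateStockMessage(el, units);
  });
```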

Quote that stopped me:

"Everything is written in pencil. It will either be obsolete or proven wrong.”

Ric Elias, CEO Red Ventures