Primary metrics and their BFFs: how to build a test story that holds up

Many experimentation programs default to conversion rate as the primary metric on every test. It’s a habit that feels safe.
It’s also why so many test results fail to tell a story leadership can act on.
Marcela Gutierrez leads digital analytics, experimentation, and consumer behavior at Fossil Group. Her team runs experiments across Fossil’s global portfolio. Her team has built a measurement discipline that produces results that are easier to trust.
The frame they use is simple: pick the metric closest to the change, then give that metric a “best friend” that corroborates the story.
In this article, we’ll walk through how that works, why session-based measurement can distort results, how to pick a primary metric that fits the test, and what a "BFF metric" looks like.
The move to user-level measurement
According to Gutierrez, the first move in maturing a measurement practice is leaving session-based metrics behind.
{{quote}}
A test that succeeds often means more return visits. That means if you only count sessions, the same user shows up multiple times in your results, dampening or masking the test’s true lift.
Switching to user-level measurement means each visitor counts once, and your math better reflects what actually happened.
{{blue-block-1}}
Pick the metric closest to the change
Conversion rate is not the right primary metric for every test. The primary metric should always be the one your change most directly influences.
{{quote1}}
If, for example, you test a filter widget and measure conversion as the primary KPI, you are asking the test to carry every other variable downstream of that filter. Which metric will the change actually move? That’s what will give you a clean read on whether the change worked.
Then, use the next metric down the funnel to confirm the change had a real consumer effect, not just a click. This is what Gutierrez calls the “BFF.”
Every primary metric needs a secondary
A single metric never tells the whole story. The secondary metric is what proves the primary metric meant something.
{{quote3}}
The BFF model produces a small set of metrics that build a narrative, as opposed to “just” an observation. The primary metric tells you the change worked, while the secondary tells you the change mattered.
{{blue-block-2}}
Set the hypothesis before the test
Marcela's team has one non-negotiable rule: always set the hypothesis before the test begins.
Without a stated hypothesis, you can rationalize almost any result. With one, you have a clear standard for what a win looks like, what a loss looks like, and which metrics earn the primary slot.
The team that writes "we believe X will move Y, because Z" has already done the hardest thinking.
Learn more about A/B testing vs hypothesis testing to explore this underlying logic.
{{cta-block1}}
Pivot when you know more
It’s always important to remember that measurement discipline is not static. The frameworks that work for your program today may not work next year. That’s not a bad thing for a team with a culture of learning and experimentation-led growth:
{{quote4}}
The teams that hold this disposition produce better measurements over time. Tests that utilize primary metrics, “BFF” metrics to corroborate them, and hypotheses stated before the test begins hold up under scrutiny.
Once you’ve built a few of those, the program can start speaking for itself.
{{cta-block}}
"When we started the program, we were session-based. I remember hearing someone speak about how session-based measurement was actually hurting experimentation programs, because a successful test would bring back returning consumers, but if you're not measuring at the user level, the math wouldn't add up correctly. We changed our approach immediately, and the difference was clear."

"Our primary metric isn't always conversion rate. If we're testing something higher in the funnel, like filters on a category page, my primary KPI is interactions with the filter. My secondary KPI is whether users are viewing more products. I want to corroborate the story."

"I think of the secondary metric as the primary metric's BFF. If I'm testing a marketing campaign, my primary might be click-through rate, are they coming to the site? My secondary might be bounce rate, are they actually staying, or just landing and leaving?"

"We know as much as we know today, and hopefully tomorrow we'll know more. The moment we do, we pivot, and we're transparent with the business when we do."

Kameleoon's experimentation engine and segments operate at the user level by default, which removes the temptation to fall back to session counting.
A guardrail metric, if you have one, tells you the change did no harm elsewhere. For more on how guardrails fit alongside primary and secondary KPIs, see guardrail metrics in experimentation.
Want to hear more? Marcela Gutierrez discusses primary and secondary metrics, prompt-based experimentation, and CRO analysis on Unite Voices, Kameleoon’s podcast featuring real stories from the people behind today’s most innovative experimentation programs.
Want to hear more? Marcela Gutierrez discusses primary and secondary metrics, prompt-based experimentation, and CRO analysis on Unite Voices, Kameleoon’s podcast featuring real stories from the people behind today’s most innovative experimentation programs.

See how Fossil Group built one of the most disciplined experimentation programs in retail.
See how Fossil Group built one of the most disciplined experimentation programs in retail.



