If you’ve read our blog post on the problem statement vs the hypothesis, you should already know how to set the foundation for a successful experiment. However, analysing your data and results is also one of the most important stages of any experiment.
Unfortunately, our cognitive biases and wishful thinking can often affect how accurately we interpret data. At CCX, we’re experts in experimentation and have fine-tuned our methods for reducing bias throughout the analysis process. This article will lay out some top tips and the process we follow to ensure data is never misinterpreted.
Misinterpreting data: what impact does this have?
To answer this question, let’s first consider the impact that correctly interpreted data can have on a business and its customers. With any experiment, we’re looking to gain valuable insights from the data. These insights tell us what sort of impact an implementation change would have on revenue and other key metrics. Correctly interpreting the data means we can be confident in the insights we have deduced, whether we use them to make changes permanent, fuel future iterations or test a completely new hypothesis.
On the other hand, misinterpreting data can lead you to implement changes that are actually detrimental to a business, its customers and, most importantly, its bottom line. This is only made harder by our innate cognitive biases and our yearning for a successful experiment.
So what is the best approach to analysing your experiments?
At CCX, we’ve analysed hundreds of experiments and typically follow a five-step process. Taking such a thorough approach helps ensure we are confident in the results and learnings we provide to our clients.
The five steps are as follows:
1. Decide on the outcome of your experiment
Very simply put, you need to determine what the result of your experiment was. Did it succeed? Or did it fail?
Of course, the answer isn’t always clear cut. We’ll break down how best to reach a definitive answer in a moment, but this is a very important question to ask yourself. A wrong answer at this stage could result in you implementing a website change that doesn’t end up benefiting your customers or your bottom line.
When should you consider the experiment outcome?
Unlike most other teams running experiments, we like to start thinking about the performance of an experiment before it has even gone live. This helps you define the criteria needed to declare your experiment a success, not only for your primary metric but also for any secondary metrics you are tracking.
Defining your success criteria upfront will make it easier to decide the outcome of your experiment.
Experiment outcome example
Say we’re offering visitors who sign up to our newsletter a 10% discount.
Our primary metric would measure and compare newsletter sign-ups, while our secondary metrics would cover other aspects such as:
- Journey progression
- Conversion rate
- Revenue per visitor
- Engagement metrics with form fields on sign-up
Our results are in and they show that newsletter sign-ups have increased and more users are progressing through the funnel, which has resulted in more conversions. However, our revenue per visitor has decreased because of the 10% discount.
So would this be a success or a failure?
Setting your criteria before an experiment goes live allows you to answer this question quite quickly. For the example above, you might state that in order for the experiment to be classed as a success, there should be an increase in newsletter sign-ups of at least 20% and a rise in conversions of at least 5% to outweigh the reduction in revenue per visitor.
In another scenario, you might acknowledge that whilst a decrease in revenue per visitor is almost unavoidable, you’ll only deem the experiment a failure if revenue per visitor falls below a certain threshold.
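To make this concrete, here is a minimal sketch in Python, using hypothetical thresholds and metric names, of how upfront success criteria might be written down and then checked once the results are in:

```python
# Hypothetical success criteria, agreed before the experiment goes live.
SUCCESS_CRITERIA = {
    "signup_uplift": 0.20,      # newsletter sign-ups must rise by at least 20%
    "conversion_uplift": 0.05,  # conversions must rise by at least 5%
    "max_rpv_change": -0.03,    # revenue per visitor may fall by at most 3%
}

def evaluate_outcome(results: dict) -> str:
    """Compare observed relative changes against the pre-agreed thresholds."""
    success = (
        results["signup_uplift"] >= SUCCESS_CRITERIA["signup_uplift"]
        and results["conversion_uplift"] >= SUCCESS_CRITERIA["conversion_uplift"]
        and results["rpv_change"] >= SUCCESS_CRITERIA["max_rpv_change"]
    )
    return "success" if success else "failure"

# Example: sign-ups +28%, conversions +6%, revenue per visitor -2%
print(evaluate_outcome({
    "signup_uplift": 0.28,
    "conversion_uplift": 0.06,
    "rpv_change": -0.02,
}))  # -> success
```

Because the thresholds are agreed before launch, the same results will always produce the same verdict, however much anyone is hoping the test wins.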
So the earlier the better?
Either way, thinking about the different scenarios that might arise within your experiment and categorising them as successes or failures allows you to decide on the outcome of your experiment more objectively than if you were to simply start considering this question after you’d conducted the test.
In addition, asking this question before the experiment even begins helps you consider scenarios in more depth. This allows you to speculate on the impact your test will have on different audience segments.
For instance, if you were to create a pop-up telling the customer they can get a 10% discount if they sign up to the newsletter, this is likely to work well on desktop, but you might speculate that a pop-up wouldn’t work as well on mobile because they’re often more intrusive and annoying for end users. Knowing this, you might conduct another experiment to see how an embedded form would work on mobile instead.
As you can see, considering different experiment outcomes earlier in the process can impact the depth and variety of your experiments. It also helps prevent the “mining” of positive results during analysis, a practice that often sees businesses declaring winners for limited or irrelevant audience segments.
2. Gather and compile all your data – both quantitative and qualitative
Once our experiments are complete, and before launching head-first into a story about why the results turned out the way they did, we like to take a step back and gather all the data we need to fully understand them.
Data will primarily come from your experimentation and analytics platforms. Using both is usually more effective than relying on the experimentation platform alone, as your analytics platform will capture any metrics the experimentation platform cannot.
In some cases, qualitative research will also run in parallel with your experiments, helping you understand the behaviour of users within each A/B/n variation.
Gathering the data from all of these sources in one central space will give you a broader picture of how your users behaved within the test. Some results might surprise you, which can be a great opportunity to explore your data further. Who knows, it could even lead to further tests and better results.
Naturally, the results you discover will lead you to formulate “stories” that try to unpick the data and create reasons for why you’re seeing certain results. We encourage you to write these thoughts down and refer back to them when you get to step three.
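As a rough illustration, the sketch below (assuming two hypothetical CSV exports, one from your experimentation platform and one from your analytics platform, each with one row per variation) pulls the quantitative data into a single view so it can be read alongside your qualitative findings:

```python
import pandas as pd

# Hypothetical exports: one row per variation in each file.
experiment = pd.read_csv("experiment_platform_export.csv")  # variation, visitors, signups
analytics = pd.read_csv("analytics_platform_export.csv")    # variation, transactions, revenue

# Combine both sources into one table keyed on the variation name.
combined = experiment.merge(analytics, on="variation", how="left")

# Derive the metrics that matter for the analysis.
combined["signup_rate"] = combined["signups"] / combined["visitors"]
combined["conversion_rate"] = combined["transactions"] / combined["visitors"]
combined["revenue_per_visitor"] = combined["revenue"] / combined["visitors"]

print(combined)
```

Qualitative findings such as session recordings or survey feedback won’t fit neatly into a table like this, but keeping a note of them next to these numbers gives you that single, broader picture.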
3. Deriving your “story”
This is where things start to fall into place; this step is almost like piecing together a puzzle. In the third step, you use your data to understand the behaviour of your customers. Referring back to your hypothesis and the changes you made within each variation is vital, and you should also check your conversion and primary metrics against your results.
At this stage, you can look back at your predictions to understand whether they were correct or not. It’s now that you’ll truly be able to understand whether your experiment was a success or failure. Oftentimes, you’ll be able to look at the data and surmise why this is the case.
4. Support your results with common experiment patterns
An experiment’s results will often fall into one of a handful of established patterns. Using these patterns helps support your test result analysis, providing confidence that you have correctly and accurately analysed the experimental data set.
The five most common patterns we see are:
The Outright Winner
This is when you see a positive – and usually significant – increase across all the metrics you tested. For example, an e-commerce website’s funnel would see a positive boost in all parts of the funnel, from acquisition to checkout. This is obviously the ideal scenario for all the experiments you run!
The Outright Loser
This is the opposite of the outright winner. In this scenario, there is a significant decrease across all metrics you tested. Of course, this doesn’t mean your experiment was a complete loss; it would have helped you gain insights about your customers, which could help you test other changes to boost revenue.
The Qualifying Effect
This is often seen when a test filters out users who never had any real intention of making a purchase. In the qualifying-effect scenario, you might see users drop off in the earlier stages of the funnel. Because fewer users with no intention of purchasing progress down the funnel, conversion rates are boosted in the later stages, where only sales-qualified leads remain. As a result, for this pattern you would see overall conversion rates increase.
The Clickbait Effect
This trend is often seen when you have encouraged more users to progress through the journey than would normally do so. It is the opposite of the qualifying effect: the clickbait effect pushes users further down the funnel even though they have no intention of completing the desired action. You usually see a large increase in users at the early stages of the funnel, followed by a sharp drop later on as the unqualified leads leave the website.
The Flat Result
This is arguably the worst pattern you could see. Essentially, a flat result suggests that your changes had no significant impact on your users. Results are often so scattered that you can’t gain any real insight into your users’ behaviour.
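If it helps, these patterns can be expressed as a rough rule of thumb in code. The sketch below is only a heuristic, assuming you have already computed the relative uplift for an early-funnel metric and a late-funnel metric and flagged whether each is statistically significant; it simply labels a result with the most likely pattern:

```python
def classify_pattern(early_uplift: float, late_uplift: float,
                     early_significant: bool, late_significant: bool) -> str:
    """Rough heuristic mapping funnel-stage uplifts to the five common patterns."""
    if not (early_significant or late_significant):
        return "Flat Result"        # no significant movement anywhere
    if early_uplift > 0 and late_uplift > 0:
        return "Outright Winner"    # lifts across the funnel
    if early_uplift < 0 and late_uplift < 0:
        return "Outright Loser"     # drops across the funnel
    if early_uplift < 0 and late_uplift > 0:
        return "Qualifying Effect"  # fewer, but better-qualified, users progress
    return "Clickbait Effect"       # more users enter, then drop off later

# Example: +35% early-funnel clicks, -12% late-funnel conversions, both significant
print(classify_pattern(0.35, -0.12, True, True))  # -> Clickbait Effect
```

In practice you would look at more than two funnel stages, but even a crude view like this makes it harder to talk yourself into a more flattering pattern than the data supports.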
5. Challenge your interpretation
Whilst the easiest route is to interpret your data in a way that supports your original hypothesis, you should do your best to challenge your analysis and theories to ensure you haven’t misinterpreted the results. Getting your evaluations peer-reviewed can be a great way of gaining a fresh perspective; someone from your UX or CX team is often best placed for this, as they are usually highly informed about your customers.
Getting a second opinion can reduce cognitive bias and even shed light on stories or explanations you might not have considered. At CCX, we find that this often leads to further questions and, in turn, more tests.
Of course, we understand that this process isn’t for everyone. If you need any guidance on how best to analyse your experiments, feel free to reach out on LinkedIn or use the contact form on our website.