A/B testing in product management: the best A/B testing tools to build better products
As a product manager (PM), your job is to deliver a product users will love. But like most things in life, that's easier said than done.
Best practices and the latest design trends aren’t always going to work for your users. If your goals are to improve retention, provide a better user experience, and increase revenue, you need to A/B test your way to success.
Last updated: 18 Aug 2022
Reading time: 19 min
There are a few things you need to make this happen: relevant KPIs, access to the necessary user data, cross-functional communication, and the right set of A/B testing tools.
We'll help you with the last of these by showing you how you can use A/B testing tools to develop products that will resonate with users, and how the best A/B testing tools can lead you to your very own ‘Aha!’ moments.
How A/B testing helps product managers and teams build better products
PMs are constantly under pressure to make the right decisions about building and improving their product. Where can you have the most impact? How does each change and decision help your users? How do product improvements impact your business goals?
A/B testing can help you sort out all these questions by testing and learning at every step of the product lifecycle. That way, you can validate and improve every step in the user journey—from activation and onboarding, to conversion and retention.
What is A/B testing?
In product management, A/B tests (also referred to as split tests) are used to identify the best-performing variation of a product feature or update. An A/B test is a statistically valid way of seeing how good or bad your ideas are and which version or variant will perform better in the market.
💡 Keep in mind: you can test ideas using variations of measurable elements of a product, app, or website that can affect a user’s decision to convert.
The point is to see which variation leads to your desired outcome.
The idea could be anything—a landing page, a logo, a color choice, an in-app feature update, the entire user interface (UI), or your product messaging. The desired outcome might be more user engagement, an increase in conversion rate, better user retention, and so on.
In an A/B test, you release two different versions of a feature to random sets of users and then measure what those users do relative to each other. Then, the test uses statistical analysis to determine the version that performs more effectively.
A/B testing is an example of statistical hypothesis testing: you hypothesize about the relationship between two data sets, which are then compared against each other to determine if one shows a statistically significant improvement over the other.
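To make the hypothesis-testing idea concrete, here is a minimal sketch of one common way to compare two conversion rates: a pooled two-proportion z-test. The conversion counts and the 5% threshold are hypothetical, and this is not necessarily the method any particular A/B testing tool uses under the hood.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the best estimate of a shared rate if A and B were identical.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 5,000 users per variation.
z, p_value = two_proportion_z_test(conv_a=400, n_a=5000, conv_b=460, n_b=5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```

A small p-value means a gap this large would be unusual if both versions really performed the same, which is the evidence you need before calling a winner.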
The point of an A/B test is not just to determine how two variations of a product perform. It’s to see how a specific idea will ultimately perform with your entire audience.
Say you want to know how your next 100,000 users might react to your latest feature update. Waiting for and observing all 100k users using the feature isn’t an option—by the time you're done, it'll be too late to optimize their user experience.
What you can do is see how the next 1,000 users react during an A/B test, then use statistical analysis to predict how the other 99,000 will behave. Proper A/B testing can help you make that prediction with a known margin of error, which lets you optimize how you interact with those 99,000 users. This is why A/B testing can be so valuable in product development.
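As a rough illustration of that prediction step, the sketch below puts a confidence interval around a rate observed in the first 1,000 users and projects it onto the remaining 99,000. The adoption count is hypothetical, the Wilson score interval is just one reasonable choice, and the projection assumes the remaining users behave like the sampled ones.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(conversions, sample_size, confidence=0.95):
    """Wilson score interval for a rate observed in a sample."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    rate = conversions / sample_size
    denom = 1 + z**2 / sample_size
    center = (rate + z**2 / (2 * sample_size)) / denom
    margin = z * sqrt(
        rate * (1 - rate) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return center - margin, center + margin


# Hypothetical result: 120 of the first 1,000 users adopt the feature.
low, high = wilson_interval(conversions=120, sample_size=1000)
remaining_users = 99_000
print(f"Estimated adoption rate: {low:.1%} to {high:.1%}")
print(
    "Projected adopters among the remaining users: "
    f"{low * remaining_users:,.0f} to {high * remaining_users:,.0f}"
)
```

The interval narrows as the sample grows, so a bigger test buys you a tighter prediction.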
Why is A/B testing important for product managers?
A/B testing helps your product, website, or app become its best self.
Building the best version of a product takes a lot of decisions, iterations, measurements, and data. It also includes making important changes that impact business performance. A/B testing ultimately takes the risk and guesswork out of making these changes.
With A/B testing, a product manager can test two versions of a new feature, layout, or another product element against each other, by releasing them to a randomly selected segment of their user base—and learning which version users respond to.
You can test individual features to see the effects of changing them without making any permanent commitments, allowing you to predict long-term outcomes and plan your resources accordingly.
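One common way to split users randomly but consistently is to hash a user ID together with an experiment name, so the same user always lands in the same variation. The sketch below assumes string user IDs and a made-up experiment name; dedicated A/B testing tools handle this assignment for you and may do it differently.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("A", "B")):
    """Deterministically map a user to a variant for one experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


# Hypothetical user IDs and experiment name.
counts = {"A": 0, "B": 0}
for user_number in range(10_000):
    counts[assign_variant(f"user-{user_number}", "new-onboarding")] += 1
print(counts)  # roughly an even split
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user in variation A of one test is not automatically in variation A of the next.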
As a PM, A/B testing helps you validate ideas and empower decision-making across the company by allowing you to:
Know if you’re building the right product: get early product feedback from your audience that helps define your roadmap.
Confidently launch new features: test features as rollouts to validate your hypothesis and monitor your key metrics.
Enhance UX across product divisions: test APIs, microservices, clusters, and architecture designs to improve performance and reliability, from mobile devices and apps to IoT and conversational interfaces.
Experiment with price and sorting algorithms: optimize your search results, promotions, and other algorithms to deliver the most relevant content to your users.
Prove the impact of every product release: validate the impact of new features, performance improvements, or backend changes before rolling out to everyone.
A/B testing can help you understand the features and elements that most improve the user experience. Sometimes, even minor changes can have a huge impact. This knowledge will help your team make data-informed design decisions and be more specific in conversations with stakeholders by using phrases like “we know” instead of “we think.”
💡 Keep in mind: over time, a product team that continuously uses A/B tests to measure the impact and effectiveness of each product element will be able to build a product that resonates with the company’s user persona.
Here are a few other common PM issues that proper A/B testing can help you solve:
Sorting through too much data: information overload makes it difficult for you and your team to identify the best place to start. A/B testing lets you focus on key metrics and areas of usage tracking to track your product as it scales.
Relying too much on your gut: product improvements should be based on data, not gut instinct. When you’re running A/B tests regularly, you can measure your team’s theories against real-world data and develop better products.
Not measuring in the right places: if you aren’t collecting and measuring data on the right pages, interfaces, or features, you’re missing out on experimentation insight from your users. A/B testing lets you perform multiple tests in areas that matter (e.g. high-converting landing pages, app features aimed at user retention, drop-off points in your product) to get valuable and relevant insights to make a better product for your users.
Placing too much focus on data: numbers don’t tell the whole story. If you and your team choose to only rely on analytics and skip qualitative insights, you’ll miss out on crucial information that numbers might not tell you—like how your users feel and why they act the way they do. Qualitative A/B testing is an invaluable and often underused process.
Types of A/B testing tools
A/B testing tools let you experiment with product changes and updates. During a test, you can track user behavior and collect product experience (PX) insights to understand how each variation affects the user experience within your product.
A/B testing tools are particularly useful in helping a PM define key metrics, decide how long to run the test, segment which users will see each variant, and determine when statistical significance is reached.
The best A/B testing tools can help you build, scale, and accelerate your experimentation program. Successful product teams use two types of tools as the foundation for feature ideation and development: quantitative tools based on hard, numerical data; and qualitative tools that focus on user behavior and product experience (PX).
Quantitative, data-driven A/B testing tools
Quantitative A/B testing tools focus on hard data and statistically relevant insights about users' actions during an A/B test.
These types of tools are most useful to gather data points in measurable, numerical form—like the number of users who used your product, app, or website; the percentage of users who converted into leads or sales; or the number of times a button was clicked.
Quantitative A/B testing tools track how users use your product in each test variant. After testing enough users to have a statistically significant sample size*, you’ll know which version performs better and should be rolled out to the rest of your audience.
*Note: in the context of A/B testing experiments, statistical significance is used to measure the level of uncertainty in your data. Simply put, it indicates how unlikely it is that the difference between your experiment’s control version and test version is due to error or random chance. For example, if a result is significant at the 99% confidence level, a difference that large would show up less than 1% of the time if the two versions actually performed the same.
As a PM, statistical significance gives you the confidence to make changes based on the results of experiments that you’re running and know that these product changes will have a positive impact.
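To see what a significance threshold protects you from, the sketch below simulates A/A tests, where both groups get the identical version, and counts how often random chance alone produces a "significant" difference. The 10% true conversion rate, group size, and number of runs are arbitrary assumptions chosen to keep the simulation fast.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, conv_b, users):
    """Two-sided p-value from a pooled two-proportion z-test (equal group sizes)."""
    pooled = (conv_a + conv_b) / (2 * users)
    std_err = sqrt(pooled * (1 - pooled) * 2 / users)
    if std_err == 0:
        return 1.0  # no variation at all in either group
    z = (conv_b - conv_a) / users / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa_p_values(true_rate=0.10, users=500, runs=1000, seed=7):
    """Run many tests where A and B are identical and collect the p-values."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        conv_a = sum(rng.random() < true_rate for _ in range(users))
        conv_b = sum(rng.random() < true_rate for _ in range(users))
        results.append(p_value(conv_a, conv_b, users))
    return results


p_values = simulate_aa_p_values()
shares = {
    alpha: sum(p < alpha for p in p_values) / len(p_values)
    for alpha in (0.05, 0.01)
}
for alpha, share in shares.items():
    print(f"threshold {alpha:.0%}: {share:.1%} of identical tests looked significant")
```

The share of false alarms should land near the threshold you pick, which is why a stricter threshold (99% rather than 95%) means fewer changes shipped on the back of random noise.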
Here are three A/B testing tools that will help you gather quantitative user data to build a better product:
1. Google Optimize
What it is: Google Optimize is the best free option for your A/B testing needs—if you have engineers on board who know how to set up advanced tracking. Its key features include A/B/n testing for multiple versions of a page, multivariate testing for multiple elements on the same page, and server-side experiments.
How to use it: use Google Analytics to identify elements of your product that need improvement and develop hypotheses. Then, use Google Optimize to test changes and see if they make a difference to the metric you’re analyzing. Once you’ve analyzed your test results, launch the winning variation directly from Google Optimize.
Used together with Google Analytics, this combo can be one of the best data-collecting and A/B testing solutions around, and you can get just about everything you need for free. However, to get the best out of it, you need to truly know the platform, which might be a more technical process than you’re looking for.
🔥 If you're using Hotjar
Hotjar’s integration with Google Optimize makes it easy to explore and compare differences in how your users experience and interact with your test variants. The integration workflow is ideal for empathizing with your users: find out how easily they can complete their objectives, and draw better conclusions about your tests.
2. VWO
What it is: VWO is a conversion optimization platform with a comprehensive suite of CRO tools, which includes advanced A/B testing capabilities. It’s known for its extensive testing tools for A/B, multivariate, and split URL experiments with behavioral segmentation, and an asynchronous code that reduces loading times while tests are running.
How to use it: use VWO’s behavioral segmentation feature to test complex interactions with your users. Add 'visitor behavior' as a condition for your visitor segments to execute a test when your visitors perform an action like clicking the Product Demo button, or after scrolling 50% of the page. You can even test dynamic landing pages that highlight more relevant content to users by interpreting their intention in real-time using behavioral targeting.
Your dev team is bound to love VWO’s asynchronous coding, which improves A/B testing performance to avoid slow loading times and flickering, which are common with a lot of A/B testing software.
3. Omniconvert
What it is: Omniconvert is a CRO tool for developers, startups, and ecommerce businesses that includes specialist A/B testing features with advanced segmentation, exit-intent overlays, and API access.
How to use it: run unlimited tests with specialist features to see how your users interact with your product. Use the unlimited CSS and JS editor for complete control over variation code. Your dev team can optimize their code to keep loading times as fast as possible while tests are running.
Omniconvert's conversion optimization platform, Explore—which they call “the CRO tool for developers”—also comes with a CDN cache bypass, which means you can ensure your tests run live immediately, instead of some users seeing older, cached versions of your site.
🔥 If you're using Hotjar
Use rage clicks as a filter in Session Recordings to pinpoint the moments when a user repeatedly clicks on an element or area of your site, which can help you identify blockers or friction points in the user experience. Use what you learn to prioritize changes to optimize UX and improve the customer experience.
Bonus reading: learn how to analyze rage clicks to understand customer behavior and improve the user experience.
Qualitative, behavioral A/B testing tools
Qualitative A/B testing tools rely on user experience, user behavior, and product experience insights to explain why users take specific actions.
These tools can and should be used alongside quantitative A/B testing tools to expand the potential of what A/B testing should deliver: real value for your product.
Qualitative tools are most useful when you need to achieve advanced insights from A/B testing. Say one of your test variations shows a higher bounce rate for a product page. Your first instinct may be to assume that the images you’re using aren’t very interesting—when in fact, the bounce rate could be high for an entirely different reason. Maybe your tracking isn’t working well, the page has fewer internal links, or the headline is confusing. So how do you figure it out? By asking your users for feedback.
Product experience (PX) insights tools can be extremely valuable for product teams because they help you understand why and how some aspects of the user experience impact user behavior. They help you explore more in-depth ideas, learn about common pain points of your buyers, or which product features are most interesting to them. The answers describe the frustrations and desires of your target audience.
When it comes to getting the full picture, quantitative and qualitative A/B testing tools should work in harmony: quantitative data answers the what, and qualitative data tells you why.
By collecting and analyzing qualitative data, you can add an instant boost to your A/B testing workflow:
Update your product without negatively impacting numbers: test entirely new site or app designs before you launch them, without any performance impact to your customers.
Streamline your product by measuring feature usage: figure out how many users are really using a particular feature and how that affects their behavior overall, especially new users.
See why experiment results don’t translate directly to real-world impact: understanding your users' reasoning can help you see how different your experiment setup (or sample) is, compared to the real world (or population).
Here are three qualitative A/B testing tools to help you connect the dots between what's happening and why:
1. Hotjar
What it is: Hotjar gives you product experience insights that help you empathize with and understand your users. Get a complete picture of the user journey, understand UX pain points, and identify solutions you need to implement.
How to use it: A/B tests help you quickly identify what your users like and dislike, but they lack context—they tell you what does or doesn't work, but not why. Fill in the gaps with qualitative product experience insights, which let you hear directly from your users so you can completely understand how they experience your product.
Finding that best-performing variation after an A/B test can be such a rush. Before you go out celebrating, you’ll want to figure out a couple of last (but very important) things: why does one version work better than the other? What can you learn from failed variations, and how can you replicate the success of the winner?
A/B testing is essential in product management because it challenges assumptions and helps PMs make decisions based on data, rather than on their gut. But even the best-designed A/B tests can’t tell you why one variation outperformed another. That's where Hotjar comes in.
Hotjar helps you generate the qualitative insights that A/B testing alone can’t provide:
1) Use Hotjar’s Session Recordings and Heatmaps to visualize how users react to both A/B test variations of a specific page, feature, or iteration. Analyze the winning and losing versions and look for clues: what are visitors drawn to? What confuses them? What’s the final thing they do before they convert—or leave?
2) Hotjar’s Surveys take the guesswork out of understanding user behavior during A/B testing by asking questions and getting answers directly from users, in their own words. Find out exactly what they need from or think about each variation, then compare answers, look for themes, and identify the reasons behind the success of one variation over another.
3) Need more on-the-spot feedback? Hotjar’s Incoming Feedback widget gives you instant visual feedback from real users, so you get a detailed picture of how they feel about your product. They can rate their experience on a scale, provide context and details, and even screenshot a specific page element before and after you make a change.
2. Usersnap
What it is: Usersnap is a versatile user feedback platform that offers a single hub for customer requests, bug screenshots, and user experience ratings.
How to use it: Usersnap turns customer feedback into business opportunities. Its clear bug reports help you resolve issues faster, and screen recordings and screenshots make it easy to see what goes wrong for users. You can also follow up and improve iteratively. Collect visitor feedback by asking your visitors to:
Fill out an in-app form
Rate their experience
Report an issue by screenshotting a specific part of the page
A feedback tool like Usersnap helps you set up and streamline the process of filing and evaluating ideas or issues. With this evidence, you can shape your product roadmap based on clear customer facts.
3. Crazy Egg
What it is: Crazy Egg is a testing platform designed to help businesses track, evaluate, and improve their conversion rates using A/B testing, heatmaps, and website recordings.
How to use it: use Crazy Egg's A/B testing tool to test ideas and variants for updating your website (like variations of your homepage, landing pages, and product pages) against each other. This will help you find the winner—the one that results in higher conversion rates.
You can use Crazy Egg's recordings tool to watch real visitors interacting with test variations, so you can understand what makes them take action or become frustrated. You won’t be able to collect direct user and customer feedback like you can with Hotjar's Feedback Surveys, but you can review recordings of sessions where visitors experienced issues and pain points and use that knowledge to iterate on future variations.
Note: for a detailed side-by-side comparison of the tools—so you can make the best choice for your product team—check out our handy Hotjar vs. Crazy Egg page.
5 common A/B testing challenges and the tools to overcome them
A/B testing can be a huge advantage for product teams: it helps you direct your efforts to the most valuable elements by pinpointing exact problem areas. But here are five challenges you may experience in your A/B testing workflow and how to solve them:
1. Deciding what and where to test
Problem: having too many options makes it challenging to know where to start testing. Just because some minor changes are fast and easy to implement doesn’t mean they’ll also be significant in terms of business goals.
Solution: your existing user data can help you find problem areas. These could be product features, pages, or assets with low conversion or high drop-off rates.
If you aren’t collecting and analyzing data in the right places, you’re missing out on experimentation insight from your users. Sure, you could base A/B testing on gut instinct and opinion, but tests based on objective data and in-depth hypotheses are more likely to produce valid results.
Overcome the challenge of not knowing what to test by using data from tools like Google Analytics: look at elements that are having the most (or least) impact on your product conversion rates, or pages with the highest (or lowest) amounts of traffic, then test variations of those elements or pages to improve results.
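As a sketch of how existing data can point to a starting place, the snippet below ranks the steps of a funnel by drop-off rate. The step names and user counts are hypothetical; in practice they would come from an export of your analytics tool.

```python
# Hypothetical funnel: users reaching each step of an onboarding flow.
funnel = [
    ("Landing page", 10_000),
    ("Sign-up form", 4_200),
    ("Email verified", 3_600),
    ("First project created", 1_800),
    ("Invited a teammate", 1_100),
]


def drop_offs(steps):
    """Return (from_step, to_step, drop_off_rate) for each transition."""
    transitions = []
    for (name, users), (next_name, next_users) in zip(steps, steps[1:]):
        transitions.append((name, next_name, 1 - next_users / users))
    return transitions


# Biggest drop-off first: these transitions are candidates for the first tests.
ranked = sorted(drop_offs(funnel), key=lambda t: t[2], reverse=True)
for from_step, to_step, rate in ranked:
    print(f"{from_step} -> {to_step}: {rate:.0%} drop-off")
```

A high drop-off rate only tells you where users leave, so pair the ranking with traffic volume and business value before committing to a test.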
Benefit: product development is about much more than random changes. It’s about doing the user research to understand what your customers need, what makes them hesitate, and what they think about your product and site experience. Once you have those answers, you’ll have a better idea of what to test.
2. Knowing what to test first
Problem: in A/B testing environments, teams usually test twice as many things as they end up shipping. That's time-consuming and can lead to internal frustrations.
Solution: the best thing a product manager can do for their team during A/B testing is to remove ambiguity and help clear the path for the team to work.
If you want to push your product and work on ideas that are so creative and challenging that many of them will fail, the best thing you can do for your team’s velocity is to validate assumptions earlier by building out just enough to validate the next step, instead of building out the entire user experience.
To maintain development process speed, use tools like Hotjar’s Session Recordings to identify opportunities for testing. Where are your users clicking? Which sections are they spending a lot of time on? What’s getting them stuck, confused, and frustrated? What are they skipping over?
Use recordings insights to develop hypotheses and run A/B tests that will help you make data-driven decisions and showcase the effectiveness of your work.
Benefit: knowing what to test first means more tests will deliver money-making improvements, and you’ll make the highest-value changes first. Effective measurement in these areas will also help your team avoid burnout.
3. Locking in on sample size
Problem: chasing quick results and calling a test before it has collected enough data can lead to statistically unreliable results and a failed A/B test.
Solution: run the test over a pre-set time scale. Don’t cut it short if you see results earlier than planned; the right time to end an experiment ultimately depends on your product and traffic.
Use an A/B test sample size calculator to define your ideal sample size. Optimizely's calculator, for example, lets you calculate the sample size you'll need, on average, for each variation in your test to measure the desired change in your conversion rate.
Learn more about how you can use Hotjar with Optimizely to gain a better understanding of your A/B test.
Benefit: to get results that will apply to your audience as a whole, stick to a statistically relevant sample size. The sample size needed usually depends on the change you expect to see. If you hypothesize a more substantial change, you’ll need fewer users to confirm it.
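The link between expected change and sample size can be sketched with the standard fixed-horizon formula for comparing two proportions. The 5% baseline rate, the 5% significance level, and 80% power are assumptions for illustration, and this is not necessarily the calculation a given vendor's calculator performs.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Users needed per variation to detect a relative lift (two-sided test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    pooled = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * pooled * (1 - pooled))
        + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical baseline: 5% conversion rate.
sizes = {}
for lift in (0.05, 0.10, 0.20):
    sizes[lift] = sample_size_per_variation(baseline_rate=0.05, relative_lift=lift)
    print(f"{lift:.0%} relative lift: about {sizes[lift]:,} users per variation")
```

Because the required sample shrinks with the square of the expected difference, halving the lift you want to detect roughly quadruples the number of users you need.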
4. Correctly implementing A/B tests
Problem: correctly implementing A/B tests can be tricky, especially for non-tech PMs, and can lead to issues like flashing of original content, misalignment, overwriting with code and integrations, and much more.
Solution: use a tool that specializes in A/B testing to implement your tests. Then, use a different tool to check the implementation.
For example, Omniconvert lets you run unlimited A/B tests at any stage in the development process so you can learn how users interact with your product—including design, calls to action (CTA), and text. You can segment your audience based on behavior, traffic source, location, UTM parameters, and more.
Then, use Hotjar to make sure you have the correct implementation and can deliver a clean A/B test without any loopholes. To do that, just look at Hotjar’s Session Recordings and Heatmaps on the pages or elements you’re testing.
Benefit: using the right combination of tools during your A/B test can help you avoid running ineffective tests that waste time and money.
5. Analyzing test results
Problem: A/B testing provides a lot of quantitative data, but may not reveal the reasons why your users behave the way they do.
Solution: get more from your A/B test results with qualitative insights.
Use an A/B testing tool with a rich reporting feature like Convert to analyze your data and compare KPIs like CTR, AOV, RPV, and ROI. Once your experiment is done, you can easily slice and dice your test data by identifying audience segments like new vs returning users, browsers and devices used, campaigns clicked, and resident countries.
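If your tool lets you export raw results, slicing by segment can be as simple as the sketch below. The record layout (variant, segment, converted) and the data are hypothetical and far too small to draw conclusions from; they only show the shape of the calculation.

```python
from collections import defaultdict

# Hypothetical per-user records exported from a testing tool:
# (variant, segment, converted)
records = [
    ("A", "new", True), ("A", "new", False), ("A", "new", False), ("A", "new", False),
    ("A", "returning", True), ("A", "returning", True), ("A", "returning", False),
    ("B", "new", True), ("B", "new", True), ("B", "new", False), ("B", "new", False),
    ("B", "returning", True), ("B", "returning", False), ("B", "returning", False),
]


def conversion_by_segment(rows):
    """Return {(segment, variant): conversion_rate}."""
    totals = defaultdict(lambda: [0, 0])  # [conversions, users]
    for variant, segment, converted in rows:
        totals[(segment, variant)][0] += int(converted)
        totals[(segment, variant)][1] += 1
    return {key: conv / users for key, (conv, users) in totals.items()}


rates = conversion_by_segment(records)
for (segment, variant), rate in sorted(rates.items()):
    print(f"{segment:<10} variant {variant}: {rate:.0%}")
```

Each segment contains fewer users than the full test, so a difference that shows up in one slice needs its own significance check before you act on it.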
Next, use a tool like Hotjar’s Incoming Feedback widget to combine those metrics with qualitative feedback, so you can connect the dots and understand why your users behave the way they do.
Ask users about their experience with the winning variation: what did they like about it? What would they want to improve? This removes the guesswork, helps you better understand why one variation outperformed another, and will help with further iterations of your product.
Keep in mind that on your A/B testing journey, you’ll see both favorable and less favorable results, and both successful and failed tests have something to teach you:
Successful tests: say more than one of your tests gets statistically significant positive results, and you decide to deploy them. What now? Interpreting test results after they conclude is crucial to understanding why the test succeeded. The fundamental question to ask is why: why did users behave the way they did? Why did they react a certain way with one version and not with the others? What user insights did you gather, and how can you use them?
Failed tests: most people have a hard time dealing with failure. But just because a test failed doesn’t mean you should ignore it; in fact, it’s the exact opposite. Study the data gathered during the test to understand what made this particular A/B test fail and how you can avoid that during future ones. A failed test is only wasted if you fail to learn from it.
Benefit: poor interpretation of results can lead to bad decisions and negatively affect other developments you integrate this data into. Combining quantitative data with qualitative feedback lets you get more out of your A/B tests and build a better product.
🔥 If you're using Hotjar
Take the optimization of your product a step further by combining quantitative and qualitative A/B testing with Convert’s Hotjar integration. Build on the insights mined by Convert and generate bigger conversion lifts by validating ideas through Convert Experiences.