Miro sends over 35 million transactional and marketing emails per month to over 70 million users. With its broad scope and reach, email is a good experimentation ground.
Why experiment:
- Solve user problems and improve their experience.
- Drive growth and business impact.
- Optimize content for diverse audience segments.
- Gain insights and learnings.
Why A/B test specifically:
- To remove confirmation and selection bias.
- To avoid seasonal discrepancies.
- To achieve statistical significance, which gives confidence in the results.
The project team for such experiments includes two marketing managers, a designer, a copywriter, an analyst, a front-end engineer, and a back-end engineer.
The process for A/B testing looks more like a circle than a straight line:
- It often starts with a reason to test — a problem to solve, a new idea, or an insight from a previous experiment.
- Hypothesize a solution.
- Build the variant.
- Run the test.
- Analyze the results, which often surfaces the next reason to test.
Miro has a diverse audience across team roles (product management, marketing, IT, etc.), plan types (free and paid), seniority levels (executives, managers, etc.), and use cases (process mapping, strategy & planning, etc.). Every one of these users receives a welcome email, which makes it a great candidate for A/B testing.
Welcome email tests are based on two segments: new Creators and new Collaborators. Creators create boards and invite Collaborators; they tend to have specific product use cases, while Collaborators use Miro because of the Creator’s use case.
Many Collaborators don’t do anything in Miro within 24 hours of registration. The Miro team decided to create a welcome email to encourage this segment’s engagement.
- The first experiment optimized CTA placement.
The hypothesis was that a higher CTA placement would increase clicks.
This experiment was successful: the variant produced an 8% increase in new Collaborator activity and a 2% CTR uplift, though new Creator activity decreased. The insight was that new Collaborators react positively to a clear CTA placed higher in the email design.
- The second experiment was conducted with the hero section.
The hypothesis was that a full-width hero design would increase clicks and user engagement on new Collaborators’ part. In the variant, the hero section was expanded and sprinkled with a confetti animation.
The experiment email variant performed worse than the control version with a regular hero. New Collaborator activity decreased by 6%, while new Creator activity actually increased. Every other metric, including CTR, increased. Although a failure, the experiment showed that the Creator and Collaborator segments behave very differently, so a split welcome flow might be effective.
- The third experiment tested a clearer welcome email structure.
The variant email was split into three steps, and the flow was designed differently for the Collaborator and Creator segments. The success criterion remained an increase in new Collaborator activity during their first session in Miro.
This experiment was a success: both Collaborator and Creator activity increased, along with a CTR uplift. The insight was that users responded positively to more personalized and simplified content.
How Miro builds these tests:
- The design system is stored in a self-serve CMS platform.
- 30+ design modules enable the team to build and test at scale without engineering support.
- The CMS is directly integrated with the company’s ESPs of choice.
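The article doesn’t show Miro’s actual CMS setup, but a minimal sketch of the modular idea might look like the following: emails are assembled from a library of pre-approved modules, so a control and a variant differ only in which modules are used and in what order. All module names and markup below are hypothetical.

```python
# Hypothetical module library; in practice these would live in the CMS.
MODULES = {
    "hero_standard": "<tr><td><!-- standard hero --></td></tr>",
    "hero_full_width": "<tr><td><!-- full-width hero --></td></tr>",
    "cta_top": "<tr><td><a href='{cta_url}'>Open your board</a></td></tr>",
    "footer": "<tr><td><!-- footer --></td></tr>",
}

def build_email(module_ids: list[str], **params: str) -> str:
    """Assemble an email body by concatenating pre-approved modules."""
    rows = [MODULES[m].format(**params) for m in module_ids]
    return "<table>" + "".join(rows) + "</table>"

# Control and variant differ only in module choice and order, not in new HTML.
control = build_email(["hero_standard", "cta_top", "footer"], cta_url="https://miro.com")
variant = build_email(["cta_top", "hero_full_width", "footer"], cta_url="https://miro.com")
```

Because variants are built from the same vetted modules, the team can launch new tests without waiting for engineering support.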
Occasionally, the team needs to test content that requires more customization. To decide whether such changes are worth engineering resources, the team assesses them with an effort vs. impact matrix.
User journey tests are set up with the ESP’s journey tool, which manages the user flow. The audience is randomized into a 50/50 split, which is then subdivided for each segment’s content.
To conduct transactional email tests, Miro uses Split.io to manage 50/50 audience splits. Split.io first receives the API payload from the notification email service, performs the split, and routes the payload back to that service. This enables real-time, inline testing within a single transactional template.
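Split.io handles the assignment in Miro’s setup; the sketch below only illustrates the general principle of a deterministic 50/50 split on a user identifier, so the same user always receives the same treatment within one experiment. The payload fields and experiment name are hypothetical, not Miro’s actual schema.

```python
import hashlib

def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically bucket a user into control/variant (50/50).

    Hashing user_id together with the experiment name keeps the assignment
    stable across emails while staying independent between experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "variant" if int(digest, 16) % 2 else "control"

def route_payload(payload: dict) -> dict:
    """Attach the treatment so the notification email service can render
    the matching block of a single transactional template."""
    payload["treatment"] = assign_variant(payload["user_id"], "welcome_email_test")
    return payload

# Hypothetical payload coming from the notification email service.
print(route_payload({"user_id": "u_123", "email": "user@example.com"}))
```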
Test duration depends on reaching statistical significance, which is essentially a function of sample size: a larger audience means a shorter run time, and a smaller audience means a longer one.
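The article doesn’t give Miro’s exact calculation, but a standard two-proportion sample-size estimate shows why audience size drives run time; the baseline rate, uplift, and thresholds below are purely illustrative.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(p_control: float, relative_uplift: float,
                         alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-variant sample size for detecting a relative uplift
    in a conversion-style metric with a two-sided z-test on proportions."""
    p_variant = p_control * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_variant) ** 2)

# Illustrative only: detecting a 5% relative uplift on a 10% baseline.
n = required_sample_size(p_control=0.10, relative_uplift=0.05)
print(n)  # required users per variant; daily send volume then sets the run time
```

Dividing the required sample per variant by the daily volume of eligible users gives a rough expected run time, which is why larger segments reach significance faster.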
When a test achieves statistical significance, the team closes it. Cohorts are analyzed based on their activity during the experiment, recommendations are made for next steps and hypotheses, and the experiment code is cleaned out of the template.
- For the fourth experiment, the team incorporated Miro board UI elements in the welcome email.
The variant email had a grid background and cursor designs on the head module. The success criterion was the same: an increase in new Collaborator activity during their first session in the product.
The results again diverged: new Collaborator activity decreased while new Creator activity actually increased. Overall CTR slightly decreased, while CTR on the hero CTA increased by 14%. The insight was that including Miro board elements in the welcome email motivates new Creators but creates friction for new Collaborators.
By that point, the welcome flow was already split, so the experimental version of the email was pushed live for new Creators while new Collaborators continued to receive the original version of that email.
During these four experiments, previous learnings directed further testing:
- The first one showed that users react both positively and negatively to design changes in the hero section.
- The second one foreshadowed the fourth, showing how each segment might respond to an over-the-top design.
- The third experiment proved that branching the line of testing between these distinct audience segments is effective.
After these tests, experimentation at Miro continues.
Since open rates and click rates are unreliable, it’s better to choose another metric for the email A/B test to improve; in Miro’s case, this was board activity.