Introduction
Robyn is Meta’s open-source Marketing Mix Modelling (MMM) solution. It analyzes how different factors, such as your ad spend, promotions, or seasonality, affect your key performance metric (for example, sales). It does this using a statistical method called regression, which produces a formula that estimates how much each factor contributes to your results.
With the right data as input, Robyn will help you answer questions such as:
- How much sales does each media channel (online and offline) drive?
- What is the optimal spend for each marketing channel?
- What was the ROI of each marketing channel?
In this post I will dive deeper into how Meta’s Robyn really works by answering questions I often get from marketeers. This will help you get a better understanding of how you can use Robyn for your MMM projects, and where it may fall short for your specific use case. For the full technical documentation, see Meta’s guide to Robyn.
How does an MMM actually work?
All marketing mix models, Robyn included, follow roughly the same flow of information. They combine three kinds of data: external factors, marketing data, and business outcome metrics. Once this data is collected, a statistical regression analysis is run, and its output tells you how effective each marketing channel is.
The exact MMM methods vary based on the data you have available, as well as the statistical MMM regression you run. Fully understanding the use case for your MMM in your business is thus key for choosing the right MMM solution for you.
How does Robyn determine what variables actually drive your sales?
During the statistical MMM regression analysis, Robyn models how your different marketing channels drive your sales. It does this in a two-step process: feature engineering and running a ridge regression.
During the feature engineering process, Robyn helps you transform the raw data into features that better represent the underlying problem. Three of Robyn’s feature engineering techniques raise the most questions: Prophet seasonality decomposition, adstock, and saturation.
What is Prophet seasonality decomposition?
Depending on your business, your sales and KPIs will likely be impacted by seasonality trends. Robyn uses Prophet, Meta’s open-source forecasting library, to model these time trends and include them in the final model. Typically you would have to collect this data yourself, but Robyn can decompose it for you into 4 components:
- Trend data (long-term and slowly evolving movement)
- Seasonality (repeating behavior that can be captured in a recurring cycle, usually yearly)
- Weekday (repeating behavior that can be captured within a week)
- Holiday/event (Holidays or events that highly impact your sales such as a big sale or promotion)
What is adstock?
Robyn accounts for the carryover effects of marketing activities. For instance, a TV ad doesn’t just drive sales on the day it airs; its impact may linger for days or weeks. Robyn uses an Adstock function to mathematically model this carryover. It assumes that the impact of a marketing activity decays over time but does not disappear entirely right away.
How fast the effect decays is controlled by the adstock parameters, and there are two functions to choose from: geometric and Weibull. The geometric function uses a single fixed decay rate (also called the retention rate), while the Weibull function uses a shape and a scale parameter that let the decay rate change over time. I think the Weibull function is the better choice of the two, as its flexibility lets you model the decay of very different marketing channels, such as online and offline.
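To make the carryover idea concrete, here is a minimal sketch of a geometric adstock transformation. It assumes a single decay parameter `theta` between 0 and 1; the function name and the spend values are illustrative and are not Robyn's API.

```python
def geometric_adstock(spend, theta):
    """Carry a fraction `theta` of each period's adstocked value into the next period."""
    adstocked = []
    carryover = 0.0
    for x in spend:
        # Effect this period = new spend + decayed effect of all earlier spend.
        carryover = x + theta * carryover
        adstocked.append(carryover)
    return adstocked


# Hypothetical weekly TV spend: one burst in week 1, nothing afterwards.
weekly_spend = [100.0, 0.0, 0.0, 0.0]
print(geometric_adstock(weekly_spend, theta=0.5))
# → [100.0, 50.0, 25.0, 12.5]
```

With `theta=0.5`, half of the effect survives into each following week. A higher `theta` means a longer-lasting effect, which is what you would typically expect for channels like TV.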
What is saturation?
Robyn accounts for the fact that marketing channels can get saturated and not return the same results for each additional euro you spend. This phenomenon of saturated marketing channels is also known as diminishing returns – and Robyn incorporates this with a Hill function.
The concept of diminishing returns can be visualized as a saturation curve, where:
- The x-axis represents ad spend.
- The y-axis represents business outcomes (e.g., sales, leads).
- The curve starts steep, showing strong returns on initial investments, but flattens as spending increases, indicating diminishing returns.
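The saturation curve described above can be sketched with a simplified Hill function. This is an illustration under assumptions: `alpha` (curve shape) and `gamma` (the spend at which half of the maximum response is reached) are made-up values, and Robyn's own parameterization of the Hill function may differ in detail.

```python
def hill_saturation(spend, alpha, gamma):
    """Return the share of the maximum response (0 to 1) reached at a given spend.

    alpha controls the shape of the curve; gamma is the spend at which
    half of the maximum response is reached (illustrative values only).
    """
    if spend <= 0:
        return 0.0
    return spend**alpha / (spend**alpha + gamma**alpha)


# Each doubling of spend buys a smaller increase in response.
for spend in [5_000, 10_000, 20_000, 40_000]:
    share = hill_saturation(spend, alpha=1.0, gamma=10_000)
    print(spend, round(share, 3))
```

In this example, going from 10,000 to 20,000 in spend lifts the response from 50% to about 67% of its maximum, while the next 20,000 only lifts it to 80%.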
How can Robyn be used for both digital and traditional marketing channels?
Robyn can model the effects of both your offline and online marketing on business metrics such as sales, because its input data consists mostly of your advertising spend per channel. Whereas marketing attribution solutions (such as Google Analytics) attribute conversions based on online tracking data, Robyn uses advertising spend over a longer time period to model how that spend influences your sales.
The benefit of using this methodology is that you can have both digital and traditional marketing channels – such as television, direct mailings, or radio – in your model. Robyn and all other MMM methodologies are also more privacy-friendly because they don’t rely on tracking people’s online behaviour.
How do I interpret Robyn’s recommendations for budget allocation?
Robyn doesn’t stop at analyzing historical performance. It takes the next step by delivering actionable insights for future planning. Its simulation and recommendation features give you the tools to make data-driven decisions that optimize your marketing strategy. Here’s how Robyn bridges the gap between analysis and action:
What-If Scenario Simulations
Robyn enables marketers to test different budget allocation scenarios and predict potential outcomes. This helps answer strategic questions like:
- What would happen if I increase my TV ad spend by 20% while reducing social media spend by 10%?
- How might reallocating more budget to digital ads during the holiday season affect sales?
By running these simulations, Robyn provides marketers with a clear understanding of how changes in spending might influence results, helping them anticipate the impact of their decisions.
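A what-if comparison of this kind can be sketched as follows. Everything here is hypothetical: the channel names, the response-curve parameters, and the helper functions are invented for illustration and do not represent real Robyn output or its API.

```python
def channel_response(spend, max_response, alpha, gamma):
    """Hill-shaped response: sales attributed to a channel at a given spend."""
    if spend <= 0:
        return 0.0
    return max_response * spend**alpha / (spend**alpha + gamma**alpha)


# Hypothetical fitted response curves per channel (not real model output).
curves = {
    "tv": {"max_response": 50_000, "alpha": 1.0, "gamma": 20_000},
    "social": {"max_response": 30_000, "alpha": 1.0, "gamma": 5_000},
}


def total_response(spend_by_channel):
    """Sum the predicted response over all channels for one spend plan."""
    return sum(
        channel_response(spend, **curves[channel])
        for channel, spend in spend_by_channel.items()
    )


baseline = {"tv": 10_000, "social": 10_000}
# Scenario: +20% TV spend, -10% social media spend.
scenario = {"tv": baseline["tv"] * 1.2, "social": baseline["social"] * 0.9}

print("baseline:", round(total_response(baseline)))
print("scenario:", round(total_response(scenario)))
```

With these made-up curves the scenario predicts more sales than the baseline, because social is already closer to saturation than TV.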
Data-Driven Recommendations
Beyond simulations, Robyn actively generates tailored budget recommendations designed to maximize return on investment (ROI). These suggestions focus on:
- Channel Effectiveness: Identifying and prioritizing channels with the highest ROI potential based on historical data.
- Optimal Spend Levels: Avoiding diminishing returns in oversaturated channels while directing resources toward underutilized opportunities.
- Strategic Balance: Aligning budget distribution with broader marketing objectives, whether the goal is boosting brand awareness, driving lead generation, or increasing sales.
By combining scenario testing with actionable recommendations, Robyn empowers marketers to optimize their budgets effectively and make confident decisions grounded in data.
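The intuition behind such a recommendation is that each extra euro should go to the channel where it currently yields the most. The sketch below uses a simple greedy loop to show this. It is a simplified stand-in and not Robyn's allocator, and the channels, curve parameters, and function names are all hypothetical.

```python
def channel_response(spend, max_response, alpha, gamma):
    """Hill-shaped response: sales attributed to a channel at a given spend."""
    if spend <= 0:
        return 0.0
    return max_response * spend**alpha / (spend**alpha + gamma**alpha)


# Hypothetical fitted response curves per channel (not real model output).
curves = {
    "tv": {"max_response": 50_000, "alpha": 1.0, "gamma": 20_000},
    "social": {"max_response": 30_000, "alpha": 1.0, "gamma": 5_000},
}


def marginal_gain(channel, current_spend, step):
    """Extra response from spending `step` more on a channel."""
    params = curves[channel]
    return (channel_response(current_spend + step, **params)
            - channel_response(current_spend, **params))


def allocate_budget(total_budget, step=100):
    """Greedily hand out the budget in steps to the channel with the highest marginal gain."""
    allocation = {channel: 0 for channel in curves}
    remaining = total_budget
    while remaining >= step:
        best = max(curves, key=lambda ch: marginal_gain(ch, allocation[ch], step))
        allocation[best] += step
        remaining -= step
    return allocation


print(allocate_budget(20_000))
```

The greedy approach only works well here because these example curves show diminishing returns from the first euro. For S-shaped curves a proper constrained optimizer is the safer choice.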
How does Robyn ensure the model’s results are robust and not overfitted?
One major downside of marketing mix modelling is that the model is based on historical data, so it may overfit to your past results and fail to forecast future results accurately. To prevent this, Robyn uses ridge regression and does 3 things:
First, Robyn fits the regression by minimizing the residual sum of squares (the squared difference between actual and predicted outcomes), so that the model captures the relationships in the data as accurately as possible.
Second, Robyn avoids overfitting through regularization (the penalty term in ridge regression), which penalizes large coefficients and overly complex models so that they remain interpretable and stable.
Third, Robyn’s modelling technique doesn’t just aim to maximize one outcome (like R-squared). It uses multi-objective optimization to strike a balance between goals that can conflict with each other:
- Model Accuracy: How well the model predicts outcomes based on the data.
- Model Interpretability: Keeping the model simple and understandable for marketers and decision-makers.
- Robustness: Ensuring the model performs well with new or unseen data.
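To illustrate the regularization step, here is a minimal NumPy sketch of ridge regression on made-up data. It is not how Robyn is implemented (Robyn is an R package); the data, the variable names, and the `lam` penalty values are assumptions chosen only to show how the penalty shrinks coefficients.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X'X + lam * I)^-1 X'y (no intercept, for brevity)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


rng = np.random.default_rng(0)
# Hypothetical data: 104 weeks of two strongly correlated channels.
tv = rng.normal(size=104)
radio = tv + 0.05 * rng.normal(size=104)
X = np.column_stack([tv, radio])
y = 2.0 * tv + 1.0 * radio + rng.normal(size=104)

# A larger penalty pulls the coefficients towards zero and stabilizes them.
for lam in [0.0, 1.0, 10.0, 100.0]:
    print(lam, np.round(ridge_fit(X, y, lam), 2))
```

With `lam=0.0` this is ordinary least squares, which tends to produce unstable coefficients when channels move together, as TV and radio do here. Increasing `lam` trades a little accuracy on past data for more stable estimates.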
What type of data do I need to get started with Robyn?
Robyn starts from raw data from diverse sources. Depending on your business, this includes media spend, impressions, and business outcomes such as sales, leads, or other relevant performance metrics. The goal is to connect your marketing inputs (costs) with your desired business outcomes, so that you understand their relationship and the effectiveness of each channel.
External influences, such as seasonality, holidays, economic trends, weather patterns, or competitive activity, can significantly impact business outcomes. Robyn allows you to incorporate these factors into your model, enhancing its predictive power and ensuring it accounts for real-world complexities beyond just marketing spend.
To create a robust and reliable model, Robyn typically requires weekly or daily data spanning at least two years. This ensures enough historical variability for the model to identify meaningful patterns and relationships. Variability in data (e.g., changes in spend or seasonality) is critical for the model to discern the effects of different marketing channels accurately.
What are the downsides of using Robyn?
While Robyn offers a powerful and scalable approach to marketing mix modeling, it’s not without its limitations.
One key downside is that Robyn models coefficients as constant over time. This means the model assumes that the effectiveness of a given marketing channel or activity remains static. However, in the real world, marketing effectiveness can fluctuate due to various factors, such as:
- Saturation effects as campaigns mature.
- Changes in audience behavior over time.
- Shifts in external dynamics like competition or market conditions.
As a result, Robyn’s models may not fully capture the evolving nature of marketing effectiveness, which could lead to less precise recommendations, particularly for long-term strategies.
Also, Robyn’s performance is heavily reliant on having clean, accurate, and sufficient historical data. If your data lacks variability or spans a shorter time period than recommended, the model may produce less reliable results. This dependency on high-quality data can be a barrier for businesses that don’t have robust data collection and management processes in place.