Modea Data Analyst Challenge
Modea’s website is a key tool in its marketing efforts, both for acquiring and for engaging potential customers. To determine how the company should invest its website marketing efforts, the following questions were addressed:
● Which channels were best at acquiring new visitors?
● Which channels were best at generating engagement?
● How should Modea invest its digital marketing efforts in 2021?
Analysis
To perform the data analysis, I used Microsoft Excel, R and Datawrapper, an online tool for making custom, interactive data visualizations.
New User Acquisition
In Excel, I used pivot tables to determine which website channels were generating the most new users. After aggregating 2019 and 2020 data, I used stacked bar charts to determine which channels were most effective.
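The pivot-table step can be sketched in code as well. The following is a minimal Python illustration, not the Excel workbook used in the analysis; the row layout, channel names and numbers are hypothetical and exist only to show the aggregation.

```python
from collections import defaultdict

# Hypothetical rows: (year, channel, new_users). Values are invented for illustration.
rows = [
    (2019, "organic", 120),
    (2019, "email", 40),
    (2020, "organic", 150),
    (2020, "email", 60),
    (2020, "social", 10),
]

def pivot_new_users(rows):
    """Sum new users per channel, split by year (a pivot-table equivalent)."""
    table = defaultdict(lambda: defaultdict(int))
    for year, channel, new_users in rows:
        table[channel][year] += new_users
    return {channel: dict(by_year) for channel, by_year in table.items()}

def rank_by_total(pivot):
    """Order channels by total new users across all years, highest first."""
    return sorted(pivot, key=lambda ch: sum(pivot[ch].values()), reverse=True)

pivot = pivot_new_users(rows)
ordered = rank_by_total(pivot)
```

Each channel's per-year totals correspond to the segments of one stacked bar, and the ordering gives the channels from most to least effective at acquisition.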
Engagement
In R, I used dplyr for data filtering and manipulation and ggplot2 for visualization. The data already measured sessions, bounces, session duration and conversions. I added metrics for pages per session, bounce rate and conversion rates.
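As a rough illustration of the added metrics, the sketch below derives them from the raw counts. It is written in Python rather than the R/dplyr code used in the analysis, and the formulas are the common web analytics definitions (each metric divided by sessions), which is an assumption since the exact definitions are not spelled out here. The field names are hypothetical.

```python
def add_engagement_metrics(row):
    """Return a copy of a channel row with derived engagement metrics.

    Assumed definitions (not confirmed by the report):
      pages per session = page views / sessions
      bounce rate       = bounces / sessions
      conversion rate   = conversions / sessions
    """
    sessions = row["sessions"]
    out = dict(row)
    if sessions:
        out["pages_per_session"] = row["page_views"] / sessions
        out["bounce_rate"] = row["bounces"] / sessions
        out["conversion_rate"] = row["conversions"] / sessions
    else:
        # A channel with no sessions has no meaningful rates; use zeros.
        out["pages_per_session"] = 0.0
        out["bounce_rate"] = 0.0
        out["conversion_rate"] = 0.0
    return out

# Hypothetical channel row, invented for illustration.
example = add_engagement_metrics(
    {"channel": "email", "page_views": 300, "sessions": 100,
     "bounces": 40, "conversions": 5}
)
```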
I then ranked the channels by metric. For example, the channel with the most page views earns the highest rank and the channel with the fewest page views earns the lowest rank.
All of those rankings combined were used to generate a ranking score, which is the model I built to determine which channels had the highest engagement.
I wanted to build a model that put equal weight on all of the important metrics of engagement. Most statistical models are based on how one or many variables impact a single metric, for example how location, square footage and yard size can impact housing costs. But all metrics of engagement are directly related to one another, so those models don’t give the full picture.
By using a ranking system for each channel based on page views, sessions, pages per session, bounce rate, average session duration, case study conversions and contact form conversions, we can use all these metrics to determine how effective each channel’s engagement is.
Engagement Score = (Rank(page views by channel) + Rank(sessions by channel) + … ) ÷ Number of metrics in calculation
Consider the following example:
Engagement Score for organic medium = (#1 page views + #1 total sessions + #1 pages per session + #6 bounce rate + #3 average session duration + #3 case conversion rate + #3 contact conversion rate) ÷ Number of metrics (7) = 0.843
In this example, the organic medium is pretty good at generating engagement based on how it ranks among other mediums for each metric. (Engagement Scores closer to zero are least effective at generating engagement while those closer to one are best.)
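Because the score falls between zero and one, each rank has to be scaled to that range before averaging. The text does not state how that scaling was done, so the Python sketch below assumes one common choice: rank 1 maps to 1.0 and the last rank maps to 0.0. The medium count of 11 is hypothetical, chosen only because it reproduces the 0.843 in the example under this scaling. How bounce rate was ranked is also not stated, so the direction is left as a parameter.

```python
def competition_rank(values, higher_is_better=True):
    """Rank values so that 1 is best; tied values share the best rank."""
    ordered = sorted(values, reverse=higher_is_better)
    return [ordered.index(v) + 1 for v in values]

def scaled(rank, n):
    """Map rank 1..n onto 1.0..0.0 (assumed scaling, not stated in the report)."""
    return (n - rank) / (n - 1) if n > 1 else 1.0

def engagement_score(ranks, n):
    """Average the scaled ranks with equal weight on every metric."""
    return sum(scaled(r, n) for r in ranks) / len(ranks)

# Ranks from the organic medium example: page views, sessions, pages per
# session, bounce rate, session duration, case conversions, contact conversions.
organic_ranks = [1, 1, 1, 6, 3, 3, 3]

# n=11 is a hypothetical number of mediums, not a figure from the report.
score = engagement_score(organic_ranks, n=11)
rounded = round(score, 3)  # 0.843
```

For a metric where lower is better, such as bounce rate, `competition_rank(values, higher_is_better=False)` would give rank 1 to the smallest value.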
Each channel was assigned an Engagement Score and visualized in a horizontal bar chart using Datawrapper.
When looking at how a particular channel performed in terms of generating and keeping users, there isn’t one metric that outweighs the others. Each step in the process of a conversion relies on the steps before and after.
It’s important for a client to know how many users are coming to the website, how deep they are navigating it, what is causing them to leave, how long they stayed and what led to a conversion. In Modea’s case, different metrics may have more weight depending on which audience it is trying to grow. For potential clients, conversions may be the most important metric. On the other hand, decreasing bounce rate or increasing page views may be more important if it is looking to increase brand recognition or attract future employees. Different types of users will bring very different engagement to a website.
Recommendations & Afterword
I believe the best decisions are rooted in data. Simply put, it makes sense to invest in what is working. It also makes sense to divest from what isn’t working.
Invest in search engines and email campaigns. Search engines are among the top sources, and organic search is the top medium. They are also among the top five for generating new users. SEO can often define a company’s brand and who knows about it, and it has a large impact on site engagement. This is the case across all user demographics. Email campaigns rank high in both engagement and new user acquisition. Email is highly effective among business professionals, who are a key audience for Modea when it comes to growing its client base.
Divest from sources, campaigns and mediums that are not effective. Thirty-nine sources are tied for lowest engagement. These have hardly any page views, sessions, duration, etc. Unless there is a plan to reinvent how these channels are being used, it probably isn’t worth putting any more resources toward them.
There is a lot of room for growth in social media and email campaigns. Social media is low-ranking in engagement among mediums. Modea posts 1-2 times a month across its social media platforms. Most studies agree businesses need to post at least 2-3 times a week to successfully scale on those platforms. If Modea were to increase its posting frequency, engagement from social media would likely improve.
While some email campaigns are great at acquiring new users and driving high engagement, others have the opposite effect. It’s worth diving in to see what the differences are between top-performing and low-performing campaigns. By identifying these differences, Modea can tweak campaigns to match the right audience and tone to increase engagement.
Challenges
The data collected didn’t show any clear outliers under quartile, Z-score, linear regression or proximity-based methods. This doesn’t mean there are no outliers, but it poses a problem when there is no way to detect them other than sifting through rows of data.
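Two of the checks named above, the quartile rule and the Z-score rule, can be sketched briefly. This is a Python illustration with invented values, not the code or data from the analysis; the 1.5 × IQR fence and the Z-score thresholds are conventional defaults assumed here.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside the quartile fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, threshold=3.0):
    """Flag values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []
    return [v for v in values if abs(v - mean) / sd > threshold]

# Hypothetical session counts for nine channels, invented for illustration.
sessions = [10, 12, 11, 13, 12, 11, 10, 12, 200]

by_quartile = iqr_outliers(sessions)
by_zscore_strict = zscore_outliers(sessions, threshold=3.0)
by_zscore_loose = zscore_outliers(sessions, threshold=2.0)
```

With so few values, the quartile rule flags 200 while the Z-score rule at a threshold of 3 does not, because a single extreme value also inflates the standard deviation. This is one way different methods can disagree on small samples.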
As noted above, most analysis models are created to find how one or more variables affect a single outcome. By using an engagement score, there is concern that some of the included variables are more strongly correlated with one another than others. For example, someone who visits the website and has a longer session duration likely also has more page views. This could mean that session duration and page views are, for lack of better wording, “double-weighted” in comparison to sessions, which is impacted only by new users. Since the Engagement Score for each channel was based on giving equal weight to each metric, it makes the basic assumption that each is equally important or has an equal effect on the outcome. With deeper client consulting, the model would be adapted to weight each metric more or less based on the client’s goals.
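Both ideas in this paragraph can be sketched in Python: measuring how strongly two metrics move together, and replacing the equal-weight average with a weighted one. The metric names, values and weights below are hypothetical, and the weighted score is one possible adaptation of the model, not something that was built for this analysis.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant metric has no defined correlation
    return cov / (spread_x * spread_y)

def weighted_score(scaled_ranks, weights):
    """Weighted average of per-metric scores that are already scaled to 0..1."""
    total_weight = sum(weights[metric] for metric in scaled_ranks)
    weighted_sum = sum(scaled_ranks[m] * weights[m] for m in scaled_ranks)
    return weighted_sum / total_weight

# Hypothetical per-channel values: duration and page views rising together.
duration = [30.0, 60.0, 90.0, 120.0]
page_views = [100.0, 200.0, 300.0, 400.0]
r = pearson(duration, page_views)

# Hypothetical weights for a client who cares most about conversions.
channel_scores = {"conversions": 1.0, "bounce_rate": 0.0}
client_weights = {"conversions": 3, "bounce_rate": 1}
score = weighted_score(channel_scores, client_weights)
```

A correlation near 1 between two metrics would suggest down-weighting one of them, while client goals would guide which metrics receive larger weights.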