If You Invest in Descriptive Analytics Today, You May Be Wasting Your Money
“All models are wrong; some are useful.”
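Anyone who has sat through an analytics maturity presentation has seen a version of this image: a staircase that climbs from descriptive analytics (“what happened”) through diagnostic analytics (“why did it happen”) to predictive analytics (“what will happen”).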
In certain contexts, this is a useful mental exercise. It tries to explain to an overly ambitious executive that before he throws his hat into the ring to compete with Facebook and others in artificial intelligence (AI) and machine learning (ML), he needs to get his house in order, generate those “TPS reports” and get the basic analytics right.
Today, though, this model is not only wrong but also harmful. Companies often delay building a valuable analytics platform because they believe it depends on having “prerequisites” in place first, such as universally agreed-upon metric definitions or the best sources of underlying measures. In reality, those prerequisites aren’t needed to create a platform.
Very simply, few enterprises ever manage to answer the descriptive analytics question (“what happened”) reliably. Calculating even the most basic common metrics across the organization is difficult because the effort is so significant and time-consuming. The majority of organizations do it only for a handful of KPIs, and many consulting firms earn their living by telling enterprises which KPIs to track.
Let’s take customer lifetime value (CLTV) as an example. This metric relates the expected revenue from a customer to the cost of service over the duration of the relationship with that customer. Unsurprisingly, it is one of the most hotly disputed metrics across organizations, from SaaS companies to retail chains. How do you allocate the shared service costs? Account for returns? Project customer churn? These and many other terms require a commonly accepted interpretation, which is what makes the question “what happened” so hard to answer. And this is not a challenge that will just go away.
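To see why the definition is so contested, consider a minimal sketch of one simplified way to compute CLTV. The formula (monthly margin multiplied by expected lifetime, with lifetime taken as 1 / monthly churn), the field names and all the numbers are illustrative assumptions, not a standard definition. Two teams looking at the same customer reach very different answers simply by treating shared costs, returns and churn differently.

```python
from dataclasses import dataclass


@dataclass
class CltvAssumptions:
    """Hypothetical inputs; every field is a choice a team has to agree on."""
    monthly_revenue: float       # expected gross revenue per month
    monthly_service_cost: float  # direct cost of serving the customer per month
    shared_cost_share: float     # allocated share of shared service costs per month
    return_rate: float           # fraction of revenue lost to returns (0..1)
    monthly_churn: float         # probability the customer leaves in a given month


def simple_cltv(a: CltvAssumptions) -> float:
    """One simplified CLTV: net monthly margin times expected lifetime in months."""
    if not 0 < a.monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")
    expected_lifetime_months = 1 / a.monthly_churn
    net_revenue = a.monthly_revenue * (1 - a.return_rate)
    monthly_margin = net_revenue - a.monthly_service_cost - a.shared_cost_share
    return monthly_margin * expected_lifetime_months


# Same customer, two made-up sets of assumptions.
finance_view = CltvAssumptions(
    monthly_revenue=100, monthly_service_cost=20,
    shared_cost_share=10, return_rate=0.05, monthly_churn=0.05,
)
marketing_view = CltvAssumptions(
    monthly_revenue=100, monthly_service_cost=20,
    shared_cost_share=0, return_rate=0.0, monthly_churn=0.04,
)

finance_cltv = simple_cltv(finance_view)
marketing_cltv = simple_cltv(marketing_view)
print(f"finance view:   {finance_cltv:.0f}")
print(f"marketing view: {marketing_cltv:.0f}")
```

With these made-up inputs, the same customer is worth about 1,300 under one set of assumptions and about 2,000 under the other, before anyone has disputed the formula itself.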
Don’t Ask Me “Why”
The diagnostic analytics question “why did it happen” is even harder to answer. Not only do you need to agree on “what happened” but you also need to establish a causal relationship between events, which is rarely possible outside of a scientific laboratory. Identifying this relationship requires the following:
- A very comprehensive data set on all possible facts that could have caused the outcome you seek to explain
- All these data points cleared of noise and organized in a way that will make the analysis possible (at a minimum, on the same timeline)
- A cause-and-effect model that would be both reasonably accurate AND time invariant, to avoid re-doing it every time something changes (because it will)
Doesn’t sound very promising? It’s not, as demonstrated by the 50-60 years organizations have spent trying to understand causality in areas such as marketing spend and consumer behavior. If you’ve ever sat in a meeting where a team is trying to get to the root cause of something, you already know how rarely those efforts pay off.
All Signs Point to “Yes”
Now, if you follow that famous maturity model, you might guess that “what will happen” is a harder question to answer than “why did it happen”, right? Wrong! In fact, ML and predictive analytics fare better than diagnostic analytics for a very simple reason: Predictive analytics deal with patterns in a reasonably stable environment, while diagnostic analytics are an attempt to explain an outcome often caused by changing conditions.
And stability is a very important feature. Let’s look at that same CLTV as a metric we want to predict: If the company’s delivery model, spectrum of services and clientele were evolving so quickly that they changed in a matter of weeks, a predictive model for CLTV projections would likely not perform well enough to be useful. However, as most businesses (even fast-growing ones!) are relatively stable across these dimensions, ML algorithms for predicting a given customer’s CLTV can be quite reliable.
Another equally important feature is that predictive analytics don’t really require understanding the root cause of the predicted outcome (i.e., a fixed mental model of the process they operate on). Nor do they require a general consensus on what the input events mean or a clear semantic understanding of the outcomes. Going back to the CLTV example: When building or using an ML model, you don’t need to understand who the customer is, as an individual or entity, or what made this customer spend less or call customer service more in the last month. Your model simply takes several inputs (demographics, purchase/return history, product info, records of web/app sessions, etc.) and predicts, for example, whether this customer’s CLTV will end up within a certain range.
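To illustrate how little semantic understanding such a model needs, here is a minimal sketch of a nearest-neighbor classifier that assigns a customer to a CLTV bucket. The feature choice (orders, returns, support calls), the bucket labels and the tiny data set are all made up for illustration; a real model would use far more data, scaled features and a proper ML library.

```python
import math
from collections import Counter

# Hypothetical history: (orders_last_year, returns_last_year, support_calls) -> bucket.
# The model never learns what these columns mean or why a customer behaved this way.
HISTORY = [
    ((12, 1, 0), "high"),
    ((10, 0, 1), "high"),
    ((11, 2, 1), "high"),
    ((2, 1, 4), "low"),
    ((1, 0, 3), "low"),
    ((3, 2, 5), "low"),
]


def predict_ltv_bucket(history, customer, k=3):
    """Return the most common bucket among the k most similar past customers."""
    # Rank past customers by Euclidean distance to the new one.
    nearest = sorted(history, key=lambda row: math.dist(row[0], customer))[:k]
    # Majority vote over the neighbors' known buckets.
    votes = Counter(bucket for _, bucket in nearest)
    return votes.most_common(1)[0][0]


new_customer = (9, 1, 1)
print(predict_ltv_bucket(HISTORY, new_customer))
```

The prediction comes purely from similarity to past patterns, which is why it only holds up while the business producing those patterns stays reasonably stable.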
This is fundamentally different from both “what happened” and “why did it happen”, and a radical departure from the set of questions we like to ask of our data in what is normally called business analysis.
Where Does This Bring Us?
While general analytical immaturity will certainly make AI and ML more challenging, it is by no means an excuse not to make a big push into developing an effective program for AI transformation. Today, organizations need to spend less time understanding their analytics maturity, and instead focus on building:
- More advanced data management capabilities to support any analytical efforts that are critically important to understanding and communicating how their business is performing
- A scalable program for injecting AI, ML, and advanced analytics into every aspect of their operations, to dramatically augment and scale human decision-making, and support more (and hopefully better) predictions built on more data points
In doing so, businesses can make real progress toward their analytics goals instead of wasting time building traditional maturity models.