In data analytics, we are constantly monitoring metrics. Metrics change all the time, and when they do, it's our job to figure out what's going on: why did the conversion rate suddenly drop, or what's driving the steady revenue growth?
I began my journey in data analytics as a KPI analyst. For nearly three years, I did root cause analysis and KPI deep dives almost full-time. Even after moving to product analytics, I'm still regularly investigating KPI shifts. You might say I've become quite the seasoned analytics detective.
The cornerstone of root cause analysis is usually slicing and dicing the data. Most often, figuring out which segments are driving the change gives you a clue to the root causes. So, in this article, I want to share a framework for estimating how different segments contribute to changes in your key metric. We will put together a set of functions to slice and dice our data and identify the main drivers behind the metric's changes.
However, in real life, before jumping into data crunching, it's important to understand the context:
- Is the data complete, and can we compare recent periods to previous ones?
- Are there any long-term trends or known seasonal effects we've seen in the past?
- Have we launched anything recently, or are we aware of any external events affecting our metrics, such as a competitor's marketing campaign or currency fluctuations?
I've discussed these nuances in more detail in my previous article, "Root Cause Analysis 101".
KPI change framework
We encounter different metrics, and analysing their changes requires different approaches. Let's start by defining the two types of metrics we will be working with:
- Simple metrics represent a single measure, for example, total revenue or the number of active users. Despite their simplicity, they are often used in product analytics. One common example is the North Star metric. A good North Star metric estimates the total value received by customers. For example, Airbnb might use nights booked, and WhatsApp might track messages sent. Both are simple metrics.
You can learn more about North Star metrics from the Amplitude Playbook.
- However, we can't avoid using compound or ratio metrics, like conversion rate or average revenue per user (ARPU). Such metrics help us track our product performance more precisely and isolate the impact of specific changes. For example, imagine your team is working on improving the registration page. They could track the number of registered customers as their primary KPI, but it might be highly affected by external factors (e.g., a marketing campaign driving more traffic). A better metric for this case would be the conversion rate from landing on the registration page to completing it.
We will use a fictional example to learn how to approach root cause analysis for different types of metrics. Imagine we're working on an e-commerce product, and our team is focused on two main KPIs:
- total revenue (a simple metric),
- conversion to purchase: the ratio of users who made a purchase to the total number of users (a ratio metric).
We will use synthetic datasets to look at possible scenarios of metric changes. Now it's time to move on and see what's happening with the revenue.
Analysis: simple metrics
Let's start simple and dig into the revenue changes. As usual, the first step is to load the dataset. Our data has two dimensions: country and maturity (whether a customer is new or existing). Additionally, we have three different scenarios to test our framework under various circumstances.
import pandas as pd

df = pd.read_csv('absolute_metrics_example.csv', sep='\t')
df.head()

The main goal of our analysis is to determine how each segment contributes to the change in our top-line metric. Let's break it down. We will write a handful of formulas, but don't worry: it won't require any knowledge beyond basic arithmetic.
First of all, it's helpful to see how the metric changed in each segment, both in absolute and relative terms.
\[
\textbf{difference}^{\textsf{i}} = \textbf{metric}_{\textsf{after}}^{\textsf{i}} - \textbf{metric}_{\textsf{before}}^{\textsf{i}}
\]
\[
\textbf{difference\_rate}^{\textsf{i}} = \frac{\textbf{difference}^{\textsf{i}}}{\textbf{metric}_{\textsf{before}}^{\textsf{i}}}
\]
The next step is to look at it holistically and see how each segment contributed to the overall change in the metric. We will calculate the impact as the share of the total difference.
\[
\textbf{impact}^{\textsf{i}} = \frac{\textbf{difference}^{\textsf{i}}}{\sum_{\textsf{i}}{\textbf{difference}^{\textsf{i}}}}
\]
That already gives us some valuable insights. However, to understand whether any segment is behaving unusually and requires special attention, it's useful to compare the segment's contribution to the metric change with its initial share of the metric.
Here's the reasoning. If a segment makes up 90% of our metric, then it's expected to contribute 85–95% of the change. But if a segment that accounts for only 10% ends up contributing 90% of the change, that's definitely an anomaly.
To calculate it, we will simply normalise each segment's contribution to the metric change by the segment's initial share.
\[
\textbf{segment\_share}_{\textsf{before}}^{\textsf{i}} = \frac{\textbf{metric}_{\textsf{before}}^{\textsf{i}}}{\sum_{\textsf{i}}{\textbf{metric}_{\textsf{before}}^{\textsf{i}}}}
\]
\[
\textbf{impact\_normalised}^{\textsf{i}} = \frac{\textbf{impact}^{\textsf{i}}}{\textbf{segment\_share}_{\textsf{before}}^{\textsf{i}}}
\]
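Before jumping into code, here's a quick toy calculation showing how these formulas work together (the segment names and numbers are made up purely for illustration):

import pandas as pd

# toy data: revenue before and after for two hypothetical segments
toy = pd.DataFrame(
    {'before': [900, 100], 'after': [850, 60]},
    index=['big_segment', 'small_segment']
)

toy['difference'] = toy.after - toy.before                    # -50 and -40
toy['impact'] = toy.difference / toy.difference.sum()         # 0.56 and 0.44
toy['segment_share_before'] = toy.before / toy.before.sum()   # 0.90 and 0.10
toy['impact_norm'] = toy.impact / toy.segment_share_before    # ~0.62 and ~4.44
print(toy)

Even though the big segment lost more revenue in absolute terms, the normalised impact of the small segment (around 4.4) immediately flags it as the anomaly.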
That's it for the formulas. Now, let's write the code and see this approach in practice. It will be easier to understand how it works through practical examples.
def calculate_simple_growth_metrics(stats_df):
    # Calculating overall stats
    before = stats_df.before.sum()
    after = stats_df.after.sum()
    print('Metric change: %.2f -> %.2f (%.2f%%)' % (before, after,
        100 * (after - before) / before))

    # Estimating the impact of each segment
    stats_df['difference'] = stats_df.after - stats_df.before
    stats_df['difference_rate'] = (100 * stats_df.difference / stats_df.before)\
        .map(lambda x: round(x, 2))
    stats_df['impact'] = (100 * stats_df.difference / stats_df.difference.sum())\
        .map(lambda x: round(x, 2))
    stats_df['segment_share_before'] = (100 * stats_df.before / stats_df.before.sum())\
        .map(lambda x: round(x, 2))
    stats_df['impact_norm'] = (stats_df.impact / stats_df.segment_share_before)\
        .map(lambda x: round(x, 2))

    # Creating visualisations
    create_parallel_coordinates_chart(stats_df.reset_index(), stats_df.index.name)
    create_share_vs_impact_chart(stats_df.reset_index(), stats_df.index.name,
        'segment_share_before', 'impact')
    return stats_df.sort_values('impact_norm', ascending=False)
I believe that visualisations are a crucial part of any data storytelling, as they help the audience grasp insights more quickly and intuitively. That's why I've included a couple of charts in our function:
- A parallel coordinates chart to show how the metric changed in each slice. This visualisation will help us see the most significant drivers in absolute terms.
- A scatter plot to compare each segment's impact on the KPI with the segment's initial size. This chart helps spot anomalies: segments whose impact on the KPI is disproportionately large or small.
You can find the complete code for the visualisations on GitHub.
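The published helpers are more polished, but here is a minimal sketch of what the share-vs-impact scatter could look like, assuming Plotly Express (the styling and signature are illustrative, not the exact GitHub implementation):

import plotly.express as px

def create_share_vs_impact_chart(df, dimension, share_field, impact_field):
    # each point is a segment: x = initial share of the metric, y = share of the change
    fig = px.scatter(df, x=share_field, y=impact_field, text=dimension,
                     title='Segment share vs impact')
    # diagonal reference line: segments far from it are potential anomalies
    max_val = max(df[share_field].max(), df[impact_field].max())
    fig.add_shape(type='line', x0=0, y0=0, x1=max_val, y1=max_val,
                  line=dict(dash='dash', color='grey'))
    fig.update_traces(textposition='top center')
    fig.show()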
Now that we have all the tools in place to analyse the revenue data, let's see how our framework performs in different scenarios.
Scenario 1: Revenue dropped equally across all segments
Let's start with the first scenario. The analysis is very straightforward: we just need to call the function defined above.
calculate_simple_growth_metrics(
    df.groupby('country')[['revenue_before', 'revenue_after_scenario_1']].sum()
        .sort_values('revenue_before', ascending=False)
        .rename(columns={'revenue_after_scenario_1': 'after',
                         'revenue_before': 'before'})
)
In the output, we get a table with detailed stats.

However, in my opinion, the visualisations are more informative. It's obvious that revenue dropped by 30–40% in all countries, and there are no anomalies.

Scenario 2: One segment drove the change
Let's test another scenario by calling the same function.
calculate_simple_growth_metrics(
    df.groupby('country')[['revenue_before', 'revenue_after_scenario_2']].sum()
        .sort_values('revenue_before', ascending=False)
        .rename(columns={'revenue_after_scenario_2': 'after',
                         'revenue_before': 'before'})
)

We can see the biggest drop, in both absolute and relative terms, in France. It's definitely an anomaly, since it accounts for 99.9% of the total metric change. We can easily spot this in our visualisations.

Also, it's worth going back to the first example. We looked at the metric split by country and found no specific segments driving the change. But digging a little deeper might help us understand what's going on. Let's try adding another layer and look at country and maturity together.
df['segment'] = df.country + ' - ' + df.maturity

calculate_simple_growth_metrics(
    df.groupby(['segment'])[['revenue_before', 'revenue_after_scenario_1']].sum()
        .sort_values('revenue_before', ascending=False)
        .rename(columns={'revenue_after_scenario_1': 'after', 'revenue_before': 'before'})
)
Now, we can see that the change is mostly driven by new users across the countries. These charts clearly highlight issues with the new customer experience and give you a clear direction for further investigation.

Scenario 3: Volume shifting between segments
Finally, let's explore the last scenario for revenue.
calculate_simple_growth_metrics(
    df.groupby(['segment'])[['revenue_before', 'revenue_after_scenario_3']].sum()
        .sort_values('revenue_before', ascending=False)
        .rename(columns={'revenue_after_scenario_3': 'after', 'revenue_before': 'before'})
)

We can clearly see that France is the biggest anomaly: revenue in France has dropped, and this change is correlated with the top-line revenue drop. However, there's another segment that stands out: Spain, where revenue has increased significantly.
This pattern raises a suspicion that some of the revenue from France might have shifted to Spain. However, we still see a decline in the top-line metric, so it's worth further investigation. In practice, this situation can be caused by data issues, logging errors, or service unavailability in some regions (so customers have to use VPNs and appear under a different country in our logs).
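To sanity-check the shift hypothesis, we can compare the two segments' absolute changes directly. A minimal sketch, assuming stats_df is the table returned by calculate_simple_growth_metrics with countries in the index:

# stats_df is assumed to be the output of calculate_simple_growth_metrics
france_drop = stats_df.loc['France', 'difference']
spain_gain = stats_df.loc['Spain', 'difference']

# a net change close to zero suggests volume simply moved between the countries;
# a remaining negative gap points to an actual loss on top of the shift
print('France: %.2f, Spain: %.2f, net: %.2f'
    % (france_drop, spain_gain, france_drop + spain_gain))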

We've looked at a bunch of different examples, and our framework helped us find the main drivers of change. I hope it's now clear how to conduct root cause analysis with simple metrics, and we're ready to move on to ratio metrics.
Analysis: ratio metrics
Product metrics are often ratios, like average revenue per customer or conversion. Let's see how we can break down changes in this type of metric. In our case, we will look at conversion.
There are two types of effects to consider when analysing ratio metrics:
- Change within a segment: for example, if customer conversion in France drops, the overall conversion will also drop.
- Change in the mix: for example, if the share of new customers increases, and new users usually convert at a lower rate, this shift in the mix will also lead to a drop in the overall conversion rate (the sketch below illustrates this effect).
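Here is a quick illustration of the mix effect with made-up numbers: within-segment conversion rates stay exactly the same, yet the overall conversion drops purely because the mix shifts towards the lower-converting segment.

# made-up numbers purely for illustration
# before: 80% existing users (50% conversion), 20% new users (10% conversion)
overall_before = 0.8 * 0.50 + 0.2 * 0.10   # 0.42

# after: the mix shifts to 50/50, within-segment conversions are unchanged
overall_after = 0.5 * 0.50 + 0.5 * 0.10    # 0.30

print('%.0f%% -> %.0f%%' % (100 * overall_before, 100 * overall_after))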
To understand what's going on, we need to be able to distinguish these effects. Once again, we will write a handful of formulas to break down and quantify each type of impact.
Let's start by defining some useful variables.
\[
\textbf{c}_{\textsf{before}}^{\textsf{i}}, \textbf{c}_{\textsf{after}}^{\textsf{i}} \; - \; \textsf{converted users}
\]
\[
\textbf{C}_{\textsf{before}}^{\textsf{total}} = \sum_{\textsf{i}}{\textbf{c}_{\textsf{before}}^{\textsf{i}}}, \quad \textbf{C}_{\textsf{after}}^{\textsf{total}} = \sum_{\textsf{i}}{\textbf{c}_{\textsf{after}}^{\textsf{i}}}
\]
\[
\textbf{t}_{\textsf{before}}^{\textsf{i}}, \textbf{t}_{\textsf{after}}^{\textsf{i}} \; - \; \textsf{total users}
\]
\[
\textbf{T}_{\textsf{before}}^{\textsf{total}} = \sum_{\textsf{i}}{\textbf{t}_{\textsf{before}}^{\textsf{i}}}, \quad \textbf{T}_{\textsf{after}}^{\textsf{total}} = \sum_{\textsf{i}}{\textbf{t}_{\textsf{after}}^{\textsf{i}}}
\]
Next, let's talk about the impact of the change in mix. To isolate this effect, we will estimate how the overall conversion rate would change if the conversion rates within all segments remained constant, and the absolute numbers of both converted and total users in all other segments stayed fixed. The only variables we will change are the total and converted numbers of users in segment i, adjusted to reflect its new share of the overall population.
Let's start by calculating how the total number of users in our segment needs to change to match the target segment share.
\[
\frac{\textbf{t}_{\textsf{after}}^{\textsf{i}}}{\textbf{T}_{\textsf{after}}^{\textsf{total}}} = \frac{\textbf{t}_{\textsf{before}}^{\textsf{i}} + \delta\textbf{t}^{\textsf{i}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}} + \delta\textbf{t}^{\textsf{i}}}
\]
\[
\delta\textbf{t}^{\textsf{i}} = \frac{\textbf{T}_{\textsf{before}}^{\textsf{total}} \cdot \textbf{t}_{\textsf{after}}^{\textsf{i}} - \textbf{T}_{\textsf{after}}^{\textsf{total}} \cdot \textbf{t}_{\textsf{before}}^{\textsf{i}}}{\textbf{T}_{\textsf{after}}^{\textsf{total}} - \textbf{t}_{\textsf{after}}^{\textsf{i}}}
\]
Now, we can estimate the change-in-mix impact using the following formula.
\[
\textbf{change in mix impact} = \frac{\textbf{C}_{\textsf{before}}^{\textsf{total}} + \delta\textbf{t}^{\textsf{i}} \cdot \frac{\textbf{c}_{\textsf{before}}^{\textsf{i}}}{\textbf{t}_{\textsf{before}}^{\textsf{i}}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}} + \delta\textbf{t}^{\textsf{i}}} - \frac{\textbf{C}_{\textsf{before}}^{\textsf{total}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}}}
\]
The next step is to estimate the impact of the conversion rate change within segment i. To isolate this effect, we will keep the total number of customers and converted customers in all other segments fixed. We will only change the number of converted users in segment i to match its new conversion rate.
\[
\textbf{change within segment impact} = \frac{\textbf{C}_{\textsf{before}}^{\textsf{total}} + \textbf{t}_{\textsf{before}}^{\textsf{i}} \cdot \frac{\textbf{c}_{\textsf{after}}^{\textsf{i}}}{\textbf{t}_{\textsf{after}}^{\textsf{i}}} - \textbf{c}_{\textsf{before}}^{\textsf{i}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}}} - \frac{\textbf{C}_{\textsf{before}}^{\textsf{total}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}}} = \frac{\textbf{t}_{\textsf{before}}^{\textsf{i}} \cdot \textbf{c}_{\textsf{after}}^{\textsf{i}} - \textbf{t}_{\textsf{after}}^{\textsf{i}} \cdot \textbf{c}_{\textsf{before}}^{\textsf{i}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}} \cdot \textbf{t}_{\textsf{after}}^{\textsf{i}}}
\]
We can't simply sum the different types of effects because their relationship is not linear. That's why we also need to estimate the combined impact for the segment. This combines the two formulas above, assuming that we match both the new conversion rate within segment i and the new segment share.
\[
\textbf{total segment change} = \frac{\textbf{C}_{\textsf{before}}^{\textsf{total}} - \textbf{c}_{\textsf{before}}^{\textsf{i}} + (\textbf{t}_{\textsf{before}}^{\textsf{i}} + \delta\textbf{t}^{\textsf{i}}) \cdot \frac{\textbf{c}_{\textsf{after}}^{\textsf{i}}}{\textbf{t}_{\textsf{after}}^{\textsf{i}}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}} + \delta\textbf{t}^{\textsf{i}}} - \frac{\textbf{C}_{\textsf{before}}^{\textsf{total}}}{\textbf{T}_{\textsf{before}}^{\textsf{total}}}
\]
It's worth noting that these effect estimations are not 100% accurate (i.e., we can't sum them up directly). However, they are precise enough to make decisions and identify the main drivers of the change.
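Here is a quick numeric check of that non-additivity for a single segment, with made-up numbers plugged into the three formulas above:

# made-up numbers for one segment (i) and the overall totals
c1_i, t1_i = 10, 100   # segment before: 10% conversion
c2_i, t2_i = 30, 200   # segment after: 15% conversion
C1, T1 = 60, 200       # overall before: 30% conversion
C2, T2 = 70, 300       # overall after: ~23.3% conversion

dt = (T1 * t2_i - T2 * t1_i) / (T2 - t2_i)                    # 100.0
mix = (C1 + dt * c1_i / t1_i) / (T1 + dt) - C1 / T1           # -0.0667
within = (t1_i * c2_i - t2_i * c1_i) / (T1 * t2_i)            # +0.0250
total = (C1 - c1_i + (t1_i + dt) * c2_i / t2_i) / (T1 + dt) - C1 / T1  # -0.0333

# mix + within = -0.0417 vs a combined effect of -0.0333:
# close enough to rank the drivers, but not exactly additive
print(mix + within, total)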
The next step is to put everything into code. We will again leverage visualisations: the scatter and parallel coordinates charts we've already used for simple metrics, along with a couple of waterfall charts to break down the impact by segments.
def calculate_conversion_effects(df, dimension, numerator_field1, denominator_field1,
                                 numerator_field2, denominator_field2):
    cmp_df = df.groupby(dimension)[[numerator_field1, denominator_field1,
                                    numerator_field2, denominator_field2]].sum()
    cmp_df = cmp_df.rename(columns={
        numerator_field1: 'c1',
        numerator_field2: 'c2',
        denominator_field1: 't1',
        denominator_field2: 't2'
    })

    cmp_df['conversion_before'] = cmp_df['c1'] / cmp_df['t1']
    cmp_df['conversion_after'] = cmp_df['c2'] / cmp_df['t2']

    C1 = cmp_df['c1'].sum()
    T1 = cmp_df['t1'].sum()
    C2 = cmp_df['c2'].sum()
    T2 = cmp_df['t2'].sum()

    print('conversion before = %.2f' % (100 * C1 / T1))
    print('conversion after = %.2f' % (100 * C2 / T2))
    print('total conversion change = %.2f' % (100 * (C2 / T2 - C1 / T1)))

    # estimating the effects for each segment
    cmp_df['dt'] = (T1 * cmp_df.t2 - T2 * cmp_df.t1) / (T2 - cmp_df.t2)
    cmp_df['total_effect'] = (C1 - cmp_df.c1 + (cmp_df.t1 + cmp_df.dt) * cmp_df.conversion_after) / (T1 + cmp_df.dt) - C1 / T1
    cmp_df['mix_change_effect'] = (C1 + cmp_df.dt * cmp_df.conversion_before) / (T1 + cmp_df.dt) - C1 / T1
    cmp_df['conversion_change_effect'] = (cmp_df.t1 * cmp_df.c2 - cmp_df.t2 * cmp_df.c1) / (T1 * cmp_df.t2)

    for col in ['total_effect', 'mix_change_effect', 'conversion_change_effect',
                'conversion_before', 'conversion_after']:
        cmp_df[col] = 100 * cmp_df[col]

    cmp_df['conversion_diff'] = cmp_df.conversion_after - cmp_df.conversion_before
    cmp_df['before_segment_share'] = 100 * cmp_df.t1 / T1
    cmp_df['after_segment_share'] = 100 * cmp_df.t2 / T2

    for p in ['before_segment_share', 'after_segment_share', 'conversion_before',
              'conversion_after', 'conversion_diff', 'total_effect',
              'mix_change_effect', 'conversion_change_effect']:
        cmp_df[p] = cmp_df[p].map(lambda x: round(x, 2))

    cmp_df['total_effect_share'] = 100 * cmp_df.total_effect / (100 * (C2 / T2 - C1 / T1))
    cmp_df['impact_norm'] = cmp_df.total_effect_share / cmp_df.before_segment_share

    # creating visualisations
    create_share_vs_impact_chart(cmp_df.reset_index(), dimension,
        'before_segment_share', 'total_effect_share')

    cmp_df = cmp_df[['t1', 't2', 'before_segment_share', 'after_segment_share',
                     'conversion_before', 'conversion_after', 'conversion_diff',
                     'total_effect', 'mix_change_effect', 'conversion_change_effect',
                     'total_effect_share']]

    plot_conversion_waterfall(
        100 * C1 / T1, 100 * C2 / T2,
        cmp_df[['total_effect']].rename(columns={'total_effect': 'impact'})
    )

    # putting together the effects split by change of mix and conversion change
    tmp = []
    for rec in cmp_df.reset_index().to_dict('records'):
        tmp.append({
            'segment': rec[dimension] + ' - change of mix',
            'impact': rec['mix_change_effect']
        })
        tmp.append({
            'segment': rec[dimension] + ' - conversion change',
            'impact': rec['conversion_change_effect']
        })

    effects_det_df = pd.DataFrame(tmp)
    effects_det_df['effect_abs'] = effects_det_df.impact.map(lambda x: abs(x))
    effects_det_df = effects_det_df.sort_values('effect_abs', ascending=False)
    top_effects_det_df = effects_det_df.head(5).drop('effect_abs', axis=1)

    plot_conversion_waterfall(
        100 * C1 / T1, 100 * C2 / T2, top_effects_det_df.set_index('segment'),
        add_other=True
    )

    create_parallel_coordinates_chart(cmp_df.reset_index(), dimension,
        before_field='before_segment_share', after_field='after_segment_share',
        impact_norm_field='impact_norm', metric_name='share of segment',
        show_mean=False)
    create_parallel_coordinates_chart(cmp_df.reset_index(), dimension,
        before_field='conversion_before', after_field='conversion_after',
        impact_norm_field='impact_norm', metric_name='conversion',
        show_mean=False)

    return cmp_df.rename(columns={'t1': 'total_before', 't2': 'total_after'})
With that, we're done with the theory and ready to apply this framework in practice. We'll load another dataset that includes a couple of scenarios.
conv_df = pd.read_csv('conversion_metrics_example.csv', sep='\t')
conv_df.head()

Scenario 1: Uniform conversion uplift
We will again simply call the function above and analyse the results.
calculate_conversion_effects(
    conv_df, 'country', 'converted_users_before', 'users_before',
    'converted_users_after_scenario_1', 'users_after_scenario_1',
)
The first scenario is pretty straightforward: conversion has increased in all countries by 4–7 percentage points, resulting in a top-line conversion increase as well.

We can see that there are no anomalies in the segments: the impact is correlated with the segment share, and conversion has increased uniformly across all countries.


We can look at the waterfall charts to see the change split by countries and types of effects. Although the effect estimations are not additive, we can still use them to compare the impacts of different slices.

The suggested framework has been quite helpful: we were able to quickly figure out what's going on with the metrics.
Scenario 2: Simpson's paradox
Let's take a look at a slightly trickier case.
calculate_conversion_effects(
    conv_df, 'country', 'converted_users_before', 'users_before',
    'converted_users_after_scenario_2', 'users_after_scenario_2',
)

The story is more complicated here:
- The share of UK users has increased, while conversion in this segment has dropped significantly, from 74.9% to 34.8%.
- In all other countries, conversion has increased by 8–11 percentage points.

Unsurprisingly, the conversion change in the UK is the biggest driver of the top-line metric decline.

Here we can see an example of non-linearity: 10% of the effect is not explained by the current split. Let's dig one level deeper and add the maturity dimension. This reveals the true story:
- Conversion has actually increased uniformly by around 10 percentage points in all segments, yet the top-line metric has still dropped.
- The main reason is the increased share of new users in the UK, as these customers have a significantly lower conversion rate than average.

Here is the split of effects by segments.

This counterintuitive effect is known as Simpson's paradox. A classic example of Simpson's paradox comes from a 1973 study on graduate school admissions at Berkeley. At first, it looked like men had a higher chance of getting in than women. However, when researchers looked at the departments people were applying to, it turned out women were applying to more competitive departments with lower admission rates, while men tended to apply to less competitive ones. When they added department as a confounder, the data actually showed a small but significant bias in favour of women.
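You can reproduce the paradox in a few lines with made-up numbers: conversion improves within every segment, yet the overall rate drops because the mix shifts towards the weaker segment.

import pandas as pd

# made-up numbers purely for illustration
demo = pd.DataFrame({
    'converted_before': [50, 10], 'users_before': [100, 100],
    'converted_after': [55, 36], 'users_after': [100, 300],
}, index=['existing', 'new'])

# within-segment conversion improves: 50% -> 55% and 10% -> 12%
print(demo.converted_before / demo.users_before)
print(demo.converted_after / demo.users_after)

# yet the overall conversion drops: 30.0% -> 22.75%
print(demo.converted_before.sum() / demo.users_before.sum())
print(demo.converted_after.sum() / demo.users_after.sum())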
As always, visualisation can give you some intuition about how this paradox works.

That's it. We've learned how to break down changes in ratio metrics.
You can find the complete code and data on GitHub.
Summary
It's been a long journey, so let's quickly recap what we've covered in this article:
- We've identified two major types of metrics: simple metrics (like revenue or the number of users) and ratio metrics (like conversion rate or ARPU).
- For each metric type, we've learned how to break down the changes and identify the main drivers. We've put together a set of functions that can help you find the answers with just a couple of function calls.
With this practical framework, you're now fully equipped to conduct root cause analysis for any metric. However, there's still room for improvement in our solution. In my next article, I'll explore how to build an LLM agent that can do the whole analysis and summary for us. Stay tuned!
Thank you so much for reading this article. I hope it was insightful for you.