Types of Anomalies

When it comes to outlier analysis, the first step is knowing what type of anomaly you are up against. Being able to accurately categorize outliers sharpens the focus of automated anomaly detection and yields much better results. Anomalies fall into three categories.

Global Outliers

A data point is considered a global outlier if its value falls far outside everything else in the dataset. The same holds for a small group of points.

For example, the sudden spike in Zoom usage at the start of the pandemic is a global outlier when those numbers are compared to the pre-COVID user base. For a business, this is a dream global outlier.
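
To make the idea of "far outside everything else" concrete, here is a minimal sketch that flags global outliers with a modified z-score built on the median and the median absolute deviation. The function name `global_outliers`, the cutoff of 3.5, and the sample numbers are all assumptions made for illustration; they are not real Zoom figures or a method prescribed by this lesson.

```python
from statistics import median

def global_outliers(values, threshold=3.5):
    """Return the indices of values that sit far from the bulk of the data.

    Uses the modified z-score: 0.6745 * |x - median| / MAD.
    The threshold of 3.5 is an assumed, commonly used cutoff.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values are identical, so this score is undefined.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]

# Made-up daily user counts (in thousands) with one dramatic spike at the end.
daily_users = [10, 11, 9, 10, 12, 10, 11, 200]
print(global_outliers(daily_users))  # [7]
```

The median is used instead of the mean because a single extreme value would otherwise inflate the baseline it is being compared against.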

Contextual Outliers

A data point is considered a contextual outlier if its value deviates significantly from the other data points in the same context. However, that same data point may not be considered an outlier if it occurs in a different context.

Let's look at an example. There is a sudden surge in order volume for a TV at an eCommerce company in the middle of the night. It's a contextual outlier because you wouldn't expect this high volume outside the daytime. Upon further inspection, the business finds a pricing glitch: someone has entered the price of the TV as €6.99 rather than the actual price of €400. This example is a true story from Darty, a well-known French electrical retailer.
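
The night-time surge can be sketched in code by comparing a value only against history from the same context, here the hour of day. This is a rough illustration under stated assumptions: the function name `is_contextual_outlier`, the threshold of 3 standard deviations, and the order counts are all hypothetical and are not Darty's data.

```python
from statistics import mean, stdev

def is_contextual_outlier(history_by_hour, hour, value, threshold=3.0):
    """Check a value against past values observed in the same hour of day.

    history_by_hour maps an hour (0-23) to a list of past order counts.
    The threshold of 3 standard deviations is an assumed cutoff.
    """
    past = history_by_hour[hour]
    mu, sigma = mean(past), stdev(past)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Made-up hourly order counts for one product over the past week.
history_by_hour = {
    3: [2, 3, 1, 2, 4, 2, 3],          # 03:00 is normally quiet
    14: [40, 55, 48, 60, 52, 45, 50],  # 14:00 is normally busy
}

print(is_contextual_outlier(history_by_hour, 3, 50))   # True
print(is_contextual_outlier(history_by_hour, 14, 50))  # False
```

The same value of 50 orders is flagged at 03:00 and accepted at 14:00, which is exactly what separates a contextual outlier from a global one.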

Collective Outliers

A group of data points is considered a collective outlier when, taken together, the points differ significantly from the rest of the dataset. However, each data point on its own wouldn't be considered anomalous in either a contextual or a global sense. Individually, each time series stays within its normal range, but when the time series are combined they indicate a bigger issue.

Let's take an example. Imagine you're running an ad campaign. As your budget increases, you expect to see an increase in both impressions and ad clicks. However, what you actually see is an increase in the number of impressions but a decrease in the number of ad clicks. In this case, neither the increase in impressions nor the drop in ad clicks is abnormal on its own, but when they happen together it suggests that you have an issue with your campaign. Perhaps you are serving an empty ad, or you're serving the ad to the wrong audience.
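
One way to sketch the ad campaign case is to check each metric on its own and then check how the metrics relate, here through the click-through rate (clicks divided by impressions). The helper names, the threshold of 3 standard deviations, and the campaign numbers are all assumptions made for illustration. A real system would likely use a richer model of how the series move together.

```python
from statistics import mean, stdev

def zscore(history, value):
    """Standard z-score of value relative to a list of past values."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return 0.0 if value == mu else float("inf")
    return (value - mu) / sigma

def is_collective_outlier(impressions, clicks, new_impressions, new_clicks,
                          threshold=3.0):
    """True when each metric looks normal alone but their ratio does not.

    Assumes every impressions value is non-zero.
    """
    ctr_history = [c / i for c, i in zip(clicks, impressions)]
    individually_normal = (
        abs(zscore(impressions, new_impressions)) <= threshold
        and abs(zscore(clicks, new_clicks)) <= threshold
    )
    jointly_abnormal = (
        abs(zscore(ctr_history, new_clicks / new_impressions)) > threshold
    )
    return individually_normal and jointly_abnormal

# Made-up daily campaign history: clicks track impressions at roughly 5%.
impressions = [1000, 1100, 900, 1050, 950, 1200, 800]
clicks = [50, 55, 45, 52, 48, 60, 40]

# Impressions rise and clicks fall, each within its own normal range.
print(is_collective_outlier(impressions, clicks, 1250, 42))  # True
# Both rise together, so the relationship is intact.
print(is_collective_outlier(impressions, clicks, 1100, 55))  # False
```

Neither 1250 impressions nor 42 clicks would raise an alert alone. The flag only appears once the two series are looked at together.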