# Anomaly Detection

## Purpose

Relationship deposit and line of credit (LOC) account balance anomalies are provided in Data Library and via Andi Skills as part of Portfolio Insights. Anomalies are discovered by observing and analyzing trends in the balances of every relationship in your bank’s data. This article covers how Portfolio Insights detects those anomalies.

## Overview

Anomaly Detection gives equal weight to the percent change and to the difference from the expected amount. Note also that:

• for a deposit, only decreases can qualify as an anomaly;
• for a line of credit, only increases can qualify as an anomaly.

As an example, a deposit may appear on the anomaly report if the decrease reaches a predefined threshold as it approaches depletion.

## Methodology

Start with a time series of the total balance for a client relationship across all of its deposit or LOC accounts (Fig 1).

Fig 1. Example deposit balance for a relationship
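
As a rough illustration of this starting point, the sketch below sums per-account balances into one total per relationship per date. The record layout (`relationship_id`, `date`, `balance`) and the function name are assumptions made for the example, not the Data Library schema.

```python
from collections import defaultdict

def relationship_totals(records):
    """Sum account balances into one total per relationship per date.

    `records` is an iterable of (relationship_id, date, balance) tuples,
    one per account per date. This layout is hypothetical.
    """
    totals = defaultdict(float)
    for relationship_id, date, balance in records:
        totals[(relationship_id, date)] += balance

    # Build one chronologically ordered series per relationship.
    # ISO-formatted date strings sort correctly as plain strings.
    series = defaultdict(list)
    for (relationship_id, date), total in sorted(totals.items()):
        series[relationship_id].append((date, total))
    return dict(series)
```

Two accounts reported on the same date for the same relationship collapse into a single point of the series.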

For this series, create a measure of each of three features:

• The long term trend
• The spread of the values
• The anomalies in the values

### Trend

Trend is captured by fitting an exponential moving average to the series. The formula for calculating the exponential moving average can be tuned to respond more quickly or more slowly to new data points.

Fig 2. Total balance time series with moving average trend
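
A minimal sketch of an exponential moving average follows. The smoothing factor `alpha` and its default value are assumptions for illustration; the article does not state the tuning that Portfolio Insights uses. A larger `alpha` responds more quickly to new data points, and a smaller one responds more slowly.

```python
def exponential_moving_average(balances, alpha=0.1):
    """Return the exponential moving average of `balances`.

    `alpha` (0 < alpha <= 1) is a hypothetical smoothing factor.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    trend = []
    for balance in balances:
        if not trend:
            # Seed the average with the first observation.
            trend.append(float(balance))
        else:
            # Blend the new balance with the previous trend value.
            trend.append(alpha * balance + (1 - alpha) * trend[-1])
    return trend
```

With `alpha=1` the trend simply reproduces the balances, which is the "fastest" possible setting.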

### Spread

Spread is captured in two steps:

1. Calculating the difference between the trend and the total balance for every point of the time series (called the residuals)
2. Taking a rolling standard deviation of the residuals

Fig 3. Total balance time series with positive residuals in black and negative residuals in red
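
The two steps can be sketched as below. Several details are assumptions: the residual is taken as balance minus trend, the window is counted in observations rather than calendar days, and the sample (not population) standard deviation is used.

```python
import statistics

def residuals(balances, trend):
    """Residual at each point, taken here as balance minus trend (assumed sign)."""
    return [b - t for b, t in zip(balances, trend)]

def rolling_std(values, window):
    """Sample standard deviation over the trailing `window` observations.

    Returns None where fewer than two observations are available.
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(statistics.stdev(chunk) if len(chunk) >= 2 else None)
    return out
```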

Deposit anomaly detection uses a rolling 365-day window, and LOC anomaly detection uses a rolling 400-day window. A longer window is used for LOC accounts because their large changes tend to be less frequent than those of deposit accounts.

Assuming a normal distribution of residuals, a prediction interval is calculated that encompasses a given percentage of data points. Currently, an 80% prediction interval is used. Its half-width is calculated by multiplying the rolling standard deviation by 1.28, the approximate z-score that leaves 10% of a normal distribution in each tail.

Fig 4. 80% prediction interval for trend line (note how the interval narrows 365 days after large changes)
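
To show where the 1.28 multiplier comes from, the sketch below derives it from the standard normal distribution and applies it around a trend value. The function names are illustrative, and centering the interval on the trend is an assumption based on the figure caption.

```python
from statistics import NormalDist

def z_multiplier(coverage=0.80):
    """z-score for a two-sided interval with the given coverage."""
    # A two-sided interval leaves (1 - coverage) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + coverage / 2)

def prediction_interval(trend_value, rolling_sd, coverage=0.80):
    """Return (lower, upper) bounds centered on the trend value (assumed)."""
    half_width = z_multiplier(coverage) * rolling_sd
    return trend_value - half_width, trend_value + half_width
```

For example, with a trend value of 1000 and a rolling standard deviation of 100, the bounds fall roughly 128 on either side of the trend.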

### Anomalies

Our measures of trend and spread begin to give us an idea of which points may be worth reporting as anomalies. If we use only these two measures, small changes that follow a large change will be flagged as anomalies (see Feb 19 in Fig 4). To adjust for this, a 20% threshold above and below both the last balance and the trend line is added to the interval. Any new value must be at least 20% greater or less than the trend and/or the last balance to be flagged as an anomaly.

After this adjustment, the interval expands following large changes.

Fig 5. Interval expanded to include 20% threshold for anomalies
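
One possible reading of this adjustment is sketched below: the interval is widened so that it always covers the band 20% either side of both the trend value and the last balance. How the production system combines the two thresholds is not specified here, so treat the combination rule, the function name, and the assumption of positive balances as illustrative only.

```python
def expand_interval(lower, upper, trend_value, last_balance, threshold=0.20):
    """Widen (lower, upper) to include +/- `threshold` around trend and last balance.

    Assumes positive trend and balance values.
    """
    new_lower = min(
        lower,
        trend_value * (1 - threshold),
        last_balance * (1 - threshold),
    )
    new_upper = max(
        upper,
        trend_value * (1 + threshold),
        last_balance * (1 + threshold),
    )
    return new_lower, new_upper
```

Under this reading, an interval that is already wider than the 20% bands is left unchanged.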

Finally, the interval is shifted forward one step so that only prior data points are used to analyze a new data point. Anything outside the shifted interval is then flagged as an anomaly.

Fig 6. Interval shifted one step forward, anomalies flagged.
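
The shift and the flagging step might look like the sketch below. It also applies the direction rule from the Overview: deposits are flagged only on decreases, and lines of credit only on increases. The function and parameter names are hypothetical, and the bounds are assumed to have been computed already for each point.

```python
def flag_anomalies(balances, lowers, uppers, account_type="deposit"):
    """Flag each balance against the interval computed at the previous point.

    `lowers[i]` and `uppers[i]` are bounds built from data up to point i.
    `account_type` is "deposit" or "loc" (illustrative labels).
    """
    flags = [False] if balances else []  # the first point has no prior interval
    for i in range(1, len(balances)):
        # Shift one step: compare point i with the interval from point i - 1.
        lower, upper = lowers[i - 1], uppers[i - 1]
        if lower is None or upper is None:
            flags.append(False)
        elif account_type == "deposit":
            flags.append(balances[i] < lower)  # only decreases qualify
        else:
            flags.append(balances[i] > upper)  # only increases qualify
    return flags
```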

### Deposit Depletion

If the trend in balance is decreasing at the time of the most recent RA run, the slope of the tangent to the exponential moving average (its derivative) is negative. This slope is approximated by taking the change in the moving-average balance between the current and previous RA runs. The trend line is then projected forward to calculate the number of days until the sum of account balances is depleted to \$0.

Fig 7. Trend in balance calculated to estimate days to deposit depletion.

The values found in the Data Library data set indicate the following:

• Decreasing balances are depleting (positive number of days to depletion)
• Non-positive balances have already been depleted (0 days to depletion)
• Balances which are increasing or have not changed are not depleting (null days to depletion)
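
A minimal sketch of the depletion estimate, consistent with the three cases above, is shown below. It assumes the projection starts from the current moving-average value and that this value is positive whenever the balance is positive; the parameter names and the `days_between_runs` input are illustrative.

```python
def days_to_depletion(prev_trend, curr_trend, days_between_runs, current_balance):
    """Estimate days until the trend line reaches $0.

    Returns 0 if already depleted, None if not depleting,
    otherwise a positive number of days.
    """
    if days_between_runs <= 0:
        raise ValueError("days_between_runs must be positive")
    if current_balance <= 0:
        return 0  # non-positive balances have already been depleted
    # Finite-difference approximation of the trend's slope, per day.
    slope = (curr_trend - prev_trend) / days_between_runs
    if slope >= 0:
        return None  # increasing or unchanged: not depleting
    # Linear projection of the trend line down to zero.
    return curr_trend / -slope
```

For instance, a trend that fell from 1100 to 1000 over one day loses 100 per day, so the projection reaches zero in 10 days.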