Overview
The Similar Loans Report Andi skill gives relationship managers real-time pricing insights by showing the characteristics of the 10 loans most similar to the loan being priced. This article describes the methodology used to identify those similar loans.
Methodology
Step 1: Filter to relevant loans
Relevant loans are found by filtering to pipeline and portfolio loans of the same product type and in the same region as the loan currently being priced. The Pricing Region is used as the region of comparison.
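
As a rough illustration of this filter, the sketch below uses a hypothetical `Loan` record. The field names and the sample product and region values are assumptions made for the example, not Andi's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    # Hypothetical record; field names are assumptions for this sketch.
    loan_id: str
    source: str        # "Pipeline", "Portfolio", or "CurrentOpportunity"
    product_name: str
    region_name: str
    amount: float


def filter_relevant(current: Loan, candidates: list[Loan]) -> list[Loan]:
    """Keep pipeline and portfolio loans with the current loan's product and region."""
    return [
        loan for loan in candidates
        if loan.source in ("Pipeline", "Portfolio")
        and loan.product_name == current.product_name
        and loan.region_name == current.region_name
    ]


# Illustrative data only.
current = Loan("C1", "CurrentOpportunity", "CRE Term", "Northeast", 500_000)
candidates = [
    Loan("A", "Pipeline", "CRE Term", "Northeast", 450_000),
    Loan("B", "Portfolio", "CRE Term", "Southeast", 600_000),        # different region
    Loan("C", "Portfolio", "Line of Credit", "Northeast", 250_000),  # different product
    Loan("D", "Portfolio", "CRE Term", "Northeast", 520_000),
]
relevant = filter_relevant(current, candidates)
print([loan.loan_id for loan in relevant])
```

Only loans A and D share both the product and the region of the current loan, so they are the only ones carried into the next step.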
Step 2: Select relevant loan features
The features in Table 1 are used to identify similar loans where they are relevant (for example, Utilization is used only for an LOC). Features are cross-referenced across the pipeline and portfolio loans and are factored in only if they are present in at least 66 percent of the loans. Features that do not have enough data to reference are ignored.
Table 1. All potential loan features for determining similar loans and their weighting.
| Loan Feature | Valid For | Required Feature | Weighting |
|---|---|---|---|
| Amount | All | Yes | 2 |
| Amortization Term | Fixed Rate | No | 1 |
| Maturity | All | Yes | 1 |
| Age (days since pricing or origination) | All | Yes | 1 |
| Loss given default | All | No | 1 |
| Risk rating annual loss | All | No | 1 |
| Utilization | LOC | Yes (if LOC) | 1 |
| Relationship credit, deposit, and other size | Loans on current relationships | No | 0.5 |
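
The 66 percent rule can be sketched as a simple coverage check. This is one possible reading, assuming a feature must be populated often enough in both the pipeline and the portfolio set; the dictionary keys and the `usable_features` helper are hypothetical.

```python
MIN_COVERAGE = 0.66  # threshold stated in the article


def coverage(loans, feature):
    """Fraction of loans that have a non-null value for the feature."""
    if not loans:
        return 0.0
    populated = sum(1 for loan in loans if loan.get(feature) is not None)
    return populated / len(loans)


def usable_features(pipeline, portfolio, features):
    """Assumed reading: keep a feature only if both sets meet the threshold."""
    return [
        feature for feature in features
        if coverage(pipeline, feature) >= MIN_COVERAGE
        and coverage(portfolio, feature) >= MIN_COVERAGE
    ]


# Illustrative data only.
pipeline = [
    {"amount": 450_000, "maturity": 60, "loss_given_default": 0.30},
    {"amount": 600_000, "maturity": 84, "loss_given_default": None},
    {"amount": 520_000, "maturity": 120, "loss_given_default": None},
]
portfolio = [
    {"amount": 300_000, "maturity": 60, "loss_given_default": 0.25},
    {"amount": 700_000, "maturity": 36, "loss_given_default": 0.40},
    {"amount": 410_000, "maturity": None, "loss_given_default": 0.35},
]
selected = usable_features(
    pipeline, portfolio, ["amount", "maturity", "loss_given_default"]
)
print(selected)
```

In this sample, loss given default is populated in only one of three pipeline loans, so it is dropped, while amount and maturity are kept.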
Step 3: Feature Standardization and Weighting
Because loan features differ widely in scale and distribution, they are normalized to a standard scale for comparison. The weightings in Table 1 are then applied to give specific features more importance (Amount, in the example below).
For details, see the Calculations section.
Table 2. Example of non-standardized and standardized loan features for a set of loans where:
- Average Utilization = 0.5
- Standard Deviation Utilization = 0.08
- Average Amount = $500,000
- Standard Deviation Amount = $80,000
Note: Standardization puts the loan feature values on roughly the same scale. Weighting doubles the scale of Amount, making it a more important feature.
| Utilization | Amount | Standardized Utilization | Standardized Amount | Weighted Amount |
|---|---|---|---|---|
| 0.65 | $450,000 | 1.875 | -0.625 | -1.25 |
| 0.45 | $600,000 | -0.625 | 1.25 | 2.5 |
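
The loan with Utilization 0.45 and Amount $600,000 can be checked by hand from the averages and standard deviations listed above. The notation $z = (x - \mu)/\sigma$ is chosen here for illustration, and the weight of 2 for Amount comes from Table 1.

```latex
\begin{aligned}
z_{\text{Utilization}} &= \frac{0.45 - 0.5}{0.08} = -0.625 \\
z_{\text{Amount}} &= \frac{600{,}000 - 500{,}000}{80{,}000} = 1.25 \\
\text{Weighted Amount} &= 2 \times z_{\text{Amount}} = 2.5
\end{aligned}
```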
Loans are compared with loans of the same rate type where available. Loans of other rate types are used only if there is no usable data for the same rate type.
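
One way to picture this fallback is as a preference for same-rate-type loans with a wider pool as a backstop. The sketch below is an assumption about the behavior, not the actual implementation: it treats "no usable data" as an empty list of matches, and the `comparison_pool` helper and dictionary keys are hypothetical.

```python
def comparison_pool(current_rate_type, loans):
    """Prefer loans sharing the current loan's rate type; fall back to all loans."""
    same_type = [loan for loan in loans if loan["rate_type"] == current_rate_type]
    # Assumption: "no usable data" is modeled here as an empty match list.
    return same_type if same_type else list(loans)


# Illustrative data only.
loans = [
    {"id": "A", "rate_type": "Fixed"},
    {"id": "B", "rate_type": "Float"},
    {"id": "C", "rate_type": "Fixed"},
]
fixed_pool = comparison_pool("Fixed", loans)  # same rate type is available
swap_pool = comparison_pool("Swap", loans)    # no Swap loans, so fall back
print([loan["id"] for loan in fixed_pool])
print([loan["id"] for loan in swap_pool])
```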
Step 4: Calculate most similar loans from the pipeline and portfolio
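Once all features are standardized and weighted, we calculate the distance between the pipeline-standardized current loan and every pipeline loan, and likewise between the portfolio-standardized current loan and every portfolio loan. The loans with the smallest distances are the most similar. See the Calculations section for details on the formula.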
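
A minimal sketch of this ranking step is shown below, assuming each loan has already been reduced to a tuple of weighted, standardized feature values. The function names and sample values are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(current, loans, k=10):
    """Return the ids of the k loans closest to the current loan."""
    ranked = sorted(loans, key=lambda item: euclidean(current, item[1]))
    return [loan_id for loan_id, _ in ranked[:k]]


# Weighted, standardized (amount, utilization) pairs; illustrative values only.
current = (0.0, 0.0)
pipeline = [
    ("A", (2.5, -0.625)),
    ("B", (0.5, 0.25)),
    ("C", (-3.0, 1.0)),
    ("D", (-0.25, -0.75)),
]
print(most_similar(current, pipeline, k=2))
```

The same ranking would be run once against the pipeline loans and once against the portfolio loans, each time using the copy of the current loan standardized to that set.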
Example
In the example below, we have 10 pipeline loans and use only Amount and Utilization to determine the most similar loans. The 5 closest loans are selected as the most similar loans according to the selected loan features (Figure 1).
Figure 1. Example scenario using only Amount and Utilization to determine similar loans. Note that the x-axis for Amount is twice as long as the y-axis for Utilization. This is because we have weighted Amount to make it twice as important.
We apply the same distance calculation across all of the selected features (Figure 2).
Figure 2. Similar loan example using 5 loan features. Each feature spreads the pipeline loans out along a different axis. The size of the axis indicates the importance of the feature in determining similar loans (note that Amount and Rate Type are the most important features). As in the previous example, the most similar loans are the 5 pipeline loans closest to the current loan.
Table 3. Similar loans fields returned by Andi.
| Column | Description |
|---|---|
| ROE | ROE for the loan. |
| TargetROE | Target ROE for the loan. |
| NetIncome | Net income for the loan. |
| Source | Source of the loan: one of Pipeline, Portfolio, or CurrentOpportunity. |
| DaysOld | Days since origination date for portfolio loans; days since pricing date for pipeline loans. |
| ProductName | Name of the loan product. |
| RegionName | Name of the pricing region for a pipeline loan; name of the relationship owner's region for a portfolio loan. |
| Amount | Loan amount. |
| RateType | One of Fixed, Float, Adjustable, or Swap. |
| Maturity | Maturity of the loan in months. |
| AmortizationTerm | Amortization term of the loan in months. |
| RiskRating | Risk rating on the loan. |
| RelationshipCreditSize | Total credit size of the relationship tied to the loan. Null if there is no attached relationship. |
| RelationshipDepositSize | Total deposit size of the relationship tied to the loan. Null if there is no attached relationship. |
| RelationshipOtherSize | Total other size of the relationship tied to the loan. Null if there is no attached relationship. |
| RiskRatingAnnualLoss | The annual loss associated with the risk rating on the loan. |
| LossGivenDefault | The share of the loan that will be lost if the borrower defaults. |
| Utilization | The utilization on an LOC. Null if the loan is not an LOC. |
| Rate | The total rate on the loan. |
| AdjustableRateIndexRate | The interest rate of the index on an adjustable or floating rate loan. |
| AdjustableRateSpread | The spread from the index rate to the current rate charged on a loan. |
| FeesInitialDollars | The full, unamortized amount of origination fees. |
| FeesAnnualDollars | Annual value of fees on a loan account. |
Calculations
Normalizations:
Some loan features have highly skewed distributions. To adjust for this, we transform skewed distributions by taking the natural log.
Features also have very different ranges. For example, Amount may range from a few thousand dollars to millions of dollars, while Utilization can only range from 0 to 1. We put all features on the same scale by standardizing:
- First, we take the mean and standard deviation of each feature in the bank's pipeline loans.
- We then subtract the mean of the feature from each loan's value (including the current loan's) and divide by the standard deviation.
We perform the same standardization using the current loan and the portfolio loans. This results in two copies of the current loan: one standardized to your bank's pipeline loans and one standardized to your bank's portfolio loans. These standardized values are now all on the same scale (see Table 2). We then weight the standardized values using the weights from Table 1.
Distance of loans:
Formula 1. Euclidean distance formula
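
In standard notation, the Euclidean distance between the current loan $c$ and a comparison loan $p$ over $n$ selected features can be written as follows, where $c_i$ and $p_i$ are the weighted, standardized values of feature $i$. The symbols are chosen here for illustration.

```latex
d(c, p) = \sqrt{\sum_{i=1}^{n} \left(c_i - p_i\right)^2}
```

With only Amount and Utilization selected, as in Figure 1, this reduces to the straight-line distance between two points on a plane.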