Detecting anomalies in e-commerce: how the Isolation Forest algorithm can protect your business

Iago Modesto Brandão
Apr 22, 2023

Introduction

In e-commerce, it is crucial to identify anomalous customer behavior, because it can indicate a malicious attack. Such behavior can include unusually large or frequent purchases, repeated login attempts with incorrect passwords, or attempted financial fraud. Left unaddressed, it can cause significant harm to the company, including financial loss and damage to brand reputation.

One way to identify and handle this behavior is the Isolation Forest algorithm, an ensemble of isolation trees. Each tree repeatedly splits the dataset into smaller subsets using randomly chosen features and split values. Outliers, that is, data points that deviate from the normal pattern of behavior, tend to be separated from the rest after only a few splits, so a short average path through the trees marks a point as anomalous. This makes the algorithm particularly useful for flagging potential fraud on credit cards or user accounts.
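To make the isolation idea concrete, here is a minimal pure-Python sketch for a single numeric feature. It is a simplification for illustration, not scikit-learn's implementation: the function names (isolation_depth, average_depth) and the sample values are assumptions made for this example.

```python
import random


def isolation_depth(values, target, rng, depth=0):
    """Count the random splits needed to isolate `target` from `values`."""
    lo, hi = min(values), max(values)
    # Stop when the target is alone or only identical values remain.
    if len(values) <= 1 or lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the values that fall on the same side of the split as the target.
    same_side = [v for v in values if (v < split) == (target < split)]
    return isolation_depth(same_side, target, rng, depth + 1)


def average_depth(values, target, n_trees=200, seed=0):
    """Average the isolation depth over many random trees."""
    rng = random.Random(seed)
    total = sum(isolation_depth(values, target, rng) for _ in range(n_trees))
    return total / n_trees


# Illustrative values: 700.1 sits far from the rest, 200 sits in the crowd.
values = [100, 140, 200, 200, 250, 320, 450, 700.1]
outlier_depth = average_depth(values, 700.1)
typical_depth = average_depth(values, 200)
print(outlier_depth, typical_depth)
```

The isolated value needs fewer splits on average than the value surrounded by neighbors, which is the signal an Isolation Forest turns into an anomaly score.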

Code - Hands-on

A code example using scikit-learn's IsolationForest might look like this:

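from sklearn.ensemble import IsolationForest

# Historical observations: one row per record, one feature per row
hist_observations = [[700.1], [450], [100], [200], [140], [250], [200], [320]]
# New observations to check against the fitted model
new_observations = [[0.1], [0], [90], [195], [800], [122]]

# Fit the model on the historical data only
clf = (
    IsolationForest(random_state=10, contamination='auto')
    .fit(hist_observations)
)

# Returns -1 for anomalous rows and 1 for normal rows
clf.predict(new_observations)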

This code creates an Isolation Forest model in Python. We fit the model to a dataset of historical observations (hist_observations) and then predict anomalies in a set of new observations (new_observations). The model returns -1 for observations it considers anomalous and 1 for normal ones.

Note that, to stop attacks as they happen, new observations could also be received and scored in real time.
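The real-time idea might be sketched as scoring one observation at a time with a model fitted in advance. The helper is_anomalous and the generator incoming_events are hypothetical names for this example; in practice the events would come from a queue, webhook, or stream rather than a hard-coded list.

```python
from sklearn.ensemble import IsolationForest

hist_observations = [[700.1], [450], [100], [200], [140], [250], [200], [320]]
clf = IsolationForest(random_state=10, contamination="auto").fit(hist_observations)


def is_anomalous(value):
    """Return True if a single incoming value is flagged by the fitted model."""
    # predict expects a 2-D input: one row per observation, one column per feature.
    return bool(clf.predict([[value]])[0] == -1)


def incoming_events():
    """Hypothetical stand-in for a live feed of observations."""
    yield from [0.1, 195, 800]


# Score each event as it arrives and keep the ones the model flags.
flagged = [value for value in incoming_events() if is_anomalous(value)]
print(flagged)
```

Fitting once and reusing the model keeps per-event scoring cheap; how often to refit on fresh history is a separate design decision.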

Running clf.predict(new_observations) gives the result below: every new record except the value 195 is flagged as anomalous, that is, very different from what was observed historically.

# Input [[0.1], [0], [90],[195], [800],[122]]

# Output of clf.predict(new_observations)
array([-1, -1, -1, 1, -1, -1])
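Beyond the -1/1 labels, it can help to look at how strongly each record is flagged. The sketch below uses scikit-learn's decision_function, where negative scores correspond to the records predict labels as -1. The ranking step and the variable names are illustrative choices, not part of the original example.

```python
from sklearn.ensemble import IsolationForest

hist_observations = [[700.1], [450], [100], [200], [140], [250], [200], [320]]
new_observations = [[0.1], [0], [90], [195], [800], [122]]

clf = IsolationForest(random_state=10, contamination="auto").fit(hist_observations)

# Negative scores are treated as outliers, non-negative scores as inliers.
scores = clf.decision_function(new_observations)
labels = clf.predict(new_observations)

# Rank the new observations from most to least anomalous.
ranked = sorted(zip(scores, new_observations), key=lambda pair: pair[0])
for score, obs in ranked:
    print(f"value={obs[0]:>6} score={score:+.3f}")
```

A continuous score lets you review the most suspicious records first, or apply a stricter or looser cutoff than the default.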

Closing thoughts

The Isolation Forest model is a useful tool for identifying the most prominent anomalous behaviors in online businesses, but it should be used together with other security solutions, since the algorithm will not detect every type of fraud or malicious activity.

In conclusion, identifying and dealing with anomalous behavior is essential in e-commerce to avoid financial loss and damage to the company's reputation. The Isolation Forest algorithm is a viable approach to this problem, helping companies spot potential fraud and protect their business against malicious behavior.

Create connections

Did you like the content? Let’s have a coffee: add me on LinkedIn to exchange ideas and share knowledge!

https://www.linkedin.com/in/iagombrandao
