Market Abuse, Recommendation Systems and Anomaly Detection

RecSys AD MA

As the first thread of our blog, we start with a rather ambitious project: applying Recommendation Systems techniques to build an anomaly detection tool. The approach is quite general, but here we focus on our first experiment, in the Market Abuse field. This first post of the series briefly lays out the problem.


24 September 2020

Anomaly Detection with Recommendation Systems /2

RecSys for MAD: an empirical study

RecSys

In this second episode of the series we introduce the ‘real-world’ dataset we have been working with. In particular, we discuss the foremost step of the hyperparameter selection phase: the mapping we adopted to feed this dataset into a RecSys.


02 December 2020

Anomaly Detection with Recommendation Systems /3

Evaluation of a RecSys

RecSys

The next steps are tuning the remaining hyperparameters and training the model. But how can we establish the soundness and accuracy of the calibration outcome? The ROC curve, the AUC and precision/recall at k are the standard metrics for this purpose.
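As a rough illustration of two of the metrics named here, the following sketch computes precision/recall at k and the area under the ROC curve in plain Python. The function names and the toy `scores`/`labels` data are made up for this example and are not taken from the posts. The area under the curve is computed through its pairwise interpretation: the probability that a randomly chosen positive item is scored above a randomly chosen negative one.

```python
from typing import Sequence, Tuple


def precision_recall_at_k(
    scores: Sequence[float], labels: Sequence[int], k: int
) -> Tuple[float, float]:
    """Precision and recall computed over the k highest-scored items."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank item indices by descending score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    total_relevant = sum(labels)
    precision = hits / k
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative item")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: model scores and binary relevance labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]

print(precision_recall_at_k(scores, labels, k=2))  # (0.5, 0.5)
print(roc_auc(scores, labels))
```

The pairwise form is quadratic in the number of items, which is fine for a small example; for real datasets a library routine would be the usual choice.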


15 January 2021

Anomaly Detection with Recommendation Systems /4

Evaluation of a RecSys as Anomaly Detector

RecSys AD

Using a RecSys as an anomaly detector undermines the possibility of a supervised-learning approach, so traditional mean average precision metrics are not enough. But even before that, we had to face the problem of evaluating whether the fit had converged.


22 January 2021

Anomaly Detection with Recommendation Systems /5

Universal Anomaly Score

RecSys AD

In this post we show some examples of anomaly rank results, highlighting the need for a transformation of these values that makes the score interpretable in a universal way, independent of the specific scale and shape of each RecSys outcome.
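To make the idea of a scale-independent score concrete, here is a minimal sketch of one generic option, an empirical percentile transform. This is an assumption chosen for illustration, not necessarily the transformation used in the post; `percentile_scores`, `model_a` and `model_b` are hypothetical names and data.

```python
from bisect import bisect_right
from typing import List, Sequence


def percentile_scores(raw: Sequence[float]) -> List[float]:
    """Map each raw value to the fraction of values less than or equal to it."""
    if not raw:
        return []
    ordered = sorted(raw)
    n = len(ordered)
    return [bisect_right(ordered, x) / n for x in raw]


# Two hypothetical model outputs with very different scales and shapes.
model_a = [0.1, 0.5, 0.2, 9.0]
model_b = [100.0, 300.0, 150.0, 301.0]

print(percentile_scores(model_a))  # [0.25, 0.75, 0.5, 1.0]
print(percentile_scores(model_b))  # [0.25, 0.75, 0.5, 1.0]
```

Both outputs land on the same scale in (0, 1] because the transform depends only on the ordering of the values, not on their magnitude. The price is that it discards how far apart the raw values are.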


29 January 2021

Anomaly Detection with Recommendation Systems /6

RecSys for MAD: backtesting results

RecSys AD

As a final review of our experiment, we performed a backtesting analysis (of sorts, since we are in an unsupervised learning setup). Here we present the main outcomes, along with an attempt to statistically inspect the resulting top anomalies.
