Anomaly Detection in Time Series


1.1 Anomaly detection in real-time streaming data

Detecting anomalies in real-time streaming data has attracted growing interest in the machine learning community because of its many practical industrial applications, for example in finance, IT, security, healthcare, energy, e-commerce and social media. More specifically, it has been applied successfully to preventative maintenance, fraud detection, fraud prevention and monitoring. Applying ML algorithms to these problems is becoming increasingly feasible thanks to the dramatic increase in the availability of streaming time series data, largely driven by the rise of the Internet of Things paradigm over the last few years.

But what is an anomaly? One possible definition is that an anomaly is a point in time where the behavior of the system we are observing is unusual and significantly different from its past behavior. However, the standard (normal) behavior can be difficult to define because of the large number of exogenous factors that can influence it.

There exist two different types of anomalies:

  • Spatial anomaly, where a value lies outside the typical range.
  • Temporal anomaly, where a value lies inside the typical observed range, but occurs in a sequence that is unusual.
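
To make the distinction concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the function names, the fixed range [low, high] and the idea of flagging value-to-value transitions never seen in a reference sequence are ours, and real detectors are considerably more sophisticated.

```python
from collections import Counter


def spatial_anomalies(values, low, high):
    """Return the indices of values that fall outside the range [low, high]."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def temporal_anomalies(values, reference):
    """Return the indices of values whose transition (previous -> current)
    never occurs in the reference sequence, regardless of their magnitude."""
    seen = Counter(zip(reference, reference[1:]))
    return [
        i
        for i in range(1, len(values))
        if seen[(values[i - 1], values[i])] == 0
    ]


# Hypothetical data: "normal" behavior is the repeating pattern 1, 2, 3.
reference = [1, 2, 3] * 5
stream = [1, 2, 3, 1, 2, 2, 3, 1, 2, 9]

print(spatial_anomalies(stream, low=1, high=3))
print(temporal_anomalies(stream, reference))
```

In this toy stream, the value 9 at index 9 is a spatial anomaly because it lies outside the range 1 to 3. The second 2 in a row (index 5) is perfectly in range, yet it is a temporal anomaly, because a 2 following a 2 never occurs in the reference pattern. The 9 is also flagged by the temporal check, since the transition from 2 to 9 is equally unseen.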

These definitions already suggest why anomaly detection in streaming applications is a challenging task. Data must often be processed in real time, and its volume is usually far too large to be inspected by humans. Moreover, the underlying system is often non-stationary, so detectors must continuously learn and adapt to changing statistics. It is therefore necessary to operate in an unsupervised, automated fashion.

1.2 Machine Learning techniques for Anomaly Detection

The above discussion of the general features of streaming data applications allows us to outline a set of requirements for an anomaly detection algorithm:

  • it should process data in real-time;
  • it must learn continuously;
  • it must run unsupervised and automatically;
  • it must adapt to dynamic environments;
  • it should detect anomalies as early as possible.
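
As a rough illustration of what these requirements mean in practice, the sketch below processes one value at a time, needs no labels, and keeps updating its statistics so that it follows a drifting signal. It is a deliberately simple stand-in based on an exponentially weighted mean and variance, not the technique adopted later in this post. The class name and the parameters `alpha`, `threshold` and `warmup` are assumptions chosen for the example.

```python
import math


class StreamingZScoreDetector:
    """Toy online detector: scores each value against running statistics."""

    def __init__(self, alpha=0.05, threshold=4.0, warmup=10):
        self.alpha = alpha          # forgetting factor: higher adapts faster
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # samples to observe before flagging
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, x):
        """Score x against the current statistics, then learn from it.

        Returns a (score, is_anomaly) tuple.
        """
        if self.count == 0:
            # First sample: nothing to compare against yet.
            self.mean = x
            self.count = 1
            return 0.0, False

        deviation = x - self.mean
        std = math.sqrt(self.var)
        score = abs(deviation) / std if std > 0 else 0.0
        is_anomaly = self.count >= self.warmup and score > self.threshold

        # Exponentially weighted update: old data is gradually forgotten,
        # so the detector keeps adapting to non-stationary input.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.count += 1
        return score, is_anomaly


detector = StreamingZScoreDetector()
for value in [10.0, 10.5] * 25 + [30.0]:
    score, is_anomaly = detector.update(value)
    if is_anomaly:
        print(f"anomaly: value={value}, score={score:.1f}")
```

Note that a detector of this kind only looks at how far a value lies from the recent average, so it can catch spatial anomalies but not temporal ones.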

Standard Machine Learning (ML) techniques, such as clustering and classification, do not offer a systematic approach that fulfills all these requirements. For example, classification is a supervised method, and supervised techniques are typically unsuitable for anomaly detection, since they need labeled examples whereas the detector must run unsupervised. Moreover, clustering methods can detect only spatial anomalies.

Artificial Neural Networks (ANN), with their unsupervised pattern recognition, seem a better match for the task of anomaly detection, as there is no need for the user to define a static metric. In the context of time series analysis, Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks have shown better performance, especially in their ability to interpret a point in time by taking into account the temporal context in which it appears. In other words, they can learn the temporal pattern of a time series. Nevertheless, Artificial Neural Networks do not fulfill all the requirements sketched above. A neural network must be trained on a fixed batch of data, so it is not the best candidate for a real-time process, since it cannot learn continuously. A network trained in this way can be used only for a short time interval before it has to be retrained on another batch.

To overcome the technical problems described above, we opted for an anomaly detection technique inspired by known properties of cortical neurons, called Hierarchical Temporal Memory (HTM). HTM is a theoretical framework for sequence learning in the cortex. Its implementations operate in real time and have been shown to work well for prediction tasks. HTM networks learn continuously and model the spatio-temporal features of their input. For these reasons, we consider an HTM network the best candidate for implementing a streaming anomaly detector. We will describe how it works in the next blog post.