Online time data series pre-processing for the improved performance of anomaly detection methods
Conference object (Published version)
The number of automated measuring and reporting systems used in water distribution and sewer systems is increasing dramatically and, as a consequence, so is the volume of acquired data. Since real-time data are likely to contain some anomalous values and the probability of equipment malfunction is high, it is essential to equip the SCADA system with automatic procedures that detect problems and assist the user in monitoring and data management. A number of anomaly detection techniques and methods exist and can be used with varying success. Some of these techniques are, in some cases, applicable to online use (inspection of incoming data streams), but they are usually more suitable for offline data processing, since they require frequent expert involvement in parameter adjustment. The aim of this paper is to explore online and offline data pre-processing techniques that could be used to remove redundant information and reduce the total volume of acquired data whilst preserving all the data series features needed for anomaly detection. The paper explores the usefulness of different pre-processing techniques as a tool for improving anomaly detection methods. The methodology developed is tested on several sets of real-life data with different anomaly detection procedures, including statistical, model-based and data mining approaches. The results obtained demonstrate the effectiveness of the suggested methodology.