ShiftTree: An interpretable model-based approach for time series classification. (English)
Gunopulos, Dimitrios (ed.) et al., Machine learning and knowledge discovery in databases. European conference, ECML PKDD 2011, Athens, Greece, September 5‒9, 2011. Proceedings, Part II. Berlin: Springer (ISBN 978-3-642-23782-9/pbk). Lecture Notes in Computer Science 6912. Lecture Notes in Artificial Intelligence, 48-64 (2011).
Summary: Efficient algorithms for time series data mining share a common denominator: they exploit the special time structure of the attributes of a time series. To bring the information of the time dimension into the process, we propose a novel instance-level, cursor-based indexing technique combined with a decision tree algorithm. This is beneficial for several reasons: (a) it is insensitive to time-level noise (for example rendering, time shifting), (b) its working method can be interpreted, which makes the classification process easier to explain, and (c) it can handle time series of different lengths. The implemented algorithm, named ShiftTree, is compared to the well-known instance-based time series classifier 1-NN with different distance metrics on all 20 datasets of a public benchmark time series database and on two more public time series datasets. On these benchmark datasets, our experiments show that the new model-based algorithm has an average accuracy slightly better than that of the most efficient instance-based methods, and that there are multiple datasets where our model-based classifier exceeds the accuracy of instance-based methods. We also evaluated our algorithm via blind testing on the 20 datasets of the SIGKDD 2007 Time Series Classification Challenge. To improve model accuracy and to avoid overfitting, we provide forest methods as well.
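
Illustration: the summary does not specify how the cursor is moved or which conditions the tree nodes test, so the following is only a minimal sketch of the general idea of a per-instance cursor feeding a decision node, not the ShiftTree algorithm itself. The function names (shift_to_global_max, value_at, node_split), the choice of "jump to the global maximum" as the shift operator, and the threshold test are all assumptions made for this example.

```python
from typing import Sequence


def shift_to_global_max(series: Sequence[float], cursor: int) -> int:
    """Hypothetical shift operator: move the cursor to the first
    occurrence of the largest value at or after its current position."""
    tail = series[cursor:]
    # max() with a key returns the first index among ties.
    return cursor + max(range(len(tail)), key=lambda i: tail[i])


def value_at(series: Sequence[float], cursor: int) -> float:
    """Hypothetical condition operator: read the value under the cursor."""
    return series[cursor]


def node_split(series: Sequence[float], threshold: float) -> str:
    """One decision node (an assumed form): each instance gets its own
    cursor, the cursor is shifted, and the branch is chosen by comparing
    the value found there with a threshold."""
    cursor = 0  # every instance starts with its cursor at the beginning
    cursor = shift_to_global_max(series, cursor)
    return "left" if value_at(series, cursor) < threshold else "right"


# The same peak, once early in a short series and once later in a longer one.
early_peak = [0.0, 1.0, 5.0, 1.0, 0.0]
late_peak = [0.0, 0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0]
low_peak = [0.0, 1.0, 2.0, 1.0, 0.0]

print(node_split(early_peak, 3.0), node_split(late_peak, 3.0), node_split(low_peak, 3.0))
```

Because the cursor is positioned relative to the shape of each series rather than at a fixed time index, the two series with the same peak take the same branch although they differ in length and in the position of the peak, which is the kind of behaviour points (a) and (c) of the summary describe. The sequence of operators applied at a node can also be read off directly, which relates to point (b).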