Abstract
Data mining in medical databases often involves comparing time series that represent the evolution of physiological variables. Temporal misalignment of these variables can conceal patterns and trends shared between different patients. To address this problem, this paper proposes the mixed fuzzy clustering (MFC) algorithm with the dynamic time warping (DTW) distance. We developed the MFC algorithm by i) incorporating the DTW distance into the standard fuzzy c-means to handle misaligned time series; ii) introducing a new dimension into the spatio-temporal clustering algorithm in order to handle *P* time-variant features; and iii) incorporating unsupervised learning of cluster-dependent attribute weights. The algorithm is designed to cluster time-variant and time-invariant data simultaneously. We demonstrate the advantages of the proposed algorithm on four synthetic datasets and in two real-world applications in intensive care units. The first application is the classification of patients who will need vasopressors; the second is the classification of patients at high risk of mortality. Time-variant features consist of physiological variables collected with different sampling rates at different points in time. Time-invariant features consist of patients' demographics and score records. Performance is evaluated using cluster validity measures, which show that the proposed algorithm outperforms fuzzy c-means.
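To illustrate why DTW suits misaligned series, the following minimal Python sketch compares the classic DTW recurrence with a pointwise Euclidean distance. It is not the authors' implementation: the function names `dtw_distance` and `euclidean_distance`, the absolute-difference local cost, and the absence of a warping-window constraint are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched to an already used b[j-1]
                cost[i][j - 1],      # b[j-1] matched to an already used a[i-1]
                cost[i - 1][j - 1],  # both indices advance together
            )
    return cost[n][m]


def euclidean_distance(a, b):
    """Pointwise Euclidean distance; requires sequences of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two series with the same shape, shifted by one time step.
series_a = [0, 0, 1, 2, 1, 0]
series_b = [0, 1, 2, 1, 0, 0]

print(dtw_distance(series_a, series_b))        # 0.0
print(euclidean_distance(series_a, series_b))  # 2.0
```

In this toy case the two series differ only by a one-step shift, so the warping path absorbs the shift and DTW reports zero distance, whereas the pointwise distance penalizes it. Because the recurrence does not require equal lengths, the same function also accepts series sampled at different rates.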