Paper abstract

Clustering Distributed Sensor Data Streams

Pedro Rodrigues-Pereira - University of Porto, Portugal
João Gama - University of Porto, Portugal
Luís Lopes - University of Porto, Portugal

Session: Clustering 2
Springer Link:

In this work we study the problem of continuously maintain a cluster structure over the data points generated by a sensor network. We propose DGClust, a new distributed algorithm which reduces both the dimensionality and the communication burdens, by allowing each local sensor to keep an online discretization of its data stream. Each new data point triggers a cell in this univariate grid, reflecting the current state of the data stream at the local site. Whenever a local site changes its state, it notifies the central server about the new state it is in. The central site keeps a small list of counters of the most frequent global states. A simple adaptive partitional clustering algorithm is applied to the frequent states central points, providing an anytime definition of the clusters centers. The approach is evaluated in the context of distributed sensor networks, presenting empirical and theoretical evidence of its advantages.