Paper abstract

Classification of Multi-Labeled Data: A Generative Approach

Andreas P. Streich - Institute of Computational Science, ETH Zurich, Switzerland
Joachim M. Buhmann - Institute of Computational Science, ETH Zurich, Switzerland

Session: Classification 1

Multi-label classification assigns a data item to one or several classes. This problem arises in fields such as acoustic and visual scene analysis, news reports, and medical diagnosis. In a generative framework, data with multiple labels can be interpreted as additive mixtures of emissions of the individual sources. We propose a deconvolution approach to estimate the individual contribution of each source to a given data item. Similarly, the distributions of multi-label data are computed from the source distributions. In experiments with synthetic data, the novel approach is compared to existing models and yields more accurate parameter estimates, higher classification accuracy, and improved generalization to previously unseen label sets. These improvements are most pronounced on small training data sets. On real-world acoustic data, the algorithm also outperforms other generative models, in particular on small training data sets.
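
The additive-mixture idea can be illustrated with a minimal sketch. It assumes independent Gaussian sources, which the abstract does not state; the source names, parameter values, and the helper functions `label_set_distribution` and `deconvolve` are hypothetical and are not taken from the paper. Under this assumption, the distribution of a multi-label item is the convolution of the source distributions (means and covariances add), and the expected contribution of each source follows from standard Gaussian conditioning.

```python
import numpy as np

# Hypothetical source parameters (illustrative values, not from the paper).
means = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 2.0])}
covs = {"a": np.eye(2) * 0.5, "b": np.eye(2) * 1.5}


def label_set_distribution(labels):
    """Mean and covariance of the sum of independent Gaussian emissions.

    Assumption: each source emits an independent Gaussian vector, so the
    sum over the label set is Gaussian with summed means and covariances.
    """
    mean = sum(means[k] for k in labels)
    cov = sum(covs[k] for k in labels)
    return mean, cov


def deconvolve(x, labels):
    """Expected contribution of each source, given the observed sum x.

    Uses Gaussian conditioning:
    E[s_k | x] = mu_k + Sigma_k Sigma^{-1} (x - mu),
    where mu and Sigma describe the summed emission.
    """
    mean, cov = label_set_distribution(labels)
    residual = np.linalg.solve(cov, x - mean)
    return {k: means[k] + covs[k] @ residual for k in labels}


# Usage: split an observation labeled {"a", "b"} into per-source parts.
x = np.array([2.0, 4.0])
parts = deconvolve(x, ("a", "b"))
```

In this sketch the estimated contributions always sum back to the observation, and a source with larger variance absorbs a larger share of the deviation from the expected sum.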