TITLE: Signal Detection using ICA: Application to Chat Room Topic Spotting
AUTHORS: Thomas Kolenda, Lars Kai Hansen and Jan Larsen
Informatics and Mathematical Modelling, Building 321
Technical University of Denmark, DK-2800 Lyngby, Denmark
emails: asz,jl,lkhansen@imm.dtu.dk
www: http://eivind.imm.dtu.dk
ABSTRACT:
Signal detection and pattern recognition for online grouping of huge amounts
of data and for retrospective analysis are becoming increasingly
important as knowledge-based standards, such as XML and advanced
MPEG, gain popularity. Independent component analysis (ICA) can be used to
both cluster and detect signals with weak a priori assumptions in
multimedia contexts. ICA of real-world data is typically performed
without knowledge of the number of non-trivial independent
components; hence, it is of interest to test hypotheses concerning
the number of components or simply to test whether a given set of
components is significant relative to a ``white noise'' null
hypothesis. It was recently proposed to use the so-called Bayesian
information criterion (BIC) approximation to estimate the
probabilities of such competing hypotheses. Here, we apply this
approach to the understanding of chat. We show that ICA can detect
meaningful context structures in a chat room log file.
Submission for ICA'2001, San Diego, USA, December 9-13, 2001.