Sparse Coding for Efficient Bioacoustic Data Mining: Preliminary Application to Analysis of Whale Songs
Is part of
2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015, p.780-787
Place / Publisher
IEEE
Year of publication
2015
Source
IEEE Electronic Library (IEL)
Descriptions/Notes
Bioacoustic monitoring, such as surveys of animal populations and migration, needs efficient data mining methods to extract information from large datasets covering multi-year and multi-location recordings. This paper introduces a method for sparse coding of bioacoustic recordings in order to efficiently compress the data and automatically extract patterns from it. We demonstrate the proposed method on the analysis of humpback whale songs. Previous work suggests that the structure of these songs can be characterized by successive vocalizations called sound units. Most of these analyses are currently done with expert intervention, but the volume of recordings drives the need for automated methods for sound unit classification. This paper proposes that sparse coding of the song at different time scales supports the distinction of stable song components from those that evolve from year to year. The approach is summarized as follows: first, an unsupervised method is used to encode the entire bioacoustic dataset into a dictionary; second, sparse coding is used to limit the number of elements in the dictionary; third, salient features are identified using the Lasso algorithm; and finally, an interpretation of the evolving and stable components of the songs is derived, supporting an analysis of year-to-year variation. It is shown that shorter codes are more stable, occurring with similar frequency across two consecutive years, while the occurrence of longer units varies across years, as expected from the prior manual analysis. Segments of 250 ms appear to be an appropriate length for encoding stable features of whale songs, possibly corresponding to subunits. We conclude by exploring further possibilities for applying this method to biopopulation analysis.
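The sparse coding step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the dictionary is learned or which Lasso solver is used, so the sketch assumes a random unit-norm dictionary as a stand-in for a learned one, a synthetic signal in place of a whale song segment, and a plain iterative soft-thresholding (ISTA) solver. All names (`lasso_sparse_code`, `D`, `lam`) and all sizes are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal operator of t*||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(x, D, a, lam):
    """Lasso objective: 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))


def lasso_sparse_code(x, D, lam=0.1, n_iter=500):
    """Approximately minimize the Lasso objective over the code a with ISTA."""
    # Squared spectral norm of D: Lipschitz constant of the gradient
    # of the quadratic term, used to set a safe step size 1/L.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)                   # gradient of 0.5*||x - D a||^2
        a = soft_threshold(a - grad / L, lam / L)  # gradient step, then shrinkage
    return a


# Assumption: a random dictionary stands in for one learned from recordings.
rng = np.random.default_rng(0)
n_features, n_atoms = 32, 64
D = rng.standard_normal((n_features, n_atoms))
D /= np.linalg.norm(D, axis=0)  # normalize each atom (column) to unit length

# Assumption: a synthetic segment built from three atoms plus small noise.
true_code = np.zeros(n_atoms)
true_code[[3, 17, 42]] = [1.5, -2.0, 1.0]
x = D @ true_code + 0.01 * rng.standard_normal(n_features)

lam = 0.1
code = lasso_sparse_code(x, D, lam=lam)
active = np.flatnonzero(np.abs(code) > 1e-6)  # indices of atoms actually used
print("active atoms:", active)
```

In a setting like the one the abstract describes, each segment of a recording would presumably get its own code, and the frequency with which each atom is active could then be compared across years. The penalty weight `lam` controls how many atoms are used per segment; its value here is arbitrary.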