Semantic from Audio and Genre Classification for Music
This eTeam will collaborate on different methods for audio feature
extraction and their application in both supervised classification and unsupervised organization as a means to access and explore audio holdings such as sound archives or, particularly, music. The eTeam
partners have strong expertise in extracting descriptors from audio
data, specializing in music, instruments and other sounds. Moreover,
there is also expertise in text mining, and therefore textual genre
analysis will be combined with the audio-based approaches.
Furthermore, the partners have core competencies in the application of machine learning techniques for the analysis and structuring of content, and its subsequent visualization in 2D as well as 3D environments.
The eTeam will be a concentrated effort on:
- feature extraction from audio
- music classification (based on audio, symbolic
notations, text mining and combined approaches)
- instrument or sound classification
- organization of music/sound by perceptual similarity
- visualizations and access interfaces for the exploration of sound collections in 2D and 3D
Furthermore, the feature sets evaluated in the classification
activities will be employed in unsupervised machine learning tasks in
order to provide an automatic clustering of audio archives, which in
turn serves as an interface for browsing and exploration. eTeam
partners will bring in expertise in visualization, providing intuitive
interfaces for both 2D visualizations and interactive 3D
environments for future access models to audio archives.
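The clustering step described above can be sketched as follows. This is a minimal illustration only: it assumes each recording has already been reduced to a numeric feature vector, and it uses k-means with deterministic farthest-point seeding as a stand-in, since this description does not name the clustering method the eTeam uses. The archive entries, feature values and function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def initial_centroids(vectors, k):
    """Deterministic seeding: start with the first vector, then repeatedly
    add the vector that is farthest from its nearest chosen centroid."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(vectors, k, iterations=20):
    """Plain k-means: alternate between assigning each vector to its
    nearest centroid and moving each centroid to the mean of its members."""
    centroids = initial_centroids(vectors, k)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels, centroids


# Hypothetical archive: recording name -> made-up 2-dimensional feature vector.
archive = {
    "track_a1": [0.10, 0.20],
    "track_a2": [0.15, 0.25],
    "track_a3": [0.12, 0.18],
    "track_b1": [0.90, 0.80],
    "track_b2": [0.85, 0.90],
    "track_b3": [0.95, 0.85],
}

labels, _ = kmeans(list(archive.values()), k=2)
for name, label in zip(archive, labels):
    print(name, "-> cluster", label)
```

The cluster labels produced this way are what a browsing interface would group or lay out; any other clustering method that maps feature vectors to groups or positions could be substituted.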
Results of the eTeam will be:
- Evaluation of different feature sets for the
classification of music, instruments and other sounds
- Evaluation of multi-modal approaches (combining textual,
symbolic and audio features)
- Evaluation of performance of those approaches on different
- Evaluation of the feature sets on unsupervised machine learning tasks
- Visualization of audio archives by
application of the feature sets
- Development of prototypes of interactive applications (2D +
3D) for novel browsing methods for audio archives
- Jointly written publications
- Exchanges between the eTeam partners
Contribution of partners
- TU Wien - IFS (A. Rauber, T. Lidy)
We pursue a number of activities in the field of audio analysis,
particularly feature extraction for audio retrieval, mostly for genre
classification. We are participating in the annual ISMIR MIREX
contest in these disciplines. Moreover, we are also active in text
mining and investigate approaches for textual genre analysis through
the use of web data.
Another focus topic, which would allow
some overlap and integration with the User Interface WP, is interfaces
to audio collections, based on such extracted features and subsequent
machine learning. We are collaborating on interfaces to audio
archives with the EC3 group.
- EC3 (H. Berger, M. Dittenbach)
The EC3 competence center is active in competence fields such as
information logistics (organisation of web-based distributed processes,
especially web services and interoperable solutions for the design of
virtual enterprises), structured content organisation
(automated language processing for the analysis and design of
natural-language user interfaces as well as complex
information systems) and customer and business analysis.
The competence center has expertise in knowledge management and
It is specifically active in the development of applications for browsing,
interaction and retrieval from document archives, where a core focus is on the development of 3D environments for the exploration of
text or audio archives.
- AIIA - AUTH (C. Kotropoulos,
The activity concerns the automatic classification of musical instruments from
isolated tones and sound segments by extracting timbral and MPEG-7
Audio features using non-negative matrix factorization (NMF). A joint
publication on supervised classifiers based on
non-negative matrix factorization, evaluating two different
feature sets, has been written together with TU Wien.
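For readers unfamiliar with NMF, the following is a minimal sketch of the plain factorization V ≈ WH with non-negative factors, using the standard multiplicative update rules for the squared-error objective. It is not the classifier from the joint publication; the toy matrix, the rank and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    )


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize a non-negative matrix V (n x m) into W (n x rank) and
    H (rank x m) with multiplicative updates. Both factors start positive
    and are only ever multiplied by non-negative ratios, so they stay
    non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        numer = matmul(wt, v)
        denom = matmul(matmul(wt, w), h)
        h = [
            [h[i][j] * numer[i][j] / (denom[i][j] + eps) for j in range(m)]
            for i in range(rank)
        ]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        numer = matmul(v, ht)
        denom = matmul(w, matmul(h, ht))
        w = [
            [w[i][j] * numer[i][j] / (denom[i][j] + eps) for j in range(rank)]
            for i in range(n)
        ]
    return w, h


# Made-up non-negative data matrix (4 rows, 3 columns) of rank 2.
v = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [1.0, 0.0, 1.0],
    [3.0, 2.0, 5.0],
]
w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

In a classification setting, one common pattern is to learn such a factorization from training feature vectors and to compare new recordings in the reduced space; whether the publication does exactly that is not stated in this description.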
- evaluation of feature sets through the MIREX evaluation
- joint publication of AIIA - AUTH and TU Wien - IFS at the 14th European Signal Processing Conference:
Emmanouil Benetos, Constantine Kotropoulos, Thomas
Lidy, Andreas Rauber: "Testing supervised classifiers
based on non-negative matrix factorization to musical instrument
classification". Proceedings of the 14th
European Signal Processing Conference, Florence, Italy,
September 4-8, 2006
- collaboration between EC3 and TU Wien - IFS on
interactive interfaces to audio collections
Tentative plan of activities
The eTeam fosters collaboration
between the participants and the exchange of know-how in the different
domains and areas of expertise described.
- Collection of feature sets used in the eTeam:
The result of the eTeam in this first stage would thus provide a
state-of-the-art collection of the capabilities and competences
on audio feature extraction.
The feature extraction algorithms will be run on the test databases.
For textual genre analysis and combined approaches, additional data has
to be acquired at this step.
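To illustrate what running a feature extractor over audio data involves, here is a minimal sketch that computes two generic frame-level descriptors, RMS energy and zero-crossing rate, and averages them into one vector per signal. These two descriptors, the frame and hop sizes, the synthetic sine signals and all function names are assumptions for illustration; the feature sets actually used by the partners are not listed in this description.

```python
import math


def frames(signal, size, hop):
    """Split a signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def rms(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def extract_features(signal, size=512, hop=256):
    """Return [mean RMS, mean zero-crossing rate] over all frames."""
    per_frame = [(rms(f), zero_crossing_rate(f)) for f in frames(signal, size, hop)]
    count = len(per_frame)
    return [
        sum(r for r, _ in per_frame) / count,
        sum(z for _, z in per_frame) / count,
    ]


def sine(freq, sample_rate=8000, seconds=0.5, amplitude=0.5):
    """Synthetic test tone standing in for a recording from a test database."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq * t / sample_rate) for t in range(n)]


low_tone = extract_features(sine(110.0))
high_tone = extract_features(sine(1760.0))
print("110 Hz tone :", low_tone)
print("1760 Hz tone:", high_tone)
```

The higher tone changes sign more often per frame, so its zero-crossing rate is larger, while both tones have similar RMS energy because they share the same amplitude.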
Extracted audio features will be tested on several classification
methods. There will be a separate evaluation for each discipline: music
classification, instrument classification and classification of other
sounds. Results will indicate both algorithmic and computational
performance and will be published on the eTeam web site.
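One way to report both kinds of performance for a feature set is to measure classification accuracy and wall-clock time in the same run. The sketch below does this with a nearest-centroid classifier under leave-one-out evaluation. The classifier, the evaluation protocol, the labels and the feature values are all assumptions chosen to keep the example short; they are not the eTeam's evaluation setup.

```python
import time


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(samples):
    """Build one centroid per label from (vector, label) pairs."""
    by_label = {}
    for vector, label in samples:
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, vector):
    """Return the label whose centroid is nearest to the vector."""
    return min(
        model,
        key=lambda label: sum((x - c) ** 2 for x, c in zip(vector, model[label])),
    )


def leave_one_out(samples):
    """Hold out each sample once, train on the rest, and report
    (accuracy, elapsed seconds) for the whole procedure."""
    start = time.perf_counter()
    correct = 0
    for i, (vector, label) in enumerate(samples):
        model = train(samples[:i] + samples[i + 1:])
        if predict(model, vector) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    return correct / len(samples), elapsed


# Made-up feature vectors with hypothetical genre labels.
samples = [
    ([0.10, 0.20], "genre_a"),
    ([0.15, 0.25], "genre_a"),
    ([0.12, 0.18], "genre_a"),
    ([0.90, 0.80], "genre_b"),
    ([0.85, 0.90], "genre_b"),
    ([0.95, 0.85], "genre_b"),
]

accuracy, seconds = leave_one_out(samples)
print(f"accuracy: {accuracy:.2f}, time: {seconds:.6f} s")
```

Running the same procedure once per feature set and per classifier yields a table of accuracy and runtime figures that can be compared directly.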
with multi-modal approaches:
Combined approaches including textual, symbolic and audio features will
be evaluated on the same classifiers as the plain audio features.
Results will be published on the eTeam web site.
- Clustering of
The test databases used in classification will be employed together
with the evaluated feature sets for unsupervised clustering, using
- Visualization of audio archives:
Based on the clustered archives, a number of different visualizations
(views) of each of the archives (music collections, instrument
archives, etc.) will be created and published on the eTeam web site.
- Prototype of
A prototype for access to the audio archives based on the clusterings
and visualizations will be created, suitable for browsing and access
through a web browser.
A prototype of an interactive application will be developed, providing
interaction with audio archives in a 3D environment.
eTeam activities will be
supported by the exchange of researchers between the eTeam institutions as
well as by jointly written publications (conference papers, articles).
Dept. of Software Technology and Interactive Systems
Vienna Univ. of Technology
Favoritenstr. 9 - 11 / 188
A - 1040 Wien