Music representations: Challenges
From MIReS
- Investigate more musically meaningful features and representations. There is still a significant semantic gap between the representations used in MIR and the concepts and language of musicians and audiences. In particular, many of the abstractions used in MIR do not make sense to a musically trained user, as they ignore or are unable to capture essential aspects of musical communication. The challenge of designing musically meaningful representations must be overcome in order to build systems that provide a satisfactory user experience. This is particularly the case for automatically generated features, such as those learned with deep learning techniques, where the difficulty lies in creating features that are well suited to MIR tasks yet still interpretable by humans.
- Develop more flexible and general representations. Many representations are limited in scope and thus constrained in their expressive possibilities. For example, most representations have been created specifically for describing Western tonal music. Although highly constrained representations might provide advantages in terms of simplicity and computational complexity, their narrow scope means that new representations have to be developed for each new task, which inhibits rapid prototyping and testing of new ideas. Thus there is a need to create representations and abstractions which are sufficiently adaptable, flexible and general to cater for the full range of music styles and cultures, as well as for unforeseen musical tasks and situations.
- Determine the most appropriate representation for each application. For some use cases it is not beneficial to use the most general representation, as domain- or task-specific knowledge might aid the analysis and interpretation of data. However, there is no precise methodology for developing or choosing representations, and existing "best practice" covers only a small proportion of the breadth of musical styles, creative ideas and contexts for which representations might be required.
- Unify formats and improve system interoperability. The wealth of different standards and formats creates a difficulty for service providers who wish to create seamless systems with a high degree of interoperability with other systems and for researchers who want to experiment with software and data from disparate sources. By encouraging the use of open standards, common platforms, and formats that promote semantic as well as syntactic interoperability, system development will be simpler and more efficient.
- Extend the scope of existing ontologies. Existing ontologies cover only a small fraction of musical terms and concepts, so an important challenge is to extend these ontologies to describe all types of music-related information, covering diverse music cultures, communities and styles. These ontologies must also be linked to existing ontologies within and outside of the MIR community in order to gain maximum benefit from the data which is structured according to the ontologies.
- Create compact representations that can be efficiently used for large-scale music analysis. It is becoming increasingly important that representations facilitate processing of the vast amounts of music data that exist in current and future collections, for example, by supporting efficient indexing, search and retrieval of music data.
- Develop and integrate representations for multimodal data. In order to facilitate content-based retrieval and browsing applications, representations are required that enable comparison and combination of data from diverse modalities, including audio, video and gesture data.
Automatic feature learning looks like a challenge for music representation to me:
"Moving Beyond Feature Design: Deep Architectures and Automatic Feature Learning in Music Informatics" by Eric J. Humphrey, Juan P. Bello, Yann LeCun
http://ismir2012.ismir.net/event/papers/403-ismir-2012.pdf
See also the corresponding Late-Breaking Session: (Deep) Feature Learning
https://ismir2012.wikispaces.com/Wrap-up+Feature+Learning
Posted by ArthurF on 17:01, 31 October 2012 (UTC)