Automatic Music Feature Extraction, Classification and Annotation

A huge amount of music is now available on the Internet and on digital devices. For example, an iPod with 60 GB of storage can be purchased at a reasonable price and can hold a personal collection of many thousands of music pieces. Online stores such as mp3.com let customers select and buy music from a very large catalogue.

These large digital music collections need to be classified and annotated effectively so that users can find relevant pieces quickly. Music classification refers to dividing music into broad classes such as genre, while music annotation refers to providing a more detailed description of the music, commonly using emotion terms such as “fast and exciting”.

In current practice, music classification is usually a manual, time-consuming process. Over the past few years, a growing amount of research has addressed automatic music classification. The process involves two main stages. The first stage extracts key low-level music features. The second stage classifies the music based on the extracted features.
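The two stages can be illustrated with a minimal sketch. It is not the method used in this project: the features (zero-crossing rate and RMS energy), the nearest-centroid classifier, the synthetic sine-tone "clips", and the class labels "calm" and "energetic" are all assumptions chosen to keep the example small and self-contained.

```python
import math

def extract_features(samples):
    """Stage 1: compute low-level statistics of the sample values."""
    n = len(samples)
    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (n - 1)
    # Root-mean-square energy of the clip.
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return (zcr, rms)

def train_centroids(labelled_features):
    """Stage 2 (training): average the feature vectors of each class."""
    sums = {}
    for label, feats in labelled_features:
        total, count = sums.get(label, ([0.0] * len(feats), 0))
        sums[label] = ([t + f for t, f in zip(total, feats)], count + 1)
    return {label: tuple(t / c for t in total) for label, (total, c) in sums.items()}

def classify(feats, centroids):
    """Stage 2 (prediction): choose the class whose centroid is nearest."""
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))

def tone(freq_hz, amplitude, rate=8000, seconds=0.25):
    """Hypothetical stand-in for a music clip: a pure sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate)
            for i in range(int(rate * seconds))]

# Hypothetical labelled training clips (labels are illustrative only).
training = [
    ("calm", tone(110, 0.3)),
    ("calm", tone(150, 0.2)),
    ("energetic", tone(1200, 0.8)),
    ("energetic", tone(1500, 0.9)),
]
centroids = train_centroids(
    [(label, extract_features(clip)) for label, clip in training]
)

print(classify(extract_features(tone(130, 0.25)), centroids))
print(classify(extract_features(tone(1300, 0.85)), centroids))
```

Both features here are plain statistics of the sample values, which is exactly the kind of low-level description that leaves a gap to what listeners actually perceive.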

The effectiveness of current music classification systems is hampered mainly by two issues. Firstly, effectiveness is judged by human perception and semantics, while the low-level music features are mostly statistics of music sample values. This creates a semantic gap between the low-level features and the high-level semantics that human beings perceive and understand. It is reasonable to assume that the more perceptual and meaningful the music features are, the more useful they will be for music classification and other applications. Secondly, the suitability of different classification methods and algorithms for music classification has not been studied thoroughly. A specific classification algorithm or configuration will likely be needed for effective music classification. According to the literature, the most useful characterizations of music are based on mood/emotion, genre, and similarity.

Music classes alone allow only very limited search capability. To support richer search, music pieces need to be annotated with more detailed descriptions. Manually annotating a large number of music pieces would be time-consuming for artists. It would therefore be useful to develop techniques that annotate music pieces automatically, using machine learning based on automatically extracted perceptual music features and a small number (e.g. 1000) of manually annotated pieces as training data.
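As a rough illustration of annotation from a small manually labelled set, the sketch below propagates tags from the nearest annotated pieces to a new piece. The k-nearest-neighbour voting scheme, the two-dimensional feature vectors, the tag vocabulary, and the names `ANNOTATED` and `annotate` are all hypothetical; the project description does not specify which learning technique or features are used.

```python
import math
from collections import Counter

# Hypothetical training data: (feature vector, manually assigned tags).
# The features are assumed to be (normalised tempo, normalised loudness),
# both in the range 0..1; real perceptual features would be richer.
ANNOTATED = [
    ((0.85, 0.90), {"fast", "exciting"}),
    ((0.80, 0.75), {"fast", "exciting"}),
    ((0.90, 0.40), {"fast", "light"}),
    ((0.30, 0.20), {"slow", "calm"}),
    ((0.35, 0.30), {"slow", "calm"}),
    ((0.25, 0.70), {"slow", "dramatic"}),
]

def annotate(features, annotated=ANNOTATED, k=3, min_votes=2):
    """Return the tags shared by at least min_votes of the k nearest pieces."""
    # Rank the annotated pieces by Euclidean distance in feature space.
    neighbours = sorted(
        annotated, key=lambda item: math.dist(features, item[0])
    )[:k]
    # Each neighbour votes once for every tag it carries.
    votes = Counter(tag for _, tags in neighbours for tag in tags)
    return {tag for tag, count in votes.items() if count >= min_votes}

print(sorted(annotate((0.82, 0.80))))
print(sorted(annotate((0.30, 0.25))))
```

Unlike single-label classification, each piece can receive several tags at once, which is what makes the result usable for more detailed search.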

Lead researchers: