Thèse de doctorat (Mémoires et thèses)
A Combining Approach to Cover Song Identification
Osmalsky, Julien
2017
 

Documents


Texte intégral
phd-thesis-osmalskyj.pdf
Postprint Auteur (19.45 MB)
Télécharger
Annexes
phd-presentation-osmalskyj.zip
Postprint Éditeur (25.59 MB)
Télécharger

Tous les documents dans ORBi sont protégés par une licence d'utilisation.

Envoyer vers



Détails



Mots-clés :
Cover Song Identification; Combining; Music Information Retrieval; MIR; Audio Features; Rank Aggregation; Million Song Dataset; Second Hand Song Dataset; QMax
Résumé :
[en] This thesis is concerned with the problem of determining whether two songs are different versions of each other. This problem is known as the problem of cover song identification, which is a challenging task, as different versions of the same song can differ in terms of pitch, tempo, voicing, instrumentation, structure, etc. Our approach differs from existing methods, by considering as much information as possible to identify cover songs. More precisely, we consider audio features spanning multiple musical facets, such as the tempo, the duration, the harmonic progression, the musical structure, the relative evolution of timbre, among others. In order to do that, we evaluate several state-of-the-art systems on a common database, containing 12,856 songs, that is a subset of the Second Hand Song dataset. In addition to evaluating existing systems, we introduce our own methods, based on global features, and making use of supervised machine learning algorithms to build a similarity model. For evaluating and comparing the performance of 10 cover song identification systems, we propose a new intuitive evaluation space, based on the notions of pruning and loss. Our evaluation space allows to represent the performance of the selected systems in a two dimensional space. We further demonstrate that it is compatible with standard metrics, such as the mean rank, the mean reciprocal rank and the mean average precision. Using our evaluation space, we present a comparative analysis of 10 systems. The results show that few systems are usable in a commercial system, as the most efficient is able to identify a match at the first position for 39% of the analyzed queries, which corresponds to 4,965 songs. In addition, we evaluate the systems when they are pushed to their limits, by analyzing how they perform when the audio signal is strongly degraded. To improve the identification rate, we investigate ways of combining 10 systems. We evaluate rank aggregation methods, that aim at aggregating ordered lists of similarity results, to produce a new, better ordering of the database. We demonstrate that such methods produce improved results, especially for early pruning applications. In addition to evaluating rank aggregation techniques, we propose to study combination through probabilistic rules. As the 10 selected systems do not all produce probabilities of similarity, we investigate calibration techniques to map scores to relevant posterior probability estimates. After the calibration process, we evaluate several probabilistic rules, such as the product, the sum, and the median rule. We further demonstrate that a subset of the 10 initial systems produces better performance than the full set, thus showing that some systems are not relevant to the final combination. Applying a probabilistic product rule to a subset of systems significantly outperforms any individual systems, on the considered database. In terms of direct identification (top-1), we achieve an improvement of 10% (5,460 tracks identified), and in terms of mean rank, mean reciprocal rank and mean average precision, we respectively improve the performance by 40%, 9.5%, and 12.5%, with respect to the previous state-of-the-art performance. We further implement our final combination in a practical application, named DISCover, giving the possibility for a user to select a query and listen to the produced list of results. While a cover is not systematically identified, the produced list of songs is often musically similar to the query.
Disciplines :
Sciences informatiques
Auteur, co-auteur :
Osmalsky, Julien ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Dép. d'électric., électron. et informat. (Inst.Montefiore)
Langue du document :
Anglais
Titre :
A Combining Approach to Cover Song Identification
Date de soutenance :
05 octobre 2017
Nombre de pages :
187
Institution :
ULiège - Université de Liège
Intitulé du diplôme :
Doctorat en Sciences Informatiques
Promoteur :
Embrechts, Jean-Jacques ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore)
Van Droogenbroeck, Marc  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Secrétaire :
Geurts, Pierre ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Membre du jury :
Dixon, Simon
Peeters, Geoffroy
Dupont, Stéphane
Disponible sur ORBi :
depuis le 10 octobre 2017

Statistiques


Nombre de vues
347 (dont 22 ULiège)
Nombre de téléchargements
320 (dont 12 ULiège)

Bibliographie


Publications similaires



Contacter ORBi