The ability to separate audio sources from a song mix has many benefits, including the possibility of extracting single musical instruments from more complex musical arrangements. As most popular music is multitimbral, the challenge is to extract the individual instruments so that the original instrument timbre remains intact. This paper examines a selection of Japanese popular music and reports on the performance of three source separation algorithms. Feedback was provided by experienced musicians who completed an online survey. Where song mixes contained two concurrently playing instruments, the non-negative matrix factor 2-D deconvolution algorithm ranked highest 57% of the time, followed by the average harmonic structure modeling algorithm, which ranked highest 38% of the time. The third algorithm, projected gradient methods for non-negative matrix factorization, ranked first only the remaining 5% of the time. Acoustic guitar was the only instrument category in which the two highest-ranked algorithms received similar rankings. In all other instrument categories, the non-negative matrix factor 2-D deconvolution algorithm ranked higher than the average harmonic structure modeling algorithm. Where song mixes contained three concurrently playing instruments, the non-negative matrix factor 2-D deconvolution algorithm was overwhelmingly the best, ranking highest 86% of the time. The only other algorithm used in this case was the projected gradient methods for non-negative matrix factorization algorithm, which ranked high on very few occasions.
Authors: Peter Somerville and Alexandra L. Uitdenbogerd
Event: SF08: Search and Information Extraction from Audio Data Workshop