Sound is a multivariate phenomenon that is experienced exclusively through listening. Much is lost, however, when audio data are converted to numerical and, finally, graphical form for objective analysis: the abstraction not only removes the context from the data, but also changes the domain of the data. This presentation describes and develops a set of methods for representing data about sound in auditory rather than visual form. The methods take descriptor data calculated from frames of the audio sample and use them as input to schemes for rearranging that sample. Different rearrangement schemes serve different analytic purposes, each based on a corresponding visual statistical graphic. Representing data about audio in the auditory domain solves several problems that arise with visual graphs: it helps to avoid excessive abstraction, to group similar sounds together, and to develop an understanding of the descriptor being investigated. Examples of applying the framework to typical audio information extraction situations will be included.
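The idea of using per-frame descriptor values to drive a rearrangement of the audio can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of RMS level as the descriptor, the non-overlapping frames, and the ascending sort (loosely analogous to a sorted plot) are all assumptions made for illustration. A practical version would probably use overlapping, windowed frames to avoid clicks at frame boundaries.

```python
import math

def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (remainder dropped)."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

def rms(frame):
    """Hypothetical descriptor: root-mean-square level of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def rearrange_by_descriptor(samples, frame_len, descriptor=rms):
    """Reorder the frames of a signal by ascending descriptor value.

    The output contains the same audio frames as the input, but played
    back in descriptor order rather than in time order.
    """
    frames = frame_signal(samples, frame_len)
    ordered = sorted(frames, key=descriptor)  # stable sort keeps ties in time order
    return [x for frame in ordered for x in frame]

# Synthetic signal made of three 4-sample frames: loud, quiet, medium.
signal = [0.8, -0.8, 0.8, -0.8,
          0.1, -0.1, 0.1, -0.1,
          0.4, -0.4, 0.4, -0.4]
rearranged = rearrange_by_descriptor(signal, frame_len=4)
```

Listening to `rearranged` instead of viewing a plot of the RMS values keeps the data in the auditory domain: frames with similar levels end up adjacent, and the listener hears the range of the descriptor directly.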
Authors: Sam Ferguson and Densil Cabrera
Event: SF08: Search and Information Extraction from Audio Data Workshop