Multi-sensor Concert Recording Dataset Including Professional and User-generated Content

Publication from Digital

Werner Bailer, Chris Pike, Rik Bauwens, Reinhard Grandl, Mike Matton and Marcus Thaler

Proceedings of ACM Multimedia Systems Conference, 1/2015


We present a novel dataset for multi-view video and spatial audio. An ensemble of ten musicians from the BBC Philharmonic Orchestra performed in the orchestra's rehearsal studio in Salford, UK, on 25th March 2014. This provided a controlled environment in which to capture a dataset that can be used to simulate a large event, while retaining control over the conditions and performance. The dataset consists of hundreds of video and audio clips captured during 18 takes of performances, using a broad range of professional- and consumer-grade equipment, including video at up to 4K resolution and high-end spatial microphones. In addition to the audiovisual essence, sensor metadata has been captured, and ground truth annotations have been created, in particular for temporal synchronization and spatial alignment. Part of the dataset has also been prepared for adaptive content streaming. The dataset is released under a Creative Commons Attribution Non-Commercial Share Alike license and hosted on a specifically adapted content management platform.