Innovative multimedia computing technologies at the ECCV 2020

Experts from the "Smart Media Solutions" group of JOANNEUM RESEARCH present technologies for the automated analysis, interpretation and application of the novel 360° video media format at the renowned computer vision conference.

Credit: Mediaset / RTI

The 2020 “European Conference on Computer Vision” (ECCV 2020) is one of the premier conferences for computer vision, multimedia and artificial intelligence and will take place online from 23 to 28 August 2020. The biennial conference is an international forum where researchers exchange information on the state of the art and practice of multimedia computing, identify emerging research topics and define the future of the field.


Hannes Fassold, Senior Researcher at the Institute DIGITAL, presents a method for automatic camera path generation from 360° video - a result of the Hyper360 research project.


Omnidirectional (360°) video is a novel media format that is rapidly being adopted in media production and consumption as part of today’s ongoing virtual reality revolution, because it allows the viewer to experience the content in an immersive and interactive way. A critical factor for its success is the availability of convenient tools for producing and editing 360° video content for a multitude of platforms. The ongoing Hyper360 project is developing an innovative end-to-end solution for capturing, producing, enhancing, delivering and using 360° video to simplify content production.


The goal of automatic camera path generation is to calculate a visually interesting camera path from a 360° video in order to provide a traditional TV-like consumption experience. This is necessary for viewing a 360° video on older TV sets that do not offer any kind of interactive player for 360° video. The proposed method is based on the identification of objects in a scene and involves deep learning technologies such as object detection and tracking and human pose estimation.
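
To illustrate the general idea of deriving a camera path from detected objects, here is a minimal Python sketch. It is not the Hyper360 method. It assumes that an upstream detector has already reduced every frame to a list of (yaw angle in degrees, score) pairs, and it simply turns the virtual camera smoothly towards the highest-scoring object. The names `wrap_deg`, `camera_path`, `smoothing` and `start_yaw` are hypothetical and chosen for this example only.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def camera_path(detections_per_frame, smoothing=0.2, start_yaw=0.0):
    """Return one camera yaw angle (in degrees) per frame.

    detections_per_frame: list of frames; each frame is a list of
        (yaw_deg, score) pairs from an assumed upstream detector.
    smoothing: fraction of the remaining angular distance that the
        camera covers in each frame (1.0 jumps straight to the target).
    start_yaw: assumed initial viewing direction of the camera.
    """
    yaw = start_yaw
    path = []
    for detections in detections_per_frame:
        if detections:
            # Pick the object with the highest score as the target.
            target, _score = max(detections, key=lambda d: d[1])
            # Shortest signed rotation, so the camera crosses the
            # +/-180 degree seam instead of turning the long way round.
            delta = wrap_deg(target - yaw)
            yaw = wrap_deg(yaw + smoothing * delta)
        # With no detections, the camera keeps its previous direction.
        path.append(yaw)
    return path


if __name__ == "__main__":
    frames = [[(10.0, 0.4), (90.0, 0.8)], [(95.0, 0.9)], []]
    print(camera_path(frames, smoothing=0.5))
```

The smoothing step avoids abrupt jumps when the target changes from one frame to the next. A production system would additionally have to track objects over time and render the viewport from the equirectangular frame, which this sketch leaves out.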