Lecture: Artificial intelligence for automatic description of video content

Hannes Fassold presented the latest experiences with deep learning algorithms for face recognition and tracking as well as general object recognition and tracking at an AI4EU Café.

Credit: AI4Media Project


Hannes Fassold, Senior Researcher at JOANNEUM RESEARCH - DIGITAL, spoke on "Employing AI for the semantic analysis of conventional and immersive video" at the AI4EU Web-Café on Wednesday, February 10 at 15:00 CET.

AI-based methods are now the best choice for automatically extracting semantic metadata from archive content. Hannes Fassold reported on experiences with deep learning algorithms for face detection and recognition as well as general object detection and tracking. He discussed the current state of these methods and highlighted issues that remain, such as the ethnic bias occurring in all face recognition methods trained on public datasets. He then presented use cases showing how these methods can help archives annotate and exploit their content more conveniently. The talk addressed not only conventional video content but also emerging content types such as immersive video, which pose new challenges for archives. It showed how face recognition and scene object extraction can be used for the semi-automatic annotation of video content and for automatic cinematography / editing of 360° video.


AI4Media is a European research project involving 30 partners, including the Smart Media Solutions Team of the Institute DIGITAL. The team will focus mainly on new learning paradigms, distributed AI and content-centered AI, and will also be involved in defining the roadmap for AI in media, in spreading European AI excellence, and in dissemination and exploitation activities. The project manager is senior researcher Hannes Fassold.

Hannes Fassold received an MSc degree in Applied Mathematics from Graz University of Technology in 2004. Since then he has worked at JOANNEUM RESEARCH, where he is currently a senior researcher in the Machine Vision Applications Group of the DIGITAL institute. His main research interests are the automatic analysis and enhancement of video (e.g. object detection and tracking, optical flow, speaker recognition, defect detection, super-resolution, denoising) with deep learning methods. He has published several papers in these fields and coordinates the machine learning workflow and infrastructure at DIGITAL.