Predicted perception of image sharpness using attention modeling with top-down pathways on semantic region features

Publication from Digital

Schwarz, M., Wechtitsch, S., Hofmann, A., Bailer, W., Thallinger, G., Fritz, G. and Paletta, L.

In K. Holmqvist, F. Mulvey & R. Johansson (Eds.), Book of Abstracts of the 17th European Conference on Eye Movements, ECEM 2013, 11-16 August 2013, in Lund, Sweden. Journal of Eye Movement Research, 1/2014


Automated visual quality analysis is becoming increasingly relevant to efficient media production (Fassold et al., 2012, Proc. IEEE Intl. Symp. Multimedia). Broadcasters use quantitative measures of sharpness to identify important media characteristics such as the potential for resolution upscaling. Computational models estimate perceived sharpness so that human responses can be predicted in large-scale automated media analyses. A comparison of fixations on imagery of different resolutions (Judd et al., 2010, J. of Vision) implies that the consistency between lower- and higher-resolution fixations depends on image content and complexity. We investigated the impact of image complexity and content with 14 participants aged 25-65 years, using eye tracking to report locations of perceived sharpness in videos of various resolutions. Mean opinion scores (MOS) were used to validate the sharpness (Ferzli et al., 2007, Proc. ICIP). We applied automated semantic segmentation of video blocks and features (texture, motion, faces, persons, etc.) and used the resulting regions of interest to parameterize the top-down pathways of the computational attention model of Judd et al. (2009, Proc. ICCV). This innovative model provided superior saliency detection on the video sequences and allowed the sharpness perception reported by the viewers to be related to specific regions, thus increasing the reliability of the sharpness scores.
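
The abstract does not specify how fixations, semantic regions and sharpness scores were combined. The following is only a minimal sketch of one plausible reading, not the authors' method: it assumes regions are axis-aligned boxes, that each region already has a sharpness score, and that fixation share serves as the attention weight. All function names, the region labels and the numbers are hypothetical.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average the individual viewer ratings (e.g. on a 1-5 scale)."""
    return mean(ratings)


def fixation_share_per_region(fixations, regions):
    """Return the fraction of fixations that fall inside each region.

    fixations: list of (x, y) pixel coordinates.
    regions:   dict mapping a label to a box (x0, y0, x1, y1);
               the box is an assumption, real segments need not be boxes.
    """
    counts = {label: 0 for label in regions}
    for x, y in fixations:
        for label, (x0, y0, x1, y1) in regions.items():
            if x0 <= x < x1 and y0 <= y < y1:
                counts[label] += 1
                break  # assign each fixation to the first matching region
    total = len(fixations)
    return {label: (c / total if total else 0.0) for label, c in counts.items()}


def attention_weighted_sharpness(region_sharpness, shares):
    """Pool per-region sharpness scores, weighted by fixation share."""
    weight_sum = sum(shares.values())
    if weight_sum == 0:
        # No fixation hit any region: fall back to an unweighted mean.
        return mean(list(region_sharpness.values()))
    weighted = sum(region_sharpness[label] * shares[label] for label in shares)
    return weighted / weight_sum


# Hypothetical example data, for illustration only.
regions = {"face": (0, 0, 100, 100), "background": (100, 0, 200, 100)}
fixations = [(10, 10), (50, 50), (150, 20), (90, 90)]
region_sharpness = {"face": 4.0, "background": 2.0}

shares = fixation_share_per_region(fixations, regions)
pooled = attention_weighted_sharpness(region_sharpness, shares)
mos = mean_opinion_score([4, 5, 3, 4])
```

In this sketch, regions that attract more fixations contribute more to the pooled score, which is one simple way of relating reported sharpness to specific regions; a pooled score of this kind could then be compared against the MOS.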