An Attentive Machine Interface Using Geo-Contextual Awareness for Mobile Vision Tasks


Amlacher K., Paletta L.

Proc. of European Conference on Artificial Intelligence, ECAI 2008, Patras, Greece, 2008


This work situates attention within the architecture of ambient intelligence, in particular for mobile vision tasks in multimodal interfaces. A major obstacle to the performance of these services is uncertainty in the visual information, which stems from the need to index into a very large set of reference images. We propose a system implementation, the Attentive Machine Interface (AMI), that enables contextual processing of multi-sensor information in a probabilistic framework, for example by exploiting contextual information from geo-services to cut the visual search space down to a subset of relevant object hypotheses. We present a proof of concept with results from bottom-up information processing on experimental tracks and image captures in an urban scenario. Object hypotheses are extracted in the local context from both (i) mobile image-based appearance and (ii) GPS-based positioning, and we verify the gain in recognition accuracy (> 10%) obtained with Bayesian decision fusion. Finally, we demonstrate that top-down information processing, in which geo-information primes the recognition method in feature space, can yield even better results (> 13%) and more economical computing, confirming the advantage of multi-sensor attentive processing in multimodal interfaces.
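To make the two processing modes in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian distance-based prior over candidate objects derived from a GPS fix, per-object appearance scores from some image matcher, and local metric coordinates. All names (`geo_prior`, `fuse`, `prime_candidates`, `sigma_m`, `keep_mass`) and the example landmarks and numbers are hypothetical. `fuse` illustrates bottom-up Bayesian decision fusion of the two cues; `prime_candidates` illustrates one simple reading of top-down priming, where the geo prior prunes the set of reference objects before any visual matching is attempted.

```python
import math


def geo_prior(user_pos, object_positions, sigma_m=30.0):
    """Normalized prior over objects from a GPS fix.

    Assumption: an isotropic Gaussian in distance with standard
    deviation sigma_m (metres); positions are local (x, y) metres.
    """
    weights = {}
    for name, (x, y) in object_positions.items():
        d2 = (x - user_pos[0]) ** 2 + (y - user_pos[1]) ** 2
        weights[name] = math.exp(-d2 / (2.0 * sigma_m ** 2))
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def fuse(vision_likelihood, prior):
    """Bottom-up fusion: posterior is proportional to likelihood times prior."""
    joint = {name: vision_likelihood[name] * prior[name]
             for name in vision_likelihood}
    total = sum(joint.values())
    return {name: v / total for name, v in joint.items()}


def prime_candidates(prior, keep_mass=0.95):
    """Top-down priming: keep the smallest set of objects, ranked by
    prior probability, whose cumulative prior mass reaches keep_mass."""
    ranked = sorted(prior.items(), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for name, p in ranked:
        kept.append(name)
        mass += p
        if mass >= keep_mass:
            break
    return kept


# Hypothetical scene: three landmarks and a user standing near two of them.
objects = {"town_hall": (0.0, 0.0), "opera": (40.0, 0.0), "museum": (500.0, 0.0)}
user = (10.0, 0.0)

# Hypothetical appearance scores; vision alone prefers the distant museum.
vision = {"town_hall": 0.30, "opera": 0.25, "museum": 0.45}

prior = geo_prior(user, objects)
posterior = fuse(vision, prior)
candidates = prime_candidates(prior)

print(max(vision, key=vision.get))        # decision from appearance only
print(max(posterior, key=posterior.get))  # decision after fusion
print(candidates)                         # objects left after geo priming
```

In this toy scene the appearance scores alone favor a landmark 490 m away, while the fused posterior selects a nearby one, and priming removes the distant landmark from the search altogether. That reduction of the candidate set is the sense in which top-down processing can be cheaper: the matcher only has to compare against the retained references.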