Geo-Contextual Priors for Attentive Urban Object Recognition
Mobile vision services have recently been proposed to support urban nomadic users. While camera phones with image-based recognition of urban objects provide intuitive interfaces for exploring urban space and for mobile work, similar methodology can be applied to vision in mobile robots and autonomous aerial vehicles. A major issue for the performance of such a service - which involves indexing into a huge collection of reference images - is ambiguity in the visual information. We propose to exploit geo-information in association with visual features to restrict the search to a local context. In a mobile image retrieval task of urban object recognition, we determine object hypotheses from (i) mobile image-based appearance and (ii) GPS-based positioning, and investigate the performance of Bayesian information fusion on benchmark geo-referenced image databases (TSG-20, TSG-40). This work specifically proposes to introduce position information as a geo-contextual prior for geo-attention based object recognition, in order to better prime the vision task. The results from geo-referenced image capture in an urban scenario show a significant increase in recognition accuracy (> 10%) when using the geo-contextual information compared to omitting it; applying geo-attention improves accuracy by a further > 5%.
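The fusion scheme described above can be sketched in a minimal form: an appearance-based classifier yields per-object likelihoods, a GPS reading yields a geo-contextual prior over objects near the query position, and a Bayesian posterior combines the two. All function names are illustrative, and the Gaussian distance-based prior is an assumption for this sketch, not necessarily the model used in the paper.

```python
import math

def gaussian_geo_prior(object_positions, query_xy, sigma=50.0):
    """Geo-contextual prior from GPS: weight each candidate object by its
    distance to the query position (Gaussian falloff, sigma in metres).
    This parameterisation is a hypothetical choice for illustration."""
    weights = {
        obj: math.exp(-((x - query_xy[0]) ** 2 + (y - query_xy[1]) ** 2)
                      / (2.0 * sigma ** 2))
        for obj, (x, y) in object_positions.items()
    }
    total = sum(weights.values())
    return {obj: w / total for obj, w in weights.items()}

def fuse_geo_prior(appearance_likelihoods, geo_prior):
    """Bayesian fusion: posterior(object) proportional to
    appearance likelihood times geo-contextual prior, then normalised."""
    posterior = {
        obj: appearance_likelihoods[obj] * geo_prior.get(obj, 0.0)
        for obj in appearance_likelihoods
    }
    total = sum(posterior.values())
    if total == 0.0:
        return posterior  # no overlap between vision and geo hypotheses
    return {obj: p / total for obj, p in posterior.items()}
```

For example, when two geo-referenced objects look equally plausible to the appearance model, the GPS-based prior disambiguates in favour of the nearby one:

```python
positions = {"tower_A": (0.0, 0.0), "tower_B": (500.0, 0.0)}
prior = gaussian_geo_prior(positions, query_xy=(10.0, 0.0))
posterior = fuse_geo_prior({"tower_A": 0.5, "tower_B": 0.5}, prior)
# posterior now strongly favours tower_A, the object near the GPS fix
```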