DeepGaze II: A big step towards explaining all information in image-based saliency

Abstract

When free-viewing scenes, the first few fixations of human observers are driven in part by bottom-up attention. Over the last decade, various models have been proposed to explain these fixations. We recently standardized model comparison using an information-theoretic framework and showed that these models captured no more than one third of the explainable mutual information between image content and fixation locations, which might be partially due to the limited data available (Kuemmerer et al., PNAS, in press). Subsequently, we showed that this limitation can be tackled effectively with a transfer learning strategy. Our model "DeepGaze I" uses a neural network (AlexNet) that was originally trained for object recognition on the ImageNet dataset. It achieved a large improvement over the previous state of the art, explaining 56% of the explainable information (Kuemmerer et al., ICLR 2015). A …
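The "explainable information" figures above come from an information-gain analysis: a model's average log-likelihood gain over an image-independent baseline is compared to the gain achieved by a gold-standard model. The sketch below is a minimal illustration of that ratio, assuming per-fixation log-likelihoods are available; the function names and the numeric values are hypothetical, not from the paper.

```python
import numpy as np

def information_gain(ll_model, ll_baseline):
    """Average log-likelihood gain (in bits per fixation) of a model
    over a baseline such as an image-independent center-bias model."""
    return np.mean(ll_model - ll_baseline) / np.log(2)

def explained_information_ratio(ll_model, ll_gold, ll_baseline):
    """Fraction of the explainable information (the gold-standard
    model's gain over the baseline) that the model captures."""
    return (information_gain(ll_model, ll_baseline)
            / information_gain(ll_gold, ll_baseline))

# Hypothetical per-fixation log-likelihoods (natural log), for illustration only:
ll_baseline = np.full(5, -8.00)   # center-bias baseline
ll_model    = np.full(5, -7.30)   # candidate saliency model
ll_gold     = np.full(5, -6.75)   # gold-standard (explainable) upper bound

ratio = explained_information_ratio(ll_model, ll_gold, ll_baseline)
print(f"{ratio:.0%} of explainable information captured")
```

With these made-up numbers the candidate model captures 56% of the gold-standard gain; the actual analysis in the paper estimates the gold standard nonparametrically from other observers' fixations on the same image.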

Matthias Bethge
Professor for Computational Neuroscience and Machine Learning & Director of the Tübingen AI Center

Matthias Bethge is Professor for Computational Neuroscience and Machine Learning at the University of Tübingen and director of the Tübingen AI Center, a joint center of the University of Tübingen and the MPI for Intelligent Systems that is part of the German AI strategy.