Recent advances in deep learning have made it possible to predict a substantial amount of the explainable information in the spatial fixation distribution on natural images. For example, our model DeepGaze II uses deep features from the VGG network, trained on object recognition, as an image representation and combines them in a simple pixelwise nonlinear way to predict a fixation density. However, while these models are very successful at predicting fixations, they are largely black boxes and therefore not very good at explaining what drives fixations. Here we address this problem by selecting features that are maximally predictive of fixations in a stepwise fashion (Baddeley & Tatler 2006). Starting from a version of DeepGaze II without any VGG features (a pure center-bias model), we first search for the VGG feature that maximally improves model performance when added to this model. Subsequently, we …