Extending deepgaze ii: Scanpath prediction from deep features

Abstract

Predicting where humans choose to fixate can help understanding a variety of human behaviour. The last years have seen substantial progress in predicting spatial fixation distributions when viewing static images. Our own model" DeepGaze II"(Kümmerer et al., ICCV 2017) extracts pretrained deep neural network features from the VGG network from input images and uses a simple pixelwise readout network to predict fixation distributions from these features. DeepGaze II is state-of-the-art for predicting freeviewing fixation densities according to the established MIT Saliency Benchmark. However, DeepGaze II predicts only spatial fixation distributions instead of scanpaths. Therefore, the models model ignores crucial structure in the fixation selection process. Here we extend DeepGaze II to predict fixation densities conditioned on the previous scanpath. We add additional feature maps encoding the previous scanpath …

Matthias Kümmerer
Matthias Kümmerer
Postdoc

I’m interested in understanding how we use eye movements to gather information about our environment. This includes building saliency models and models of eye movement prediction such as my line of DeepGaze models. I also work on the question of how to evaluate model quality and benchmarking and I’m the main organizer of the MIT/Tuebingen Saliency Benchmark.

Matthias Bethge
Matthias Bethge
Professor for Computational Neuroscience and Machine Learning & Director of the Tübingen AI Center

Matthias Bethge is Professor for Computational Neuroscience and Machine Learning at the University of Tübingen and director of the Tübingen AI Center, a joint center between Tübingen University and MPI for Intelligent Systems that is part of the German AI strategy.