New enhancements to the DeepGaze models for a better understanding of human scanpaths
Matthias Kümmerer,
Akis Linardos,
Matthias Bethge
September, 2021
Abstract
The DeepGaze family comprises deep-learning-based computational models of overt attention during free viewing. DeepGaze II (Kümmerer et al., ICCV 2017) predicts free-viewing fixation locations, and DeepGaze III (Kümmerer et al., CCN 2019) predicts free-viewing sequences of fixations. The models encode image information using deep features from pretrained deep neural networks to compute a spatial saliency map, which, in the case of DeepGaze III, is then combined with information about the scanpath history to predict the next fixation. Both models have set the state of the art in their respective tasks in recent years. Here, we improve the performance of both models substantially. We replace the backbone deep neural network VGG-19 with better-performing networks such as DenseNet. We also improve the architecture of the model and the training procedure. This results in a substantial performance …
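As a rough illustration of the idea described in the abstract (a spatial saliency map combined with scanpath history to predict the next fixation), the following is a minimal toy sketch in NumPy. It is not the DeepGaze architecture: the function name `next_fixation_distribution`, the parameters `sigma` and `weight`, and the Gaussian bump around the last fixation are all assumptions made for illustration. In the actual models, both the saliency map and the history term come from learned networks.

```python
import numpy as np


def next_fixation_distribution(saliency_log, history_xy, sigma=2.0, weight=1.0):
    """Toy sketch: combine a log-saliency map with a scanpath-history term.

    saliency_log: 2D array of unnormalized log-saliency values (rows = y, cols = x).
    history_xy:   list of previous fixations as (x, y) pixel coordinates.
    sigma, weight: hypothetical parameters of the hand-made history term.
    Returns a 2D probability distribution over the next fixation location.
    """
    height, width = saliency_log.shape
    ys, xs = np.mgrid[0:height, 0:width]

    # Assumption: the history term is a Gaussian bump centred on the most
    # recent fixation, standing in for a learned scanpath network.
    x0, y0 = history_xy[-1]
    history_log = -((xs - x0) ** 2 + (ys - y0) ** 2) / (2.0 * sigma ** 2)

    # Add the two terms in log space, then normalize with a softmax over pixels.
    logits = saliency_log + weight * history_log
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


# Usage: with a flat saliency map, the prediction is driven by history alone.
flat_saliency = np.zeros((8, 8))
p = next_fixation_distribution(flat_saliency, [(1, 1), (5, 3)])
most_likely_row, most_likely_col = np.unravel_index(p.argmax(), p.shape)
```

With a flat saliency map, the most likely next fixation is simply the location of the last fixation; a non-uniform `saliency_log` shifts probability mass toward salient image regions.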
Matthias Kümmerer
Postdoc
I’m interested in understanding how we use eye movements to gather information about our environment. This includes building saliency models and models of eye movement prediction, such as my line of DeepGaze models. I also work on how to evaluate and benchmark model quality, and I’m the main organizer of the MIT/Tuebingen Saliency Benchmark.
Matthias Bethge
Professor for Computational Neuroscience and Machine Learning & Director of the Tübingen AI Center
Matthias Bethge is Professor for Computational Neuroscience and Machine Learning at the University of Tübingen and director of the Tübingen AI Center, a joint center of the University of Tübingen and the MPI for Intelligent Systems that is part of the German AI strategy.