DeepGaze IIE: Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling

Abstract

Since 2014 transfer learning has become the key driver for the improvement of spatial saliency prediction-however, with stagnant progress in the last 3-5 years. We conduct a large-scale transfer learning study which tests different ImageNet backbones, always using the same read out architecture and learning protocol adopted from DeepGaze II. By replacing the VGG19 backbone of DeepGaze II with ResNet50 features we improve the performance on saliency prediction from 78% to 85%. However, as we continue to test better ImageNet models as backbones-such as EfficientNetB5-we observe no additional improvement on saliency prediction. By analyzing the backbones further, we find that generalization to other datasets differs substantially, with models being consistently overconfident in their fixation predictions. We show that by combining multiple backbones in a principled manner a good confidence calibration on unseen datasets can be achieved. This new model" DeepGaze IIE" yields a significant leap in benchmark performance in and out-of-domain with a 15 percent point improvement over DeepGaze II to 93% on MIT1003, marking a new state of the art on the MIT/Tuebingen Saliency Benchmark in all available metrics (AUC: 88.3%, sAUC: 79.4%, CC: 82.4%).

Matthias Kümmerer
Matthias Kümmerer
Postdoc

My research interests include eye movements, saliency prediction, benchmarking, representation learning and statistics.

Ori Press
Ori Press
PhD candidate
Matthias Bethge
Matthias Bethge
Professor for Computational Neuroscience and Machine Learning & Director of the Tübingen AI Center

Matthias Bethge is Professor for Computational Neuroscience and Machine Learning at the University of Tübingen and director of the Tübingen AI Center, a joint center between Tübingen University and MPI for Intelligent Systems that is part of the German AI strategy.