Humans gather high-resolution visual information only in the fovea, so they must make eye movements to explore the visual world. The spatio-temporal fixation patterns (scanpaths) of observers carry information about which aspects of the environment are currently relevant. Most recent progress in predicting the spatial and spatio-temporal patterns of human scanpaths has focused on free-viewing conditions. However, fixations and scanpaths are known to be strongly influenced by the task observers perform. The purpose of this work is to analyze those influences quantitatively. The DeepGaze III model for scanpath prediction (Kümmerer et al., VSS 2017) has been shown to achieve high performance in predicting free-viewing scanpaths. DeepGaze III extracts features from the VGG deep neural network and feeds them to a readout network that predicts a saliency map, which is then …