The bittersweet lesson: data-rich models narrow the behavioural gap to human vision

Abstract

A major obstacle to understanding human visual object recognition is our lack of behaviourally faithful models. Even the best models based on deep learning classifiers strikingly deviate from human perception in many ways. To study this deviation in more detail, we collected a massive set of human psychophysical classification data under highly controlled conditions (17 datasets, 85K trials across 90 observers). We made this data publicly available as an open-sourced Python toolkit and behavioural benchmark called "model-vs-human", which we use for investigating the very latest generation of models. Generally, in terms of robustness, standard machine vision models make many more errors on distorted images, and in terms of image-level consistency, they make very different errors than humans. Excitingly, however, a number of recent models make substantial progress towards closing this behavioural gap: …

Matthias Bethge
Professor for Computational Neuroscience and Machine Learning & Director of the Tübingen AI Center

Matthias Bethge is Professor for Computational Neuroscience and Machine Learning at the University of Tübingen and director of the Tübingen AI Center, a joint center between Tübingen University and MPI for Intelligent Systems that is part of the German AI strategy.