Convolutional neural networks (CNNs) have been proposed as computational models of (rapid) human object recognition and of the feedforward component of the primate ventral stream. The usefulness of CNNs as such models depends on the degree of similarity they share with human visual processing. Here we investigate two major differences between human vision and CNNs: distortion robustness (CNNs fail to cope with novel, previously unseen distortions) and texture bias (unlike humans, standard CNNs seem to recognise objects primarily by texture rather than shape). During our investigations we discovered an intriguing connection between the two: inducing a human-like shape bias in CNNs makes them inherently robust against many distortions. First, we show that CNNs cope with novel distortions worse than humans, even when many distortion types are included in the training data …
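
To make the distortion-robustness comparison concrete, the following is a minimal sketch of how one might measure a pretrained CNN's accuracy under increasing levels of a single parametric distortion (additive uniform pixel noise). This is an illustration under stated assumptions, not the paper's exact protocol: the choice of ResNet-50, the noise parameterisation, the `torchvision` weights API (≥ 0.13), and the `imagenet/val` path are all assumptions for the sketch.

```python
import torch
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
from torchvision.models import resnet50, ResNet50_Weights

def add_uniform_noise(x: torch.Tensor, width: float) -> torch.Tensor:
    """Add uniform noise in [-width, width] to images in [0, 1], then clip."""
    noise = (torch.rand_like(x) * 2.0 - 1.0) * width
    return (x + noise).clamp(0.0, 1.0)

# Standard ImageNet preprocessing; normalisation is applied *after* the
# distortion so the noise acts in raw pixel space.
to_tensor = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
normalize = T.Normalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1).eval()

@torch.no_grad()
def accuracy_under_noise(val_dir: str, width: float) -> float:
    """Top-1 accuracy of the model on a validation folder at one noise level."""
    loader = DataLoader(ImageFolder(val_dir, transform=to_tensor),
                        batch_size=32)
    correct, total = 0, 0
    for images, labels in loader:
        distorted = normalize(add_uniform_noise(images, width))
        preds = model(distorted).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

# Sweep the distortion strength ("imagenet/val" is a hypothetical data path).
# Standard CNNs typically degrade sharply with width, whereas human observers
# remain comparatively robust; training on one distortion type generally does
# not transfer to novel ones.
for width in [0.0, 0.1, 0.2, 0.35]:
    print(width, accuracy_under_noise("imagenet/val", width))
```

The same evaluation loop can be reused for other distortion families by swapping `add_uniform_noise` for a different parametric image transform, which is what makes the "novel, previously unseen distortion" comparison possible: train with some distortion types, evaluate with held-out ones.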