Responses to natural stimuli in area V4, a mid-level area of the visual ventral stream, are well predicted by features from convolutional neural networks (CNNs) trained on image classification. This result has been taken as evidence for the functional role of V4 in object classification. However, we currently do not know if and to what extent V4 plays a role in solving other computational objectives. Here, we investigated normative accounts of V4 by predicting macaque single-neuron responses to natural images from the representations extracted by 23 CNNs trained on different computer vision tasks including semantic, geometric, 2D, and 3D visual tasks. We found that semantic classification tasks do indeed provide the best predictive features for V4. Other tasks (3D in particular) followed very closely in performance, but a similar pattern of tasks performance emerged when predicting the activations of a network exclusively trained on object recognition. Thus, our results support V4’s main functional role in semantic processing. At the same time, they suggest that V4’s affinity to various 3D and 2D stimulus features found by electrophysiologists could be a corollary of a semantic functional goal.