What distinguishes robust models from non-robust ones? While for ImageNet distribution shifts it has been shown that such differences in robustness can be traced back predominantly to differences in training data, so far it is not known what that translates to in terms of what the model has learned. In this work, we bridge this gap by probing the representation spaces of 16 robust zero-shot CLIP vision encoders with various backbones (ResNets and ViTs) and pretraining sets (OpenAI, LAION-400M, LAION-2B, YFCC15M, CC12M and DataComp), and comparing them to the representation spaces of less robust models with identical backbones, but different (pre)training sets or objectives (CLIP pretraining on ImageNet-Captions, and supervised training or finetuning on ImageNet). Through this analysis, we generate three novel insights. Firstly, we detect the presence of outlier features in robust zero-shot CLIP vision encoders, which to the best of our knowledge is the first time these are observed in non-language and non-transformer models. Secondly, we find the existence of outlier features to be an indication of ImageNet shift robustness in models, since we only find them in robust models in our analysis. Lastly, we also investigate the number of unique encoded concepts in the representation space and find zero-shot CLIP models to encode a higher number of unique concepts. However, we do not find this to be an indicator of ImageNet shift robustness and hypothesize that it is rather related to the language supervision. Since the presence of outlier features can be detected without access to any data from shifted datasets, we believe that they could be a useful tool for practitioners to gauge the distribution shift robustness of a pretrained model during deployment.
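As a rough illustration of the kind of check a practitioner could run without any shifted data, the sketch below flags feature dimensions of an encoder's activations whose average magnitude is far larger than that of the other dimensions. The magnitude-based criterion, the threshold value, and the function names are our own simplifying assumptions for illustration, not the exact procedure used in this work.

```python
# Minimal sketch: flag "outlier" feature dimensions from a batch of
# encoder activations. Assumes activations of shape (num_samples, dim),
# e.g. pooled CLIP vision-encoder features on unlabeled in-distribution images.
import torch


def find_outlier_dimensions(activations: torch.Tensor, ratio: float = 5.0) -> torch.Tensor:
    """Return indices of dimensions whose mean |activation| exceeds
    `ratio` times the median per-dimension mean |activation|.

    The factor `ratio` is an illustrative choice, not a value from the paper.
    """
    per_dim_magnitude = activations.abs().mean(dim=0)          # (dim,)
    typical_magnitude = per_dim_magnitude.median()              # scalar baseline
    outlier_mask = per_dim_magnitude > ratio * typical_magnitude
    return outlier_mask.nonzero(as_tuple=True)[0]


if __name__ == "__main__":
    # Synthetic stand-in for encoder features: most dimensions are unit-scale,
    # a few are artificially inflated to mimic outlier features.
    torch.manual_seed(0)
    feats = torch.randn(1024, 512)
    feats[:, [7, 300]] *= 20.0

    print(find_outlier_dimensions(feats))   # expected to recover dims 7 and 300
```

In practice one would replace the synthetic tensor with features extracted from the pretrained encoder on ordinary (non-shifted) images; per the abstract, the mere presence of such dimensions is the signal of interest, so no labels or shifted datasets are required.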