No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
vladbogo.substack.com
This paper investigates the relationship between the frequency of concepts in the pretraining datasets of multimodal models and the models' zero-shot performance on downstream tasks involving those concepts. The authors find that across a wide range of models and datasets, there is a consistent log-linear scaling trend between concept frequency and zero-shot performance, indicating that these models require exponentially more pretraining data to achieve linear improvements in downstream performance.
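The log-linear relationship can be made concrete with a small sketch. The numbers below are synthetic, invented purely to illustrate the shape of the trend the paper reports (accuracy rising linearly with the log of concept frequency), not data from the paper itself:

```python
# Hypothetical illustration of a log-linear scaling trend:
# zero-shot accuracy improving linearly with log10(concept frequency).
# All data points here are synthetic, chosen for the sketch.
import math

# (concept pretraining frequency, zero-shot accuracy) — invented values
data = [(1e2, 0.20), (1e3, 0.35), (1e4, 0.50), (1e5, 0.65), (1e6, 0.80)]

# Least-squares fit of: accuracy = a * log10(frequency) + b
xs = [math.log10(f) for f, _ in data]
ys = [acc for _, acc in data]
n = len(data)
mx, my = sum(xs) / n, sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - a * mx

# Under a log-linear law, each 10x increase in concept frequency buys a
# fixed accuracy gain `a` — i.e. exponentially more data per linear improvement.
print(f"slope per 10x data: {a:.3f}, intercept: {b:.3f}")
```

On these synthetic points the fitted slope is 0.15 accuracy per decade of data: going from 10^2 to 10^6 occurrences (a 10,000x increase) yields only a 0.60 absolute gain, which is the "no zero-shot without exponential data" claim in miniature.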