Long Context Transfer from Language to Vision
Today's paper presents a method that enables large multimodal models (LMMs) to understand extremely long videos by leveraging the long-context capabilities of the underlying language model. The authors call this approach "long context transfer" from language to vision.