The paper introduces GiT, a Generalist Vision Transformer that can handle various vision tasks, ranging from image-level understanding to object-level and pixel-level tasks, using a simple multi-layer transformer architecture without any task-specific additions.
GiT: Towards Generalist Vision Transformer…