🔖 Weekly pick from the #MeetweenScientificWatch: "PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning" - A resource-light approach to adapting image-language models to video tasks that achieves new SOTA performance.
arxiv.org/abs/2404.16994