Posts: 12
Comments: 4
Joined: 2 yr. ago

Machine Learning @kbin.social

MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks

Machine Learning @kbin.social

Demystifying CLIP Data

Machine Learning @kbin.social

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Machine Learning @kbin.social

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Machine Learning @kbin.social

Scaling Vision-Language Models with Sparse Mixture of Experts

Machine Learning @kbin.social

Hydra-MoE: A new class of Open-Source Mixture of Experts

Machine Learning @kbin.social

Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks

Machine Learning @kbin.social

Foundational Models Defining a New Era in Vision: A Survey and Outlook

Machine Learning @kbin.social

Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training

Machine Learning @kbin.social

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

Machine Learning @kbin.social

Vision Language Transformers: A Survey

Machine Learning @kbin.social

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use