Machine Learning

Members: 2
Posts: 20
Active Today: 1
Created: 2 yr. ago

nsa @kbin.social
2y ago

What's In My Big Data?

arxiv.org/abs/2310.20707

0
nsa @kbin.social
2y ago

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

arxiv.org/abs/2310.16787

0
KingsmanVince @kbin.social
2y ago

MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks

aclanthology.org/2023.acl-long.223/

0
KingsmanVince @kbin.social
2y ago

Demystifying CLIP Data

arxiv.org/abs/2309.16671

0
nsa @kbin.social
2y ago

GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems

arxiv.org/abs/2310.12397

0
KingsmanVince @kbin.social
2y ago

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

arxiv.org/abs/2310.09199

3
KingsmanVince @kbin.social
2y ago

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

arxiv.org/abs/2310.09478

0
nsa @kbin.social
2y ago

Language Modeling Is Compression

arxiv.org/abs/2309.10668

0
KingsmanVince @kbin.social
2y ago

Scaling Vision-Language Models with Sparse Mixture of Experts

arxiv.org/abs/2303.07226

0
KingsmanVince @kbin.social
2y ago

Hydra-MoE: A new class of Open-Source Mixture of Experts

github.com/SkunkworksAI/hydra-moe

0
KingsmanVince @kbin.social
2y ago

Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks

arxiv.org/abs/2307.16395

0
KingsmanVince @kbin.social
2y ago

Foundational Models Defining a New Era in Vision: A Survey and Outlook

arxiv.org/abs/2307.13721

0
KingsmanVince @kbin.social
2y ago

Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training

aclanthology.org/2023.acl-long.327/

1
KingsmanVince @kbin.social
2y ago

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

arxiv.org/abs/2303.16839

1
KingsmanVince @kbin.social
2y ago

Vision Language Transformers: A Survey

arxiv.org/abs/2307.03254

0
AsAnAILanguageModel @sh.itjust.works
2y ago

SeamlessM4T: Multimodal Model for Speech Translation

0
AsAnAILanguageModel @sh.itjust.works
2y ago

Hugging Face Releases IDEFICS: An Open-Access 80B Visual Language Model Replicating DeepMind's Flamingo

0
KingsmanVince @kbin.social
2y ago

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

arxiv.org/abs/2308.06595

0
Lenguador @kbin.social
2y ago

Real-Time Radiance Field Rendering

huggingface.co/papers/2308.04079

0
Deliverator @kbin.social
2y ago

Machine Learning Beginner Info/Resources

1