VILA-13B

NVIDIA, Massachusetts Institute of Technology (MIT), Chat, Visual question answering, Image captioning, Language modeling/generation, Question answering, Open weights

Developed by NVIDIA and the Massachusetts Institute of Technology (MIT) in 2023, VILA-13B is a chat model with 13,350,839,296 parameters (about 13.4 billion) and openly available weights.

About VILA-13B

Visual language models (VLMs) have progressed rapidly with the recent success of large language models. There have been growing efforts on visual instruction tuning to extend the LLM with visual inputs, but the field lacks an in-depth study of the visual language p…

Details

Provider
NVIDIA, Massachusetts Institute of Technology (MIT)
Task
Chat, Visual question answering, Image captioning, Language modeling/generation, Question answering
Parameters
13,350,839,296
Released
2023-12-12
Open weights
Yes