BLIP-2 (Q-Former)
Salesforce Research, Visual question answering, Image captioning, Open weights
BLIP-2 (Q-Former) is a visual question answering model published by Salesforce Research in 2023, with 1.48 billion parameters.
About BLIP-2 (Q-Former)
The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and efficient pre-training strategy that bootstraps vision-language pre-training
Details
- Provider
- Salesforce Research
- Task
- Visual question answering, Image captioning
- Parameters
- 1.48 billion (1,480,000,000)
- Released
- 2023-01-30
- Open weights
- Yes