Llama 3.2 11B Vision Instruct
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and visual question answering.
- Creator: Meta AI
- Lifecycle: Active
- Context window: 131.1K tokens
- Max output: 16.4K tokens
- Released: Sep 25, 2024
- Status: unknown
- Input: $0.05 / 1M tokens
- Output: $0.05 / 1M tokens
- Cached read: — / 1M tokens
- Cached write: — / 1M tokens
- Batch discount: —%
- Source: OpenRouter
- Verified: Apr 5, 2026 (High)
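With input and output both priced at $0.05 per 1M tokens, per-request cost is simple arithmetic. A minimal sketch (the function name is illustrative, not part of any API):

```python
# Estimate request cost at the listed rates:
# $0.05 per 1M input tokens, $0.05 per 1M output tokens.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.05

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A maximal request (131.1K tokens in, 16.4K out) costs well under a cent:
cost = request_cost(131_100, 16_400)  # ≈ $0.0074
```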
Capabilities
- Modalities: text, image → text
- Capabilities: image input, structured outputs
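Because the model accepts text and images together, a chat request interleaves both in a single user message. A minimal sketch of an OpenAI-compatible payload; the model slug shown follows OpenRouter's naming convention but should be verified against the provider's catalog:

```python
import json

# Assumed model slug; confirm against your provider's model list.
MODEL = "meta-llama/llama-3.2-11b-vision-instruct"

def vision_request(prompt: str, image_url: str) -> dict:
    """Build an OpenAI-compatible chat payload mixing text and an image."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = vision_request("Describe this image.", "https://example.com/photo.png")
body = json.dumps(payload)  # ready to POST to a chat-completions endpoint
```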
Other models from Meta AI
Llama 3 70B Instruct, Llama 3 8B Instruct, Llama 3.1 405B, Llama 3.1 70B, Llama 3.1 70B Instruct, Llama 3.1 8B, Llama 3.1 8B Instruct, Llama 3.2 11B Vision, Llama 3.2 1B Instruct, Llama 3.2 3B Instruct, Llama 3.2 90B Vision, Llama 3.3 70B, Llama 3.3 70B Instruct, Llama 4 Maverick, Llama 4 Scout, Llama Guard 3 8B, Llama Guard 4 12B