Qwen: Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding. It also offers robust OCR in 32 languages and enhanced multimodal fusion through its Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B aims to deliver state-of-the-art performance on complex real-world multimodal tasks.

Context Window: 131K tokens
Parameters: 32B
Input Price: $0.10 per 1M tokens
Output Price: $0.42 per 1M tokens
Price Tier: budget
Provider: Qwen
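
As a rough illustration of what the listed prices mean in practice, the sketch below estimates the cost of a single request from its token counts. It assumes the input and output prices are charged per token at the listed per-million rates with no minimums, rounding, or separate image pricing; the function name estimate_cost is hypothetical and is not part of any Qwen or CoreAI API.

```python
# Prices taken from the listing above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.42


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear per-token billing at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost(100_000, 2_000):.5f}")
```

Under these assumptions, a 100,000-token prompt with a 2,000-token reply comes to roughly one cent, most of it from the input side.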

How to Use Qwen: Qwen3 VL 32B Instruct

With CoreAI, you can start chatting with Qwen: Qwen3 VL 32B Instruct instantly, with no separate subscription needed. CoreAI bundles access to Qwen: Qwen3 VL 32B Instruct along with 300+ other AI models from Qwen and other providers such as OpenAI, Anthropic, Google, and Meta.

  1. Download the CoreAI app for iOS or Android, or use the Web App
  2. Select Qwen: Qwen3 VL 32B Instruct from the model selector
  3. Start chatting, comparing, or creating with AI

More Qwen Models

Qwen: Qwen3.6 Plus Preview (free)

Qwen 3.6 Plus Preview is the next-generation evolution of the Qwen Plus series, featuring an advanced hybrid architecture that improves efficiency and
Context window: 1000K tokens; price tier: budget

Qwen: Qwen3.5-9B

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an effi
Context window: 256K tokens; price tier: budget

Qwen: Qwen3.5-35B-A3B

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a spa
Context window: 262K tokens; price tier: standard

Qwen: Qwen3.5-27B

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference
Context window: 262K tokens; price tier: standard

Qwen: Qwen3.5-122B-A10B

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixtur
Context window: 262K tokens; price tier: standard

Qwen: Qwen3.5-Flash

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-
Context window: 1000K tokens; price tier: budget
