Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that combines strong text generation with visual understanding of images and video. The Instruct variant is tuned for instruction following on general multimodal tasks. It performs well at recognizing real-world and synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, and posts competitive results on multimodal benchmarks. For agentic use, it handles multi-image, multi-turn instructions, video timeline alignment, GUI automation, and visual coding, from sketches through to debugged UI. Its text performance matches that of the flagship Qwen3 models, which makes it suitable for document AI, OCR, UI assistance, spatial tasks, and agent research.

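Context Window: 131K tokens
Parameters: 30B total (mixture-of-experts; the A3B suffix indicates roughly 3B active per token)
Input Price: $0.13 per 1M tokens
Output Price: $0.52 per 1M tokens
Price Tier: budget
Provider: Qwen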
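
To make the per-token pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the listed input and output prices. The function names (`estimate_cost`, `fits_context`) are hypothetical and not part of any CoreAI or Qwen API. The sketch also assumes that "131K" means 131,000 tokens and that input and output tokens share the same context window; actual billing and limits may differ.

```python
from decimal import Decimal

# Prices taken from the listing above, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.13")
OUTPUT_PRICE_PER_M = Decimal("0.52")

# Assumption: "131K tokens" is read as 131,000 tokens.
CONTEXT_WINDOW = 131_000


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / Decimal(1_000_000)


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: input and output tokens share one context window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    # A long document (100,000 input tokens) with a short answer (2,000 output tokens).
    cost = estimate_cost(100_000, 2_000)
    print(f"Estimated cost: ${cost:.5f}")
    print("Fits in context:", fits_context(100_000, 2_000))
```

With these numbers, a 100,000-token prompt and a 2,000-token reply come to about $0.014. Output tokens cost four times as much as input tokens, so long generations dominate the bill sooner than long prompts do.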

How to Use Qwen: Qwen3 VL 30B A3B Instruct

With CoreAI, you can start chatting with Qwen: Qwen3 VL 30B A3B Instruct right away, with no separate subscription. CoreAI bundles it with 300+ other AI models from Qwen and from providers such as OpenAI, Anthropic, Google, and Meta.

  1. Download the CoreAI app for iOS or Android, or open the Web App
  2. Select Qwen: Qwen3 VL 30B A3B Instruct from the model selector
  3. Start chatting, comparing, or creating with AI

More Qwen Models

Qwen: Qwen3.6 Plus Preview (free)
Qwen 3.6 Plus Preview is the next-generation evolution of the Qwen Plus series, featuring an advanced hybrid architecture that improves efficiency and...
Context Window: 1000K tokens | Price Tier: budget

Qwen: Qwen3.5-9B
Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an effi...
Context Window: 256K tokens | Price Tier: budget

Qwen: Qwen3.5-35B-A3B
The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a spa...
Context Window: 262K tokens | Price Tier: standard

Qwen: Qwen3.5-27B
The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference...
Context Window: 262K tokens | Price Tier: standard

Qwen: Qwen3.5-122B-A10B
The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixtur...
Context Window: 262K tokens | Price Tier: standard

Qwen: Qwen3.5-Flash
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-...
Context Window: 1000K tokens | Price Tier: budget

Try Qwen: Qwen3 VL 30B A3B Instruct Now

Chat with Qwen: Qwen3 VL 30B A3B Instruct and 300+ other AI models, all in one app.