Browse 300+ AI Models

Explore the complete directory of AI models from all major providers. Find the perfect AI for coding, writing, analysis, and more.

All Models (331)Ai21 (1)Aion-labs (4)Allenai (1)Amazon (5)Anthracite-org (1)Anthropic (15)Arcee-ai (4)Baidu (1)Bytedance (1)Bytedance-seed (4)Cognitivecomputations (1)Cohere (5)Deepcogito (1)DeepSeek (11)Google (28)
NVIDIA

NVIDIA: Nemotron Nano 12B 2 VL (free)

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, c
128K context 12B budget
Minimax

MiniMax: MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier
205K context standard
Qwen

Qwen: Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines
262K context 32B budget
Ibm-granite

IBM: Granite 4.0 Micro

Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tuned for long...
131K context 3B budget
Microsoft

Microsoft: Phi 4 Mini Instruct

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4.
131K context budget
Anthropic

Anthropic: Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s perfor
200K context standard
Qwen

Qwen: Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences
256K context 8B standard
Qwen

Qwen: Qwen3 VL 8B Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal
256K context 8B budget
OpenAI

OpenAI: o3 Deep Research

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.
200K context premium
OpenAI

OpenAI: o4 Mini Deep Research

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds addi
200K context standard
NVIDIA

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG,
131K context 49B budget
Qwen

Qwen: Qwen3 VL 30B A3B Thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex
131K context 30B standard
Qwen

Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general mu
262K context 30B budget
OpenAI

OpenAI: GPT-5 Pro

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instructi
400K context premium
Z-ai

Z.ai: GLM 4.6

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex.
203K context standard
Anthropic

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-ben
1000K context premium
DeepSeek

DeepSeek: DeepSeek V3.2 Exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grai
164K context budget
Thedrummer

TheDrummer: Cydonia 24B V4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.
131K context 24B budget
Relace

Relace: Relace Apply 3

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...
256K context standard
Google

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better
1049K context budget
Qwen

Qwen: Qwen3 VL 235B A22B Thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STE
131K context 235B standard
Qwen

Qwen: Qwen3 VL 235B A22B Instruct

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language
262K context 235B budget
Qwen

Qwen: Qwen3 Max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the Janua
262K context standard
Qwen

Qwen: Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...
1000K context 480B standard
OpenAI

OpenAI: GPT-5 Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of compl
400K context standard
DeepSeek

DeepSeek: DeepSeek V3.1 Terminus

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language cons
164K context budget
Qwen

Qwen: Qwen3 Coder Flash

Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling...
1000K context budget
Qwen

Qwen: Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems; math proofs, code s
262K context 80B budget
Qwen

Qwen: Qwen3 Next 80B A3B Instruct (free)

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code
262K context 80B budget
Qwen

Qwen: Qwen3 Next 80B A3B Instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code
262K context 80B standard
Qwen

Qwen: Qwen Plus 0728 (thinking)

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.
1000K context budget
Qwen

Qwen: Qwen Plus 0728

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.
1000K context budget
NVIDIA

NVIDIA: Nemotron Nano 9B V2 (free)

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and..
128K context 9B budget
Moonshotai

MoonshotAI: Kimi K2 0905

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters
262K context standard
Qwen

Qwen: Qwen3 30B A3B Thinking 2507

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking m
131K context 30B budget
Nousresearch

Nous: Hermes 4 70B

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...
131K context 70B budget
Nousresearch

Nous: Hermes 4 405B

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...
131K context 405B standard
DeepSeek

DeepSeek: DeepSeek V3.1

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase
164K context 671B budget
Mistral AI

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced opera
131K context standard
Z-ai

Z.ai: GLM 4.5V

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-
66K context 106B standard
Ai21

AI21: Jamba Large 1.7

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 2
256K context standard
OpenAI

OpenAI: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.
128K context standard
OpenAI

OpenAI: GPT-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction f
400K context standard
OpenAI

OpenAI: GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency an
400K context standard
OpenAI

OpenAI: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to
400K context budget
OpenAI

OpenAI: gpt-oss-120b (free)

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par
131K context 117B budget
OpenAI

OpenAI: gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par
131K context 117B budget
OpenAI

OpenAI: gpt-oss-20b (free)

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimiz
131K context 21B budget

Popular AI Model Comparisons

Try Any AI Model Instantly

Chat with GPT-5, Claude, Gemini, and 300+ models — all in one app. Compare responses side-by-side.

Download App → Try on Web App