300+ AI Models Directory - Browse GPT-5, Claude, Gemini & More

NVIDIA

NVIDIA: Nemotron Nano 12B 2 VL (free)

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a hybrid Transformer-Mamba architecture, c

128K context 12B budget

Minimax

MiniMax: MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier

205K context standard

Qwen

Qwen: Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines

262K context 32B budget

Ibm-granite

IBM: Granite 4.0 Micro

Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models are the latest in a series of models released by IBM. They are fine-tuned for long...

131K context 3B budget

Microsoft

Microsoft: Phi 4 Mini Instruct

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - with a focus on high-quality, reasoning dense data. The model belongs to the Phi-4.

131K context budget

Anthropic

Anthropic: Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s perfor

200K context standard

Qwen

Qwen: Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences

256K context 8B standard

Qwen

Qwen: Qwen3 VL 8B Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal

256K context 8B budget

OpenAI

OpenAI: o3 Deep Research

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost.

200K context premium

OpenAI

OpenAI: o4 Mini Deep Research

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds addi

200K context standard

NVIDIA

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG,

131K context 49B budget

Qwen

Qwen: Qwen3 VL 30B A3B Thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex

131K context 30B standard

Qwen

Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general mu

262K context 30B budget

OpenAI

OpenAI: GPT-5 Pro

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instructi

400K context premium

Z-ai

Z.ai: GLM 4.6

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex.

203K context standard

Anthropic

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-ben

1000K context premium

DeepSeek

DeepSeek: DeepSeek V3.2 Exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grai

164K context budget

Thedrummer

TheDrummer: Cydonia 24B V4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

131K context 24B budget

Relace

Relace: Relace Apply 3

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can apply updates from GPT-4o, Claude, and others into your files at...

256K context standard

Google

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better

1049K context budget

Qwen

Qwen: Qwen3 VL 235B A22B Thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STE

131K context 235B standard

Qwen

Qwen: Qwen3 VL 235B A22B Instruct

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language

262K context 235B budget

Qwen

Qwen: Qwen3 Max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the Janua

262K context standard

Qwen

Qwen: Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...

1000K context 480B standard

OpenAI

OpenAI: GPT-5 Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of compl

400K context standard

DeepSeek

DeepSeek: DeepSeek V3.1 Terminus

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language cons

164K context budget

Qwen

Qwen: Qwen3 Coder Flash

Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling...

1000K context budget

Qwen

Qwen: Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems; math proofs, code s

262K context 80B budget

Qwen

Qwen: Qwen3 Next 80B A3B Instruct (free)

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code

262K context 80B budget

Qwen

Qwen: Qwen3 Next 80B A3B Instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code

262K context 80B standard

Qwen

Qwen: Qwen Plus 0728 (thinking)

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.

1000K context budget

Qwen

Qwen: Qwen Plus 0728

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.

1000K context budget

NVIDIA

NVIDIA: Nemotron Nano 9B V2 (free)

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and..

128K context 9B budget

Moonshotai

MoonshotAI: Kimi K2 0905

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters

262K context standard

Qwen

Qwen: Qwen3 30B A3B Thinking 2507

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking m

131K context 30B budget

Nousresearch

Nous: Hermes 4 70B

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...

131K context 70B budget

Nousresearch

Nous: Hermes 4 405B

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...

131K context 405B standard

DeepSeek

DeepSeek: DeepSeek V3.1

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase

164K context 671B budget

Mistral AI

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced opera

131K context standard

Z-ai

Z.ai: GLM 4.5V

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-

66K context 106B standard

Ai21

AI21: Jamba Large 1.7

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 2

256K context standard

OpenAI

OpenAI: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

128K context standard

OpenAI

OpenAI: GPT-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction f

400K context standard

OpenAI

OpenAI: GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency an

400K context standard

OpenAI

OpenAI: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to

400K context budget

OpenAI

OpenAI: gpt-oss-120b (free)

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par

131K context 117B budget

OpenAI

OpenAI: gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par

131K context 117B budget

OpenAI

OpenAI: gpt-oss-20b (free)

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimiz

131K context 21B budget

Browse 300+ AI Models

NVIDIA: Nemotron Nano 12B 2 VL (free)

MiniMax: MiniMax M2

Qwen: Qwen3 VL 32B Instruct

IBM: Granite 4.0 Micro

Microsoft: Phi 4 Mini Instruct

Anthropic: Claude Haiku 4.5

Qwen: Qwen3 VL 8B Thinking

Qwen: Qwen3 VL 8B Instruct

OpenAI: o3 Deep Research

OpenAI: o4 Mini Deep Research

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Qwen: Qwen3 VL 30B A3B Thinking

Qwen: Qwen3 VL 30B A3B Instruct

OpenAI: GPT-5 Pro

Z.ai: GLM 4.6

Anthropic: Claude Sonnet 4.5

DeepSeek: DeepSeek V3.2 Exp

TheDrummer: Cydonia 24B V4.1

Relace: Relace Apply 3

Google: Gemini 2.5 Flash Lite Preview 09-2025

Qwen: Qwen3 VL 235B A22B Thinking

Qwen: Qwen3 VL 235B A22B Instruct

Qwen: Qwen3 Max

Qwen: Qwen3 Coder Plus

OpenAI: GPT-5 Codex

DeepSeek: DeepSeek V3.1 Terminus

Qwen: Qwen3 Coder Flash

Qwen: Qwen3 Next 80B A3B Thinking

Qwen: Qwen3 Next 80B A3B Instruct (free)

Qwen: Qwen3 Next 80B A3B Instruct

Qwen: Qwen Plus 0728 (thinking)

Qwen: Qwen Plus 0728

NVIDIA: Nemotron Nano 9B V2 (free)

MoonshotAI: Kimi K2 0905

Qwen: Qwen3 30B A3B Thinking 2507

Nous: Hermes 4 70B

Nous: Hermes 4 405B

DeepSeek: DeepSeek V3.1

Mistral: Mistral Medium 3.1

Z.ai: GLM 4.5V

AI21: Jamba Large 1.7

OpenAI: GPT-5 Chat

OpenAI: GPT-5

OpenAI: GPT-5 Mini

OpenAI: GPT-5 Nano

OpenAI: gpt-oss-120b (free)

OpenAI: gpt-oss-120b

OpenAI: gpt-oss-20b (free)

Popular AI Model Comparisons

Try Any AI Model Instantly