
Inception: Mercury 2

Mercury 2 is an extremely fast reasoning LLM and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. It is 5x+ faster than leading speed-optimized LLMs such as Claude Haiku 4.5 and GPT-5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, a 128K context window, native tool use, and schema-aligned JSON output. It is built for coding workflows where latency compounds, for real-time voice and search, and for agent loops, and it is OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).
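Because Mercury 2 is OpenAI API compatible, you can typically call it with an existing OpenAI SDK client pointed at a compatible endpoint. The sketch below is a minimal example; the `base_url` and model id are assumptions, so substitute the values documented by your provider.

```python
# Minimal sketch of calling Mercury 2 through an OpenAI-compatible endpoint.
# The base_url and model id below are assumptions; use the values documented
# by your provider (Inception directly, or an aggregator).
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="inception/mercury-2",  # hypothetical model id
    messages=[
        {
            "role": "user",
            "content": "Summarize the tradeoffs of diffusion LLMs in three bullets.",
        }
    ],
)

print(response.choices[0].message.content)
```

Features such as tool use and schema-aligned JSON output are exposed through the same OpenAI-style request parameters, subject to what your provider's endpoint supports.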

| Spec | Value |
| --- | --- |
| Context Window | 128K tokens |
| Parameters | N/A |
| Input Price | $0.25 / 1M tokens |
| Output Price | $0.75 / 1M tokens |
| Price Tier | Budget |
| Provider | Inception |
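
At these rates, per-request cost is easy to estimate: token count divided by one million, times the per-million price. A quick sketch using the prices listed above:

```python
# Estimate the cost of a single Mercury 2 request at the listed prices.
INPUT_PRICE_PER_M = 0.25   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.75  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return (
        (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
        + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    )

# Example: a 10,000-token prompt with a 2,000-token completion
# costs 0.0025 + 0.0015 = 0.004 USD.
print(f"${request_cost(10_000, 2_000):.4f}")
```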

How to Use Inception: Mercury 2

With CoreAI, you can start chatting with Inception: Mercury 2 instantly — no separate subscription needed. CoreAI bundles access to Inception: Mercury 2 along with 300+ other AI models from Inception and other providers like OpenAI, Anthropic, Google, Meta, and more.

  1. Download the CoreAI app for iOS or Android, or use the Web App
  2. Select Inception: Mercury 2 from the model selector
  3. Start chatting, comparing, or creating with AI

More Inception Models

Inception: Mercury

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized LLMs. 128K context, budget tier.

Inception: Mercury Coder

Mercury Coder is the first diffusion large language model (dLLM) for code. Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized LLMs. 128K context, budget tier.

Try Inception: Mercury 2 Now

Chat with Inception: Mercury 2 and 300+ other AI models — all in one app.

Download the app or try it on the Web App.