AI/deepseek.md

DeepSeek

Models

DeepSeek V4 Flash

Field Value
Model ID deepseek-chat (or deepseek-v4-flash)
Input $0.14 / M tokens
Output $0.28 / M tokens
Context 1M tokens
Role Budget / high-volume — default workhorse. Good for everyday coding, chat, iteration.
Cached input $0.014 / M tokens (with prefix-cache hit)

DeepSeek V4 Pro

Field Value
Model ID deepseek-reasoner (or deepseek-v4-pro)
Input $0.435 / M tokens
Output $0.87 / M tokens
Context 1M tokens
Role Open-weight frontier — stronger reasoning, deeper analysis. Use for complex architecture, debugging, multi-step planning.
Cached input ~$0.0435 / M tokens
Promo 75% off until May 31, 2026

For reference, the pricing table (as of May 3, 2026) lives in api-keys.md.


Claude Code with DeepSeek backend

Claude Code can use DeepSeek as its model provider by overriding the Anthropic API base URL. This works because DeepSeek exposes an Anthropic-compatible Messages API endpoint.

Tools are harness-provided, not model-provided

All Claude Code tools (file read/edit/write, bash, glob, grep, git, agents, cron, skills, etc.) are provided by the Claude Code harness — not by the underlying model. When you switch to DeepSeek (or any other provider), you keep the full tool suite. The model handles reasoning and response generation; the harness handles tool execution, permissions, and session management.

Chain: Avante → Claude Code (ACP) → DeepSeek

Our setup routes through three layers:

Avante.nvim → Claude Code ACP wrapper → DeepSeek API (Anthropic-compatible endpoint)
Layer What it does Config
Avante.nvim Neovim AI plugin; spawns Claude Code in ACP mode ~/.config/nvim/lua/plugins/avante.luaclaude_code profile via AVANTE_PROVIDER=claude-code
Claude Code ACP Anthropic's Agent Client Protocol wrapper; exposes Claude Code as an ACP server npx @agentclientprotocol/claude-agent-acp
Claude Code settings Overrides Anthropic endpoints to point at DeepSeek ~/.claude/settings.json

settings.json overrides

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "sk-...",
    "ANTHROPIC_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash[1m]",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_EFFORT_LEVEL": "max"
  }
}

All model slots point to DeepSeek V4 Pro (with Flash as the Haiku fallback). Non-essential traffic is disabled since telemetry/analytics endpoints don't exist on DeepSeek's side.

Important: model name suffix matters

DeepSeek's Anthropic endpoint expects model names with the [1m] suffix (e.g., deepseek-v4-pro[1m]) — this encodes the 1M-token context window. Without the suffix, the API returns an error.

What you lose with a non-Anthropic backend

Feature Works? Notes
Tools (files, bash, search, git, etc.) Yes Harness-provided
Skills Yes Harness-provided
Agents Yes Harness-provided
Prompt caching No DeepSeek uses its own prefix-cache (byte-stable), incompatible with Anthropic's ephemeral cache
Extended thinking No Anthropic-specific feature; DeepSeek has its own reasoning mechanism via reasoning_effort
Non-essential traffic (telemetry, updates) N/A Disabled via CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

Launching

# Direct Claude Code with DeepSeek backend
claude

# Via Avante with the claude-code profile
AVANTE_PROVIDER=claude_code nvim
# or the alias:
vim-claude