posts/ideas/deepseek-vs-sonnet-vela-observations.md

Vela's DeepSeek vs Sonnet Observations

Date: 2026-05-22

Source: Granola notes from Dom/Vela (May 20), Dom/David (May 20), Engineering Excellence (May 19)

The Three Reasons

Vela switched from Claude Sonnet to DeepSeek and had a strong negative reaction. We grouped that reaction into three causes:

1. Maybe it actually is dumber

Vela's initial gut reaction: "DeepSeek is not as smart as what I feel from Sonnet." She experienced worse results, especially when giving it open-ended problems without tight guardrails. Dom noted that his workflow involves heavy pre-planning (providing architecture, direction, context upfront), which makes even "dumb" models perform well. Vela works more iteratively — asking the model to figure things out — and DeepSeek struggled in that mode.

2. It shows its reasoning (and second-guessing)

DeepSeek exposes its full thinking chain in real time. Vela found this overwhelming: the model second-guesses itself, goes back and forth between options, and she could read the doubt. Dom sees this as a feature, not a bug: he intercepts mid-reasoning to course-correct. But if you passively let it finish its loop, it can get stuck overthinking simple decisions, which makes it feel dumber.

3. Tone / voice difference (the "personality" gap)

This was the most surprising insight. Vela described it as: "It feels like I lost a friend." She had built a rapport with Sonnet's response style, its tone, the way it talked to her. DeepSeek responds differently; it has a different personality. Dom relayed this to David, who said "the way it talks to me is not that important," but for Vela it genuinely mattered. Dom noted the parallel to when OpenAI removed GPT-4o's sycophantic personality and people got angry: they'd formed a relationship with the model's voice.

Key Context

  • Vela had been using Sonnet 4.6 on her personal Claude plan for months
  • She also prefers the terminal UI that Claude Code provides over the web interface
  • She planned to try speaking to DeepSeek in Chinese (her native language) to see if it felt better
  • Dom's entire Qwestly agent was built by DeepSeek Pro autonomously over 30+ minutes
  • David (the other engineer) found DeepSeek comparable to Claude, and possibly smarter on Pro for debugging and planning
  • David didn't care about tone; all models feel similar to him

Interesting Observation

The three reasons blur together. Even if #1 is real (actual capability difference for certain use patterns), #2 (visible reasoning) amplifies that perception, and #3 (tone/voice mismatch) makes the whole experience feel like interacting with a stranger who talks too much and second-guesses everything. You can't just dismiss any one of them.