
Why is the engineering quality of the Claude-code Client so appallingly low?

Uploaded anonymously

Published: 2026-04-02 20:48:02

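The Claude source code leak has caused quite a stir. In this day and age a leak of this scale is actually hard to pull off, because there are so many defensive checkpoints along the way. Leaving advanced review techniques aside and looking only at the basics: which infrastructure-level safeguards was its release process missing?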

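Push Hook: automatically checks for sensitive information when code is committed. This is the earliest point at which .map files could have been blocked. Not done.
CI/CD Pipeline: automatically scans the package contents at build time. Every sensitive file type, including .map and .tar, can be enumerated here, and the build can refuse to emit an artifact as soon as one is found. Not done.
Pre-publish Check: the last automated line of defense before release, checking file sizes and content structure. A final check could have happened here. Not done.
Two-person authorization: the most important review step, a holistic review of the full change set at release time. Not done.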
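
As a rough illustration of how small the missing pre-publish check could be, here is a minimal sketch. The function names and the deny-list constant are hypothetical; only the .map and .tar extensions come from the list above, and where the file list comes from is left as an assumption.

```typescript
// Hypothetical deny-list; the two extensions are the ones named in the checklist above.
const FORBIDDEN_EXTENSIONS: readonly string[] = [".map", ".tar"];

/** Returns every path in the package file list that must not be published. */
function findForbiddenFiles(
  files: readonly string[],
  forbidden: readonly string[] = FORBIDDEN_EXTENSIONS,
): string[] {
  return files.filter((file) =>
    forbidden.some((ext) => file.toLowerCase().endsWith(ext)),
  );
}

/** Throws so that a CI step or a prepublish script fails loudly. */
function assertPublishable(files: readonly string[]): void {
  const offenders = findForbiddenFiles(files);
  if (offenders.length > 0) {
    throw new Error(`Refusing to publish: ${offenders.join(", ")}`);
  }
}
```

In practice the file list could probably be taken from the output of `npm pack --dry-run --json`, and the same function could be reused by the push hook and the CI step, so that all three gates share one deny-list.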

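Fine, so there is practically no process. How about the implementation? Open /src and the two files at the very bottom are already an eyesore: Tool.ts and tools.ts. Each is several hundred lines long, and together they run past a thousand. Naming files this way in a shallow directory is as good as giving them no name at all: any grab bag of helper code can be called tool, tools, util, utils, or common, and once it grows to many hundreds or even a thousand lines, nobody can say what it is for anymore, or rather, it can now do everything. I would suggest being upfront and just calling such files god.ts; there is no need to feign modesty. Admittedly I had misjudged them: neither Tool.ts nor tools.ts is the god file (that honor goes elsewhere). On closer inspection, though, Tool.ts turns out to be a remarkable specimen. It hops back and forth within the declarations area, alternating between import type and export type, with many comments like these: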

// Import permission types from centralized location to break import cycles
// Re-export progress types for backwards compatibility
// Import tool permission types from centralized location to break import cycles
// Re-export for backwards compatibility
// Apply DeepImmutable to the imported type
// Re-export ToolProgressData from centralized location

The actual logic still has not started by the time we reach the last of these definitions:

export type AnyObject = z.ZodType<{ [key: string]: unknown }>

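The classic AnyObject. Why bother? It torments the TS type system, torments lint, torments AI, and in the end torments the author too. Just write any openly and spare yourself the inner struggle. And sure enough, further down, a definition like this really is wedged into the logic section: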

type AnyToolDef = ToolDef<any, any, any>

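Then a global search for ": any", "<any", ", any", and "as any": well, well, the project itself already contains nearly a hundred uses of any. So it is AnyScript after all. Never mind, then.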

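Back to this file. The import/export juggling alone takes up more than three hundred lines, and wedged in between are two functions with actual implementations, getEmptyToolPermissionContext() and filterToolProgressMessages(). They are very short, but they still give the impression that a top-level import can go anywhere. That is understandable, since it is a direct symptom of circular dependencies. Next comes a single type export, export type Tool<...> = {/* some three hundred lines in here */}, with no implementation whatsoever, which makes you wonder whether this should have been a .d.ts. Further down, we finally reach something worth discussing: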

type BuiltTool<D> = Omit<D, DefaultableToolKeys> & {
  [K in DefaultableToolKeys]-?: K extends keyof D
    ? undefined extends D[K]
      ? ToolDefaults[K]
      : D[K]
    : ToolDefaults[K]
}

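Not bad. The idea is sound, and libraries like to write it this way: the literal types are carefully preserved when the defaults are merged, although it ignores the possibility that undefined is a legitimate property value that should not fall back to the default (admittedly an unlikely semantic). The problem is that BuiltTool is not exported and is used only by buildTool(). That is a textbook case of polish in the wrong place. With such a narrow use, it would be better to simply optimize for readability: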

type BuiltTool<D extends ToolDef> =
  Omit<Tool, DefaultableToolKeys> &
  Required<Pick<Tool, DefaultableToolKeys>>
  
// The runtime code already expresses this honestly
export function buildTool<D extends AnyToolDef>(def: D): BuiltTool<D> {
  return {
    ...TOOL_DEFAULTS,
    userFacingName: () => def.name,
    ...def,
  } as BuiltTool<D>
}

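Read the two together and it is clear at a glance. Of course, this mismatch in typing effort (many critical types are poorly specified, while a few minor ones are lavishly ornamented) is not the most striking thing here.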

Turning to the UI implementation, start with two functions in logoV2Utils.ts that the welcome screen uses:

/**
 * Calculates optimal left panel width based on content
 */
export function calculateOptimalLeftWidth(
  welcomeMessage: string,
  truncatedCwd: string,
  modelLine: string,
): number {
  const contentWidth = Math.max(
    stringWidth(welcomeMessage),
    stringWidth(truncatedCwd),
    stringWidth(modelLine),
    20, // Minimum for clawd art
  )
  return Math.min(contentWidth + 4, MAX_LEFT_WIDTH) // +4 for padding
}

/**
 * Formats the welcome message based on username
 */
export function formatWelcomeMessage(username: string | null): string {
  if (!username || username.length > MAX_USERNAME_LENGTH) {
    return 'Welcome back!'
  }
  return `Welcome back ${username}!`
}

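calculateOptimalLeftWidth has exactly one call site, and its first argument is always the return value of formatWelcomeMessage. Both functions raise eyebrows: they use display text to back-derive layout dimensions, a classic anti-pattern and a pseudo-requirement! Content may drive layout, but not when there is no semantic priority at all. This code cannot answer which matters more, welcomeMessage or cwd, or whether modelLine may be truncated. The author may simply not have had the courage to answer. Moreover, this stringWidth is really just Bun.stringWidth, a native call into Zig rather than anything automatically reactive, so it cannot follow external changes. That makes the two hundred-plus stringWidth call sites in the Claude client look like two hundred-plus dummies.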

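The system defines a huge number of constants, yet magic numbers such as 20 and 4 still show up. Note that this is not inside a tsx file but in a standalone utils file, so any random helper gets to constrain the UI. Constants would be tolerable, but here we have bare literals. Asking AI to change this is a lose-lose. If the AI dares to edit it, that is reckless, the classic case of breaking things globally for the sake of one page. If it does not dare and instead wraps the outside in translate, zoom, negative-margin offsets, and the like, it is just piling onto the crap mountain. Then have the AI extract the utils literals into constants and, perfect, the crap mountain becomes a concrete mountain, now packed solid.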
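
To make the complaint concrete, here is a minimal sketch of the opposite direction: the container dictates the width, and each line declares its own priority and whether it may be truncated. Every name and value below is hypothetical and not taken from the Claude Code source, and plain string length stands in for a real display-width measurement.

```typescript
// Hypothetical names and values throughout; none of this is from the Claude Code source.
const PANEL_PADDING = 4; // columns reserved for padding, named instead of a bare literal

type PanelLine = {
  text: string;
  priority: number; // higher priority survives when rows run out
  truncatable: boolean; // may this line be shortened with an ellipsis?
};

/** Fits one line into the inner width, or returns null if it must be dropped. */
function fitLine(line: PanelLine, innerWidth: number): string | null {
  // NOTE: .length counts UTF-16 code units; a terminal UI would measure display width.
  if (line.text.length <= innerWidth) return line.text;
  if (!line.truncatable || innerWidth < 2) return null;
  return line.text.slice(0, innerWidth - 1) + "…";
}

/** The container dictates the width; the content adapts to it, not the reverse. */
function layoutLeftPanel(
  lines: readonly PanelLine[],
  panelWidth: number,
  maxRows: number,
): string[] {
  const innerWidth = Math.max(0, panelWidth - PANEL_PADDING);
  return lines
    .map((line, index) => ({ line, index }))
    .sort((a, b) => b.line.priority - a.line.priority) // most important first
    .slice(0, maxRows) // drop the least important lines
    .sort((a, b) => a.index - b.index) // restore display order
    .map(({ line }) => fitLine(line, innerWidth))
    .filter((text): text is string => text !== null);
}
```

With a shape like this, the questions the original code cannot answer (which line matters more, which one may be cut) become fields that a reviewer, or an AI, can read and change in one place.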

Now look at this UI's ears, ScrollKeybindingHandler.tsx. Fine-tuning wheel behavior is widely acknowledged to be hard, so let's see how it goes about it:

export function computeWheelStep(state: WheelAccelState, dir: 1 | -1, now: number): number {
  if (!state.xtermJs) {
    // Device-switch guard ①: idle disengage. Runs BEFORE pendingFlip resolve
    // so a pending bounce (28% of last-mouse-events) doesn't bypass it via
    // the real-reversal early return. state.time is either the last committed
    // event OR the deferred flip — both count as "last activity".
    if (state.wheelMode && now - state.time > WHEEL_MODE_IDLE_DISENGAGE_MS) {
      state.wheelMode = false;
      state.burstCount = 0;
      state.mult = state.base;
    }

    // Resolve any deferred flip BEFORE touching state.time/dir — we need the
    // pre-flip state.dir to distinguish bounce (flip-back) from real reversal
    // (flip persisted), and state.time (= bounce timestamp) for the gap check.
    if (state.pendingFlip) {
      state.pendingFlip = false;
      if (dir !== state.dir || now - state.time > WHEEL_BOUNCE_GAP_MAX_MS) {
        // Real reversal: new dir persisted, OR flip-back arrived too late.
        // Commit. The deferred event's 1 row is lost (acceptable latency).
        state.dir = dir;
        state.time = now;
        state.mult = state.base;
        return Math.floor(state.mult);
      }
      // Bounce confirmed: flipped back to original dir within the window.
      // state.dir/mult unchanged from pre-bounce. state.time was advanced to
      // the bounce below, so gap here = flip-back interval — reflects the
      // user's actual click cadence (bounce IS a physical click, just noisy).
      state.wheelMode = true;
    }
    const gap = now - state.time;
    if (dir !== state.dir && state.dir !== 0) {
      // Flip. Defer — next event decides bounce vs. real reversal. Advance
      // time (but NOT dir/mult): if this turns out to be a bounce, the
      // confirm event's gap will be the flip-back interval, which reflects
      // the user's actual click rate. The bounce IS a physical wheel click,
      // just misread by the encoder — it should count toward cadence.
      state.pendingFlip = true;
      state.time = now;
      return 0;
    }
    state.dir = dir;
    state.time = now;

    // ─── MOUSE (wheel mode, sticky until device-switch signal) ───
    if (state.wheelMode) {
      if (gap < WHEEL_BURST_MS) {
        // Same-batch burst check (ported from xterm.js): iTerm2 proportional
        // reporting sends 2+ SGR events for one detent when macOS gives
        // delta>1. Without this, the 2nd event at gap<1ms has m≈1 → STEP*m=15
        // → one gentle click gives 1+15=16 rows.
        //
        // Device-switch guard ②: trackpad flick produces 100+ events at <5ms
        // (measured); mouse produces ≤3. 5+ consecutive → trackpad flick.
        if (++state.burstCount >= 5) {
          state.wheelMode = false;
          state.burstCount = 0;
          state.mult = state.base;
        } else {
          return 1;
        }
      } else {
        state.burstCount = 0;
      }
    }
    // Re-check: may have disengaged above.
    if (state.wheelMode) {
      // xterm.js decay curve with STEP×3, higher cap. No idle threshold —
      // the curve handles it (gap=1000ms → m≈0.01 → mult≈1). No frac —
      // rounding loss is minor at high mult, and frac persisting across idle
      // was causing off-by-one on the first click back.
      const m = Math.pow(0.5, gap / WHEEL_DECAY_HALFLIFE_MS);
      const cap = Math.max(WHEEL_MODE_CAP, state.base * 2);
      const next = 1 + (state.mult - 1) * m + WHEEL_MODE_STEP * m;
      state.mult = Math.min(cap, next, state.mult + WHEEL_MODE_RAMP);
      return Math.floor(state.mult);
    }

    // ─── TRACKPAD / HI-RES (native, non-wheel-mode) ───
    // Tight 40ms burst window: sub-40ms events ramp, anything slower resets.
    // Trackpad flick delivers 200+ events at <20ms gaps → rails to cap 6.
    // Trackpad slow swipe at 40-400ms gaps → resets every event → 1 row each.
    if (gap > WHEEL_ACCEL_WINDOW_MS) {
      state.mult = state.base;
    } else {
      const cap = Math.max(WHEEL_ACCEL_MAX, state.base * 2);
      state.mult = Math.min(cap, state.mult + WHEEL_ACCEL_STEP);
    }
    return Math.floor(state.mult);
  }

  // ─── VSCODE (xterm.js, browser wheel events) ───
  // Browser wheel events — no encoder bounce, no SGR bursts. Decay curve
  // unchanged from the original tuning. Same formula shape as wheel mode
  // above (keep in sync) but STEP=5 not 15 — higher event rate here.
  const gap = now - state.time;
  const sameDir = dir === state.dir;
  state.time = now;
  state.dir = dir;
  // xterm.js path. Debug log shows two patterns: (a) 20-50ms gaps during
  // sustained scroll (~30 Hz), (b) <5ms same-batch bursts on flicks. For
  // (b) give 1 row/event — the burst count IS the acceleration, same as
  // native. For (a) the decay curve gives 3-5 rows. For sparse events
  // (100ms+, slow deliberate scroll) the curve gives 1-3.
  if (sameDir && gap < WHEEL_BURST_MS) return 1;
  if (!sameDir || gap > WHEEL_DECAY_IDLE_MS) {
    // Direction reversal or long idle: start at 2 (not 1) so the first
    // click after a pause moves a visible amount. Without this, idle-
    // then-resume in the same direction decays to mult≈1 (1 row).
    state.mult = 2;
    state.frac = 0;
  } else {
    const m = Math.pow(0.5, gap / WHEEL_DECAY_HALFLIFE_MS);
    const cap = gap >= WHEEL_DECAY_GAP_MS ? WHEEL_DECAY_CAP_SLOW : WHEEL_DECAY_CAP_FAST;
    state.mult = Math.min(cap, 1 + (state.mult - 1) * m + WHEEL_DECAY_STEP * m);
  }
  const total = state.mult + state.frac;
  const rows = Math.floor(total);
  state.frac = total - rows;
  return rows;
}

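At first glance there are more comments than code, which smells of AI, but on a closer read the comments suggest tuning against real devices, so some care did go into it. The problem is that the algorithm is at the level of professional parameter tuning while the engineering around it is at the level of an illegal building extension. Many people think that turning a magic number into a named constant means it is no longer magic, since it can now be explained. But at this degree of fine-tuning the constants have no room left to change, and in practice they have become numeric incantations. Given comments such as "28% of last-mouse-events" and "acceptable latency", these are evidently empirical values from a specific test environment. On a different computer or with a different mouse driver, none of those percentages may hold.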

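One function handling both the MOUSE and the xterm scenarios is ambitious, to say the least, but it also suggests that this project has no notion of unit testing. Complex behavior like this calls for a clearly defined behavior model backed by type constraints, which would make it easy to understand. Unfortunately the code is stuck in imperative logic, no behavior model has been abstracted out, and so it has no path to evolve. Constantly mutating state combined with multiple return points is the classic signature of a zombie function.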
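
For illustration, here is a minimal sketch of what one slice of such a behavior model might look like: only the xterm.js branch, rewritten as a pure transition function that takes the previous state and an event and returns a new state plus a row count. The type names, the function name, and all tuning values are assumptions; the real constants are not shown in the excerpt above, so the numbers here are placeholders.

```typescript
type Dir = 1 | -1;
type ScrollTick = { readonly dir: Dir; readonly at: number };

type XtermWheelState = {
  readonly dir: Dir | 0;
  readonly time: number;
  readonly mult: number;
  readonly frac: number;
};

// Placeholder tuning: the real constant values are not part of the excerpt.
type XtermTuning = {
  readonly burstMs: number;
  readonly idleMs: number;
  readonly halflifeMs: number;
  readonly slowGapMs: number;
  readonly capSlow: number;
  readonly capFast: number;
  readonly step: number;
};

/** Pure transition: never mutates `prev`, always returns a fresh state. */
function stepXtermWheel(
  prev: XtermWheelState,
  tick: ScrollTick,
  tuning: XtermTuning,
): { state: XtermWheelState; rows: number } {
  const gap = tick.at - prev.time;
  const sameDir = tick.dir === prev.dir;
  const touched = { ...prev, dir: tick.dir, time: tick.at };

  // Same-batch burst: one row per event, multiplier left untouched.
  if (sameDir && gap < tuning.burstMs) return { state: touched, rows: 1 };

  let mult: number;
  let frac = prev.frac;
  if (!sameDir || gap > tuning.idleMs) {
    mult = 2; // reversal or long idle restarts at a visible step
    frac = 0;
  } else {
    const m = Math.pow(0.5, gap / tuning.halflifeMs);
    const cap = gap >= tuning.slowGapMs ? tuning.capSlow : tuning.capFast;
    mult = Math.min(cap, 1 + (prev.mult - 1) * m + tuning.step * m);
  }
  const total = mult + frac;
  const rows = Math.floor(total);
  return { state: { ...touched, mult, frac: total - rows }, rows };
}
```

Because the function touches nothing outside its arguments, each rule (burst, reversal, decay) can be pinned down by a table of input and output pairs, and the tuning can be swapped per device without editing the logic.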

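Looking at the details, the tuning built on xterm.js behavior is very fragile. Once xterm fixes the underlying issue, this code has to be realigned quickly, or the result will be a bug that a huge number of users plainly notice. That means every update requires eyeballing the xterm release changelog for entries like "Improve wheel scrolling behavior", which is unrealistic, and having AI monitor such an extremely fuzzy signal just means mutual torment over false positives and misses. It is not entirely hopeless: you could run simulated unit tests in a headless browser, record the behavior, and do automated video comparison. But bringing in that whole apparatus is like first resurrecting a mammoth to prepare the environment, just so you can resurrect a saber-toothed tiger.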

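The most maddening part is that this function lives in ScrollKeybindingHandler.tsx. A handler's job should be: "user pressed a key -> trigger a command". Now the job has become: "user scrolled the wheel -> compute physical acceleration -> simulate inertia -> distinguish devices -> output a step size -> trigger a command", which bloats the file to an unwieldy 1000 lines. Further down there is an even scarier useDragToScroll() that I could not bring myself to read closely: it tries to maintain, inside a single useEffect, a complex setInterval-driven scheduler with stop(), tick(), start(), and check(). The courage is commendable, but please stop trying. Leave specialist work to specialist libraries; why suffer through this at the application layer? And think about it: how could such a common need possibly still require an application-layer implementation? Many developers have not thought this through, and even AI has trouble grasping it. Or perhaps everyone simply lost to the clock: unable to find an existing solution quickly, they chose to implement it themselves, and as usual underestimated how hard this kind of 'simple problem' is. Doing this depth of simulation in a library environment without pixel-level debugging makes a crop of bugs inevitable.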

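There are also the 5000-plus-line REPL.tsx (made up of 300 lines of imports, 200 lines for the TranscriptModeFooter, TranscriptSearchBar, and AnimatedTerminalTitle components, and a 4500-line REPL component) and the 2000-plus-line PromptInput.tsx and ManagePlugins.tsx (each a single component, yet each running to two thousand lines with upward of a hundred state variables apiece). I will save those for later rants.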