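The Claude source code leak caused quite a stir. These days a leak of this scale is actually hard to pull off, because there are so many defensive checkpoints along the way. Setting the advanced review measures aside and looking only at the basics: which infrastructure-level safeguards was its release process missing?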

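  • Push Hook: automatically checks for sensitive information when code is committed. Blocking .map here would have been the earliest opportunity. Not done.
  • CI/CD Pipeline: automatically scans package contents at build time. Every sensitive file type, including .map and .tar, can be listed here, and the build can refuse to emit artifacts as soon as one is found. Not done.
  • Pre-publish Check: the last automated line of defense before publishing, checking file sizes and content structure. One final check could have happened here. Not done.
  • Two-person authorization: the most important review, a holistic look at the whole change set at release time. Not done.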
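
The CI and pre-publish checks listed above can be sketched in a few lines. This is a minimal illustration, not the project's actual tooling: checkPublishManifest and its default suffix list are hypothetical names, and the file list is assumed to come from whatever the packaging step reports (for npm, something like the output of npm pack --dry-run).

```typescript
// Hypothetical pre-publish gate: reject a package whose file list
// contains any forbidden artifact type (names here are illustrative).
type PublishCheck = { ok: boolean; offenders: string[] }

function checkPublishManifest(
  files: string[],
  forbiddenSuffixes: string[] = ['.map', '.tar'], // the two types named in the text
): PublishCheck {
  // Case-insensitive suffix match so "CLI.JS.MAP" is caught as well.
  const offenders = files.filter(file =>
    forbiddenSuffixes.some(suffix => file.toLowerCase().endsWith(suffix)),
  )
  return { ok: offenders.length === 0, offenders }
}

const report = checkPublishManifest(['dist/cli.js', 'dist/cli.js.map', 'README.md'])
if (!report.ok) {
  // A real pipeline would exit non-zero here instead of just logging.
  console.error(`refusing to publish: ${report.offenders.join(', ')}`)
}
```

The same function could run in both the CI job and the publish script, so the two layers cannot drift apart.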

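Fine, so the process barely exists. How about the implementation? Open /src and the two files at the very bottom are already an eyesore: Tool.ts and tools.ts. Each turns out to be several hundred lines, over a thousand lines combined. Naming files like this at a shallow directory level is as good as not naming them at all: any grab bag of helper code can be called tool, tools, util, utils, or common. Once such a file reaches many hundreds or even a thousand lines, nobody can say what it is for, or rather, it has become omnipotent. I'd suggest being frank about files like this and just calling them god.ts; no need to feign modesty. Admittedly, that was my misreading: neither Tool.ts nor tools.ts is the god file (that honor belongs to someone else). On a closer look, Tool.ts has a remarkable skeleton. It hops back and forth in the definition area, alternating import type and export type in a loop, and you can see many comments like these: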

// Import permission types from centralized location to break import cycles
// Re-export progress types for backwards compatibility
// Import tool permission types from centralized location to break import cycles
// Re-export for backwards compatibility
// Apply DeepImmutable to the imported type
// Re-export ToolProgressData from centralized location

The actual logic still hasn't started, right up to this last definition:

export type AnyObject = z.ZodType<{ [key: string]: unknown }>

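The classic AnyObject. Why bother? It torments the TS type system, torments lint, torments AI, and in the end torments yourself. Just write any openly; there is no need for this self-inflicted attrition. And sure enough, further down, interleaved in the logic area, there is a definition like this: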

type AnyToolDef = ToolDef<any, any, any>

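Then a global search for ": any", "<any", ", any", and "as any" shows that, well, the project itself already contains nearly a hundred anys. So it's AnyScript. Never mind, then.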

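Back to this file. The import/export shuffle alone goes on for more than three hundred lines, with two functions that have actual implementations, getEmptyToolPermissionContext() and filterToolProgressMessages(), interleaved in the middle. They are very short, but they leave the impression that a top-level import can go anywhere. That is understandable, though: it is the direct consequence of circular dependencies. Next comes a single type, export type Tool<...> = {/* three-hundred-odd lines in here */}, with no implementation at all, which makes one wonder whether this shouldn't be a .d.ts. Further down, there is finally something worth discussing: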

type BuiltTool<D> = Omit<D, DefaultableToolKeys> & {
  [K in DefaultableToolKeys]-?: K extends keyof D
    ? undefined extends D[K]
      ? ToolDefaults[K]
      : D[K]
    : ToolDefaults[K]
}

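Not bad. The idea is good, and libraries like to write it this way. When merging default parameters it carefully preserves literal types, although it ignores the possibility that undefined is a legitimate prop value that should not be replaced by the default (admittedly, such semantics are unlikely). The problem is that this BuiltTool type has a single consumer, buildTool(). It is ornate work wasted on a narrow use; with a scope this small, plain readability would serve better: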

type BuiltTool<D extends ToolDef> =
  Omit<Tool, DefaultableToolKeys> &
  Required<Pick<Tool, DefaultableToolKeys>>
  
// The runtime already expresses this honestly
export function buildTool<D extends AnyToolDef>(def: D): BuiltTool<D> {
  return {
    ...TOOL_DEFAULTS,
    userFacingName: () => def.name,
    ...def,
  } as BuiltTool<D>
}

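Read side by side, it is clear at a glance. Of course, this mismatch in type rigor (many key types are poorly developed, while a few details get ornate type carving piled on) is not the most lamentable part.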
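
The undefined caveat mentioned above is easy to see at runtime. The sketch below uses hypothetical, heavily simplified stand-ins (MiniToolDef, MINI_DEFAULTS, buildMiniTool) rather than the project's real types; it only mirrors the spread order of buildTool(), defaults first and the definition second.

```typescript
// Hypothetical, simplified stand-ins; not the project's real Tool types.
type MiniToolDef = {
  name: string
  isEnabled?: (() => boolean) | undefined
}

const MINI_DEFAULTS = {
  isEnabled: (): boolean => true,
}

// Same spread order as buildTool(): defaults first, then the definition.
function buildMiniTool(def: MiniToolDef) {
  return { ...MINI_DEFAULTS, ...def }
}

// Key omitted entirely: the default survives the spread.
const omitted = buildMiniTool({ name: 'a' })

// Key present with an explicit undefined: the spread copies it over the
// default, even though the inferred type may still suggest a function.
const explicitUndefined = buildMiniTool({ name: 'b', isEnabled: undefined })

console.log(typeof omitted.isEnabled)           // "function"
console.log(typeof explicitUndefined.isEnabled) // "undefined"
```

So the conditional type and the runtime disagree in exactly the case the type chooses to ignore, which is one more argument for keeping the type simple and letting the runtime spread be the honest description.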

Turning to the UI design and implementation, start with two functions in logoV2Utils.ts that the UI's welcome page uses:

/**
 * Calculates optimal left panel width based on content
 */
export function calculateOptimalLeftWidth(
  welcomeMessage: string,
  truncatedCwd: string,
  modelLine: string,
): number {
  const contentWidth = Math.max(
    stringWidth(welcomeMessage),
    stringWidth(truncatedCwd),
    stringWidth(modelLine),
    20, // Minimum for clawd art
  )
  return Math.min(contentWidth + 4, MAX_LEFT_WIDTH) // +4 for padding
}

/**
 * Formats the welcome message based on username
 */
export function formatWelcomeMessage(username: string | null): string {
  if (!username || username.length > MAX_USERNAME_LENGTH) {
    return 'Welcome back!'
  }
  return `Welcome back ${username}!`
}

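calculateOptimalLeftWidth is used in only one place, and its first argument is always the return value of formatWelcomeMessage. Both functions make me frown: they use "display text" to back-derive "layout size", a classic anti-pattern and pseudo-requirement! Content can drive layout, but not in the complete absence of semantic priority. This code cannot answer which matters more, welcomeMessage or cwd, or whether modelLine may be truncated. Whoever wrote it may simply not have had the courage to answer. On top of that, this stringWidth is really just Bun.stringWidth, which is invoked through Zig FFI and is not automatically reactive, so it cannot follow external changes. That leaves the two-hundred-plus stringWidth call sites in the Claude client looking like two-hundred-plus dummies.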
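
One way to make the code answer those two questions is to state the policy in the data. The sketch below is an assumption-laden illustration, not a drop-in replacement: PanelLine, truncateTo, and layoutPanel are hypothetical names, and text.length stands in for a real display-width function, which is only correct for plain ASCII.

```typescript
// Hypothetical model: each line declares whether it may be truncated.
type PanelLine = { text: string; truncatable: boolean }

// Cut text to the given width, marking the cut with an ellipsis.
// Assumption: one character equals one terminal column (ASCII only).
function truncateTo(text: string, width: number): string {
  if (text.length <= width) return text
  return width <= 1 ? text.slice(0, width) : text.slice(0, width - 1) + '…'
}

function layoutPanel(lines: PanelLine[], minWidth: number, maxWidth: number) {
  // Only lines that must not be truncated are allowed to drive the width.
  const mustFit = lines.filter(l => !l.truncatable).map(l => l.text.length)
  const width = Math.min(maxWidth, Math.max(minWidth, ...mustFit))
  // Everything is cut to the final width; a non-truncatable line is only
  // ever cut when it exceeds maxWidth itself.
  return { width, rendered: lines.map(l => truncateTo(l.text, width)) }
}

const result = layoutPanel(
  [
    { text: 'Welcome back Ada!', truncatable: false },
    { text: '/home/ada/projects/very/long/path', truncatable: true },
    { text: 'model-x', truncatable: true },
  ],
  10,
  40,
)
console.log(result.width, result.rendered)
```

Here the welcome message decides the width and the path yields to it. Whether that is the right priority is a product decision, but at least it is written down where a reader can find it.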

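The system defines a huge number of constants, yet magic numbers like 20 and 4 still show up. Note that this is not inside a .tsx file but in a detached utils file, so any random bystander gets to constrain the UI. Constants I could put up with, but now there are literals too. Asking AI to change this is a dead end either way. If the AI dares to change it, that is far too bold, the classic case of breaking things globally for the sake of one page. If the AI doesn't dare, and instead wraps things in translate, zoom, negative-margin offsets, and similar tricks, it is just piling onto the mountain of crap. Then have the AI extract the utils literals into constants: perfect, the mountain of crap becomes a mountain of concrete, now tamped down solid.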

Now look at this UI's ears, ScrollKeybindingHandler.tsx. Fine-tuning wheel behavior is widely acknowledged to be hard, so let's see how it goes about it:

export function computeWheelStep(state: WheelAccelState, dir: 1 | -1, now: number): number {
  if (!state.xtermJs) {
    // Device-switch guard ①: idle disengage. Runs BEFORE pendingFlip resolve
    // so a pending bounce (28% of last-mouse-events) doesn't bypass it via
    // the real-reversal early return. state.time is either the last committed
    // event OR the deferred flip — both count as "last activity".
    if (state.wheelMode && now - state.time > WHEEL_MODE_IDLE_DISENGAGE_MS) {
      state.wheelMode = false;
      state.burstCount = 0;
      state.mult = state.base;
    }

    // Resolve any deferred flip BEFORE touching state.time/dir — we need the
    // pre-flip state.dir to distinguish bounce (flip-back) from real reversal
    // (flip persisted), and state.time (= bounce timestamp) for the gap check.
    if (state.pendingFlip) {
      state.pendingFlip = false;
      if (dir !== state.dir || now - state.time > WHEEL_BOUNCE_GAP_MAX_MS) {
        // Real reversal: new dir persisted, OR flip-back arrived too late.
        // Commit. The deferred event's 1 row is lost (acceptable latency).
        state.dir = dir;
        state.time = now;
        state.mult = state.base;
        return Math.floor(state.mult);
      }
      // Bounce confirmed: flipped back to original dir within the window.
      // state.dir/mult unchanged from pre-bounce. state.time was advanced to
      // the bounce below, so gap here = flip-back interval — reflects the
      // user's actual click cadence (bounce IS a physical click, just noisy).
      state.wheelMode = true;
    }
    const gap = now - state.time;
    if (dir !== state.dir && state.dir !== 0) {
      // Flip. Defer — next event decides bounce vs. real reversal. Advance
      // time (but NOT dir/mult): if this turns out to be a bounce, the
      // confirm event's gap will be the flip-back interval, which reflects
      // the user's actual click rate. The bounce IS a physical wheel click,
      // just misread by the encoder — it should count toward cadence.
      state.pendingFlip = true;
      state.time = now;
      return 0;
    }
    state.dir = dir;
    state.time = now;

    // ─── MOUSE (wheel mode, sticky until device-switch signal) ───
    if (state.wheelMode) {
      if (gap < WHEEL_BURST_MS) {
        // Same-batch burst check (ported from xterm.js): iTerm2 proportional
        // reporting sends 2+ SGR events for one detent when macOS gives
        // delta>1. Without this, the 2nd event at gap<1ms has m≈1 → STEP*m=15
        // → one gentle click gives 1+15=16 rows.
        //
        // Device-switch guard ②: trackpad flick produces 100+ events at <5ms
        // (measured); mouse produces ≤3. 5+ consecutive → trackpad flick.
        if (++state.burstCount >= 5) {
          state.wheelMode = false;
          state.burstCount = 0;
          state.mult = state.base;
        } else {
          return 1;
        }
      } else {
        state.burstCount = 0;
      }
    }
    // Re-check: may have disengaged above.
    if (state.wheelMode) {
      // xterm.js decay curve with STEP×3, higher cap. No idle threshold —
      // the curve handles it (gap=1000ms → m≈0.01 → mult≈1). No frac —
      // rounding loss is minor at high mult, and frac persisting across idle
      // was causing off-by-one on the first click back.
      const m = Math.pow(0.5, gap / WHEEL_DECAY_HALFLIFE_MS);
      const cap = Math.max(WHEEL_MODE_CAP, state.base * 2);
      const next = 1 + (state.mult - 1) * m + WHEEL_MODE_STEP * m;
      state.mult = Math.min(cap, next, state.mult + WHEEL_MODE_RAMP);
      return Math.floor(state.mult);
    }

    // ─── TRACKPAD / HI-RES (native, non-wheel-mode) ───
    // Tight 40ms burst window: sub-40ms events ramp, anything slower resets.
    // Trackpad flick delivers 200+ events at <20ms gaps → rails to cap 6.
    // Trackpad slow swipe at 40-400ms gaps → resets every event → 1 row each.
    if (gap > WHEEL_ACCEL_WINDOW_MS) {
      state.mult = state.base;
    } else {
      const cap = Math.max(WHEEL_ACCEL_MAX, state.base * 2);
      state.mult = Math.min(cap, state.mult + WHEEL_ACCEL_STEP);
    }
    return Math.floor(state.mult);
  }

  // ─── VSCODE (xterm.js, browser wheel events) ───
  // Browser wheel events — no encoder bounce, no SGR bursts. Decay curve
  // unchanged from the original tuning. Same formula shape as wheel mode
  // above (keep in sync) but STEP=5 not 15 — higher event rate here.
  const gap = now - state.time;
  const sameDir = dir === state.dir;
  state.time = now;
  state.dir = dir;
  // xterm.js path. Debug log shows two patterns: (a) 20-50ms gaps during
  // sustained scroll (~30 Hz), (b) <5ms same-batch bursts on flicks. For
  // (b) give 1 row/event — the burst count IS the acceleration, same as
  // native. For (a) the decay curve gives 3-5 rows. For sparse events
  // (100ms+, slow deliberate scroll) the curve gives 1-3.
  if (sameDir && gap < WHEEL_BURST_MS) return 1;
  if (!sameDir || gap > WHEEL_DECAY_IDLE_MS) {
    // Direction reversal or long idle: start at 2 (not 1) so the first
    // click after a pause moves a visible amount. Without this, idle-
    // then-resume in the same direction decays to mult≈1 (1 row).
    state.mult = 2;
    state.frac = 0;
  } else {
    const m = Math.pow(0.5, gap / WHEEL_DECAY_HALFLIFE_MS);
    const cap = gap >= WHEEL_DECAY_GAP_MS ? WHEEL_DECAY_CAP_SLOW : WHEEL_DECAY_CAP_FAST;
    state.mult = Math.min(cap, 1 + (state.mult - 1) * m + WHEEL_DECAY_STEP * m);
  }
  const total = state.mult + state.frac;
  const rows = Math.floor(total);
  state.frac = total - rows;
  return rows;
}

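At first glance there are more comments than code, which smells of AI. Read carefully, though, the comments suggest this was tuned against real devices, and real care went into it. The problem is that the algorithm itself is "professional tuning grade" while the engineering around it is "unlicensed shack extension grade". Many people feel that turning a magic number into a constant makes it no longer magic, since it can now be explained. But at this level of fine tuning the constants have no room left to change; they have become de facto numeric incantations. Judging by comments such as "28% of last-mouse-events" and "acceptable latency", these are clearly empirical values obtained in a specific test environment. On a different computer or with a different mouse driver, those percentages may not hold at all.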
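
If the numbers are environment-specific, one modest step is to make the environment an explicit input. The sketch below reuses the formula shape from the quoted decay curve (1 + (mult - 1) * m + STEP * m, with m = 0.5^(gap / half-life)), but everything else is an assumption: WheelTuning and decayedMultiplier are hypothetical names, and the values in exampleTuning are placeholders, not the project's constants.

```typescript
// Hypothetical tuning profile: the constants travel together with a note
// about where they were measured, so another environment can supply its own.
type WheelTuning = {
  readonly halfLifeMs: number
  readonly step: number
  readonly cap: number
  readonly measuredOn: string
}

// Same formula shape as the quoted decay curve, parameterized by profile.
function decayedMultiplier(prevMult: number, gapMs: number, tuning: WheelTuning): number {
  const m = Math.pow(0.5, gapMs / tuning.halfLifeMs)
  return Math.min(tuning.cap, 1 + (prevMult - 1) * m + tuning.step * m)
}

// Placeholder values for illustration only.
const exampleTuning: WheelTuning = {
  halfLifeMs: 150,
  step: 5,
  cap: 6,
  measuredOn: 'placeholder profile, not real measurements',
}

// One half-life later: m = 0.5, so 1 + 2 * 0.5 + 5 * 0.5 = 4.5.
const afterOneHalfLife = decayedMultiplier(3, 150, exampleTuning)
// After a long idle, m is tiny and the multiplier falls back to about 1.
const afterLongIdle = decayedMultiplier(3, 3000, exampleTuning)
console.log(afterOneHalfLife, afterLongIdle)
```

This does not make the values any less empirical, but it turns "change the constants" into "add a profile", which is a much safer edit.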

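Having one function cover both the MOUSE and xterm scenarios is ambitious, to put it kindly, but it also suggests the project has no concept of unit testing. Behavior this complex should be given a clear behavioral model backed by type constraints, which would make it easy to understand. Unfortunately it is still stuck in imperative logic with no behavior model abstracted out, so it has no way to evolve. Constantly mutating state plus multiple return points: the classic hallmarks of a zombie function.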
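
As a rough idea of what a testable shape could look like, here is a pure transition function for just the trackpad branch of the quoted code (reset outside the acceleration window, ramp inside it). It is a simplified sketch under stated assumptions: AccelState, AccelConfig, stepTrackpad, and replay are hypothetical names, the cap is a plain number rather than max(WHEEL_ACCEL_MAX, base * 2), and the config values are placeholders.

```typescript
// Hypothetical immutable state and config for the trackpad branch only.
type AccelState = { readonly lastEventMs: number; readonly mult: number }
type AccelConfig = {
  readonly base: number
  readonly windowMs: number
  readonly step: number
  readonly cap: number
}

// Pure transition: no mutation, one return, new state plus rows to scroll.
function stepTrackpad(
  state: AccelState,
  nowMs: number,
  cfg: AccelConfig,
): { state: AccelState; rows: number } {
  const gap = nowMs - state.lastEventMs
  const mult = gap > cfg.windowMs ? cfg.base : Math.min(cfg.cap, state.mult + cfg.step)
  return { state: { lastEventMs: nowMs, mult }, rows: Math.floor(mult) }
}

// Replaying a recorded list of timestamps becomes a one-line unit test.
function replay(timestamps: number[], cfg: AccelConfig): number[] {
  let state: AccelState = { lastEventMs: Number.NEGATIVE_INFINITY, mult: cfg.base }
  const rows: number[] = []
  for (const t of timestamps) {
    const result = stepTrackpad(state, t, cfg)
    state = result.state
    rows.push(result.rows)
  }
  return rows
}

// Placeholder config: three fast events ramp up, a long pause resets.
const trace = replay([0, 10, 20, 30, 500], { base: 1, windowMs: 40, step: 1, cap: 3 })
console.log(trace) // [1, 2, 3, 3, 1]
```

The mouse and xterm paths could each get their own transition function with the same signature, with device detection choosing among them, so each path can be tested against its own recorded traces.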

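Looking at the details, the tuning built on top of xterm.js is very fragile. Once xterm fixes the underlying issue, this code has to catch up quickly, or it turns into a bug that a huge number of users will plainly notice. That means every update requires eyeballing the xterm release changelog for something like "Improve wheel scrolling behavior", which is unrealistic, and having AI monitor such an extremely fuzzy signal just means mutual torment through false positives and misses. It is not entirely hopeless, of course: you could run simulated unit tests in a headless browser, record the behavior, and do automated video comparison. But bringing in that whole apparatus is like resurrecting a mammoth first to prepare the habitat for resurrecting a saber-toothed tiger.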

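What drives me craziest is that this function lives in ScrollKeybindingHandler.tsx. A handler's job should be: "user pressed a key -> trigger a command". Now the job has become: "user scrolled the wheel -> compute physical acceleration -> simulate inertia -> distinguish devices -> output step size -> trigger a command", which leaves the file bloated (1000 lines). Further down there is the even scarier useDragToScroll(), which I couldn't bear to read closely: inside a single useEffect it tries to maintain a complex scheduler with stop(), tick(), start(), and check() driving a setInterval. The courage deserves credit, but please stop trying; leave specialist work to specialist libraries. Why solve this kind of thing at the application layer? And think about it: how could such a common need still require an application-layer implementation? Not only have many developers failed to think this through, even AI has trouble grasping it. Or perhaps everyone simply lost to the clock: unable to find an existing solution quickly, they hoped to build their own, and as usual underestimated how hard this kind of "simple problem" is. Doing such deep simulation in a library environment without pixel-level debugging, a crop of bugs is inevitable.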

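There is also the 5000+-line REPL.tsx (made up of 300 lines of imports, 200 lines for the TranscriptModeFooter, TranscriptSearchBar, and AnimatedTerminalTitle components, and a 4500-line REPL component), plus the 2000+-line PromptInput.tsx and ManagePlugins.tsx (each a single component, yet each running to two thousand lines with over a hundred pieces of state apiece). I'll save those for roasting later.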
