Codename: Reincarnated as a Magic Tower
52.35M · 2026-04-02
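The Claude source code leak has caused quite a stir. Pulling off a leak of this scale is actually not easy these days, because there are simply too many defensive checkpoints along the way. Setting aside advanced review mechanisms and looking only at the basics: which infrastructure-level safeguards was its release process missing?
Fine, so there is almost no process. How about the implementation? Open /src and the two files at the very bottom are already an eyesore: Tool.ts and tools.ts. Both are several hundred lines long, over a thousand lines combined. Naming files this way at such a shallow directory depth is effectively giving them no name at all. Any grab bag of helper functions can be called tool, tools, util, utils, or common, and once such a file reaches many hundreds or even a thousand lines, nobody can say what it is for; or rather, it has become omnipotent. I would suggest being blunt and openly calling such files god.ts; there is no need to feign modesty. As it turns out, I was wrong: neither Tool.ts nor tools.ts is the god file (that is someone else). Looking closer, Tool.ts has a truly remarkable skeleton: it hops back and forth within the declaration area, with import type and export type alternating in a loop, and you see many comments like these: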
// Import permission types from centralized location to break import cycles
// Re-export progress types for backwards compatibility
// Import tool permission types from centralized location to break import cycles
// Re-export for backwards compatibility
// Apply DeepImmutable to the imported type
// Re-export ToolProgressData from centralized location
The actual logic has still not started by this point, all the way down to the last definition line:
export type AnyObject = z.ZodType<{ [key: string]: unknown }>
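The classic AnyObject. Why bother? It torments the TS type system, torments lint, torments the AI, and ultimately torments yourself. Just write any openly; there is no need for this kind of self-inflicted friction. And sure enough, further down, a definition like this is interleaved into the logic section: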
type AnyToolDef = ToolDef<any, any, any>
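Then a global search for `: any`, `<any`, `, any`, and `as any`: good grief, the project already contains nearly a hundred uses of any. So it is AnyScript. Never mind then.
Back to this file. The import/export churn alone takes up more than three hundred lines, with two functions that have actual implementations, getEmptyToolPermissionContext() and filterToolProgressMessages(), wedged in between. They are very short, but they give the impression that a top-level import can go anywhere. That is understandable, though: it is a direct manifestation of the consequences of circular dependencies. Next comes a single type export, export type Tool<...> = {/* three hundred-plus lines in here */}, with no implementation at all, which makes you wonder whether this should have been a .d.ts. Further down, we finally reach something worth discussing: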
type BuiltTool<D> = Omit<D, DefaultableToolKeys> & {
  [K in DefaultableToolKeys]-?: K extends keyof D
    ? undefined extends D[K]
      ? ToolDefaults[K]
      : D[K]
    : ToolDefaults[K]
}
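Nice. The idea is good, and libraries like to write it this way. When merging defaults it carefully preserves literal types, although it ignores the possibility that undefined is passed as a genuine property value meaning "do not use the default" (admittedly an unlikely semantic). The problem is that BuiltTool is not exported and is used only by buildTool(): a classic anticlimax. With usage this narrow, it would be better to simply go for readability: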
type BuiltTool<D extends ToolDef> =
  Omit<Tool, DefaultableToolKeys> &
  Required<Pick<Tool, DefaultableToolKeys>>
// The runtime already expresses this honestly
export function buildTool<D extends AnyToolDef>(def: D): BuiltTool<D> {
  return {
    ...TOOL_DEFAULTS,
    userFacingName: () => def.name,
    ...def,
  } as BuiltTool<D>
}
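Read side by side, it is clear at a glance. Of course, this mismatch in type rigor (many critical types are poorly specified, while a few details are lavishly ornamented) is not the most striking thing.
Turning to the UI design and implementation, first look at two functions from logoV2Utils.ts that the welcome screen uses: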
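To make the literal-preserving behavior and the explicit-undefined caveat concrete, here is a minimal sketch. MiniToolDefaults, MiniToolDef, and buildMiniTool are made-up stand-ins, not the names or shapes in the leaked source; only the conditional mapped type copies the structure of the quoted BuiltTool.

```typescript
// Hypothetical miniature of the pattern; names and fields are illustrative only.
type MiniToolDefaults = {
  isEnabled: () => boolean
  isReadOnly: () => boolean
}
type DefaultableKeys = keyof MiniToolDefaults
type MiniToolDef = { name: string } & Partial<MiniToolDefaults>

// Same shape as the quoted BuiltTool: keep the caller's type when a key is
// provided and cannot be undefined, otherwise use the default's type.
type MiniBuiltTool<D> = Omit<D, DefaultableKeys> & {
  [K in DefaultableKeys]-?: K extends keyof D
    ? undefined extends D[K]
      ? MiniToolDefaults[K]
      : D[K]
    : MiniToolDefaults[K]
}

const MINI_DEFAULTS: MiniToolDefaults = {
  isEnabled: () => true,
  isReadOnly: () => false,
}

function buildMiniTool<D extends MiniToolDef>(def: D): MiniBuiltTool<D> {
  return { ...MINI_DEFAULTS, ...def } as MiniBuiltTool<D>
}

// With strictNullChecks on, isReadOnly keeps its narrow type () => true,
// while the omitted isEnabled is typed from the defaults.
const grep = buildMiniTool({ name: 'grep', isReadOnly: () => true as const })

// The caveat: an explicit undefined is typed as the default function,
// but the object spread lets it overwrite the default at runtime.
const odd = buildMiniTool({ name: 'odd', isEnabled: undefined })

console.log(grep.isEnabled(), grep.isReadOnly()) // true true
console.log(typeof odd.isEnabled) // "undefined"
```

The second call shows the type and the runtime disagreeing: the type promises a function, the value is undefined. Whether that matters depends on whether any caller ever passes undefined on purpose.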
/**
 * Calculates optimal left panel width based on content
 */
export function calculateOptimalLeftWidth(
  welcomeMessage: string,
  truncatedCwd: string,
  modelLine: string,
): number {
  const contentWidth = Math.max(
    stringWidth(welcomeMessage),
    stringWidth(truncatedCwd),
    stringWidth(modelLine),
    20, // Minimum for clawd art
  )
  return Math.min(contentWidth + 4, MAX_LEFT_WIDTH) // +4 for padding
}
/**
 * Formats the welcome message based on username
 */
export function formatWelcomeMessage(username: string | null): string {
  if (!username || username.length > MAX_USERNAME_LENGTH) {
    return 'Welcome back!'
  }
  return `Welcome back ${username}!`
}
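calculateOptimalLeftWidth is used in exactly one place, and its first argument is always the return value of formatWelcomeMessage. Both functions raise eyebrows: they derive layout dimensions from display text, a classic anti-pattern and pseudo-requirement! Content can drive layout, but not without any semantic priority. This code cannot answer which matters more, welcomeMessage or cwd, or whether modelLine may be truncated. The author may simply not have had the courage to answer. Moreover, this stringWidth is actually Bun.stringWidth, which is called through Zig FFI and is not automatically reactive, so it cannot follow external changes. That makes the two hundred-plus call sites of stringWidth in the Claude client look like two hundred-plus blockheads.
The system defines a huge number of constants, yet magic numbers like 20 and 4 still show up. Note that this is not inside a tsx file but in a standalone utils file, so any random helper gets to constrain the UI. Constants I could tolerate, but now literals too. Asking AI to change this is a lose-lose. If the AI dares to change it, that is reckless: the classic case of breaking things globally for the sake of one page. If the AI does not dare and instead wraps things from the outside with translate/zoom/negative-margin offsets and the like, it is piling more onto the mountain of crap. Then have the AI extract the utils literals into constants and, perfect, the mountain of crap becomes a mountain of concrete, now firmly tamped down.
Now look at the ears of this UI, ScrollKeybindingHandler.tsx. Fine-tuning wheel behavior is widely acknowledged to be hard, so let's see how it does it: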
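For contrast, here is a minimal sketch of what content-driven layout with explicit semantics could look like: every line declares what happens when it does not fit, and the two literals get names. PanelLine, Overflow, layoutLeftPanel, and the constant names are all hypothetical, and measure is a simplified stand-in for a real display-width function.

```typescript
// Hypothetical sketch: names, constants, and the overflow policy are assumptions.
const PANEL_PADDING = 4
const MIN_ART_WIDTH = 20

type Overflow = 'truncate' | 'drop'
interface PanelLine {
  text: string
  overflow: Overflow // what to do when the line does not fit
}

// Stand-in for a real display-width function such as stringWidth;
// string length is only correct for narrow, single-cell characters.
const measure = (text: string): number => text.length

function truncateEnd(text: string, width: number): string {
  if (measure(text) <= width) return text
  return width <= 1 ? text.slice(0, width) : text.slice(0, width - 1) + '…'
}

function layoutLeftPanel(
  lines: PanelLine[],
  maxPanelWidth: number,
): { width: number; lines: string[] } {
  const maxContent = Math.max(0, maxPanelWidth - PANEL_PADDING)
  const widest = Math.max(MIN_ART_WIDTH, ...lines.map(l => measure(l.text)))
  const contentWidth = Math.min(widest, maxContent)
  const rendered: string[] = []
  for (const line of lines) {
    if (measure(line.text) <= contentWidth) {
      rendered.push(line.text)
    } else if (line.overflow === 'truncate') {
      rendered.push(truncateEnd(line.text, contentWidth))
    }
    // 'drop': the line is omitted instead of being allowed to widen the panel
  }
  return { width: contentWidth + PANEL_PADDING, lines: rendered }
}

const panel = layoutLeftPanel(
  [
    { text: 'Welcome back!', overflow: 'drop' },
    { text: '/home/user/projects/very-long-name', overflow: 'truncate' },
    { text: 'model: example-model with a long label', overflow: 'drop' },
  ],
  24,
)
console.log(panel.width, panel.lines)
// 24 [ 'Welcome back!', '/home/user/projects…' ]
```

The point is not this particular policy but that the questions "which line wins" and "may this be truncated" are answered in the data instead of being implied by a Math.max.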
export function computeWheelStep(state: WheelAccelState, dir: 1 | -1, now: number): number {
  if (!state.xtermJs) {
    // Device-switch guard ①: idle disengage. Runs BEFORE pendingFlip resolve
    // so a pending bounce (28% of last-mouse-events) doesn't bypass it via
    // the real-reversal early return. state.time is either the last committed
    // event OR the deferred flip — both count as "last activity".
    if (state.wheelMode && now - state.time > WHEEL_MODE_IDLE_DISENGAGE_MS) {
      state.wheelMode = false;
      state.burstCount = 0;
      state.mult = state.base;
    }
    // Resolve any deferred flip BEFORE touching state.time/dir — we need the
    // pre-flip state.dir to distinguish bounce (flip-back) from real reversal
    // (flip persisted), and state.time (= bounce timestamp) for the gap check.
    if (state.pendingFlip) {
      state.pendingFlip = false;
      if (dir !== state.dir || now - state.time > WHEEL_BOUNCE_GAP_MAX_MS) {
        // Real reversal: new dir persisted, OR flip-back arrived too late.
        // Commit. The deferred event's 1 row is lost (acceptable latency).
        state.dir = dir;
        state.time = now;
        state.mult = state.base;
        return Math.floor(state.mult);
      }
      // Bounce confirmed: flipped back to original dir within the window.
      // state.dir/mult unchanged from pre-bounce. state.time was advanced to
      // the bounce below, so gap here = flip-back interval — reflects the
      // user's actual click cadence (bounce IS a physical click, just noisy).
      state.wheelMode = true;
    }
    const gap = now - state.time;
    if (dir !== state.dir && state.dir !== 0) {
      // Flip. Defer — next event decides bounce vs. real reversal. Advance
      // time (but NOT dir/mult): if this turns out to be a bounce, the
      // confirm event's gap will be the flip-back interval, which reflects
      // the user's actual click rate. The bounce IS a physical wheel click,
      // just misread by the encoder — it should count toward cadence.
      state.pendingFlip = true;
      state.time = now;
      return 0;
    }
    state.dir = dir;
    state.time = now;
    // ─── MOUSE (wheel mode, sticky until device-switch signal) ───
    if (state.wheelMode) {
      if (gap < WHEEL_BURST_MS) {
        // Same-batch burst check (ported from xterm.js): iTerm2 proportional
        // reporting sends 2+ SGR events for one detent when macOS gives
        // delta>1. Without this, the 2nd event at gap<1ms has m≈1 → STEP*m=15
        // → one gentle click gives 1+15=16 rows.
        //
        // Device-switch guard ②: trackpad flick produces 100+ events at <5ms
        // (measured); mouse produces ≤3. 5+ consecutive → trackpad flick.
        if (++state.burstCount >= 5) {
          state.wheelMode = false;
          state.burstCount = 0;
          state.mult = state.base;
        } else {
          return 1;
        }
      } else {
        state.burstCount = 0;
      }
    }
    // Re-check: may have disengaged above.
    if (state.wheelMode) {
      // xterm.js decay curve with STEP×3, higher cap. No idle threshold —
      // the curve handles it (gap=1000ms → m≈0.01 → mult≈1). No frac —
      // rounding loss is minor at high mult, and frac persisting across idle
      // was causing off-by-one on the first click back.
      const m = Math.pow(0.5, gap / WHEEL_DECAY_HALFLIFE_MS);
      const cap = Math.max(WHEEL_MODE_CAP, state.base * 2);
      const next = 1 + (state.mult - 1) * m + WHEEL_MODE_STEP * m;
      state.mult = Math.min(cap, next, state.mult + WHEEL_MODE_RAMP);
      return Math.floor(state.mult);
    }
    // ─── TRACKPAD / HI-RES (native, non-wheel-mode) ───
    // Tight 40ms burst window: sub-40ms events ramp, anything slower resets.
    // Trackpad flick delivers 200+ events at <20ms gaps → rails to cap 6.
    // Trackpad slow swipe at 40-400ms gaps → resets every event → 1 row each.
    if (gap > WHEEL_ACCEL_WINDOW_MS) {
      state.mult = state.base;
    } else {
      const cap = Math.max(WHEEL_ACCEL_MAX, state.base * 2);
      state.mult = Math.min(cap, state.mult + WHEEL_ACCEL_STEP);
    }
    return Math.floor(state.mult);
  }
  // ─── VSCODE (xterm.js, browser wheel events) ───
  // Browser wheel events — no encoder bounce, no SGR bursts. Decay curve
  // unchanged from the original tuning. Same formula shape as wheel mode
  // above (keep in sync) but STEP=5 not 15 — higher event rate here.
  const gap = now - state.time;
  const sameDir = dir === state.dir;
  state.time = now;
  state.dir = dir;
  // xterm.js path. Debug log shows two patterns: (a) 20-50ms gaps during
  // sustained scroll (~30 Hz), (b) <5ms same-batch bursts on flicks. For
  // (b) give 1 row/event — the burst count IS the acceleration, same as
  // native. For (a) the decay curve gives 3-5 rows. For sparse events
  // (100ms+, slow deliberate scroll) the curve gives 1-3.
  if (sameDir && gap < WHEEL_BURST_MS) return 1;
  if (!sameDir || gap > WHEEL_DECAY_IDLE_MS) {
    // Direction reversal or long idle: start at 2 (not 1) so the first
    // click after a pause moves a visible amount. Without this, idle-
    // then-resume in the same direction decays to mult≈1 (1 row).
    state.mult = 2;
    state.frac = 0;
  } else {
    const m = Math.pow(0.5, gap / WHEEL_DECAY_HALFLIFE_MS);
    const cap = gap >= WHEEL_DECAY_GAP_MS ? WHEEL_DECAY_CAP_SLOW : WHEEL_DECAY_CAP_FAST;
    state.mult = Math.min(cap, 1 + (state.mult - 1) * m + WHEEL_DECAY_STEP * m);
  }
  const total = state.mult + state.frac;
  const rows = Math.floor(total);
  state.frac = total - rows;
  return rows;
}
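At first glance there are more comments than code, which smells of AI, but a careful read of the comments suggests this was tuned through testing on real devices, which shows care. The problem is that the algorithm itself is at "professional tuning" level, while the engineering around it is at "illegal construction" level. Many people feel that once a magic number becomes a constant it is no longer a magic number, since it can now be explained. But at this degree of fine-tuning the constants have no room left for modification and have become de facto numeric incantations. Given comments like "28% of last-mouse-events" and "acceptable latency", these are clearly empirical values derived from a specific test environment. On a different computer or with a different mouse driver, none of these percentages may hold.
One function handling both the MOUSE and xterm scenarios is ambitious, but it also lets you infer that this project has no concept of unit testing. Behavior this complex deserves a clearly defined behavior model backed by type constraints, which would make it easy to understand. Unfortunately it is still stuck in imperative logic with no behavior model abstracted out, so naturally it cannot evolve. Constantly mutating state plus multiple return points: the classic hallmarks of a zombie function.
Looking at the details, the tuning built on xterm.js is very fragile. Once xterm fixes the underlying issue, this must be realigned quickly, or it will certainly become a bug that a huge number of users notice. That means every update requires eyeballing the xterm release changelog for something like "Improve wheel scrolling behavior", which is unrealistic, and having AI monitor such an extremely vague signal, with its false positives and false negatives, is mutual torture. There is a way out, of course: run simulated unit tests in a headless browser, record the behavior, and do automated video comparison. But bringing in that whole apparatus is like reviving a mammoth first just to prepare the environment for reviving a saber-toothed tiger.
The most maddening part is that this function lives in ScrollKeybindingHandler.tsx. A handler's job should be: "the user pressed a key -> trigger a command". Now the job has become: "the user scrolled the wheel -> compute physical acceleration -> simulate inertia -> distinguish devices -> output a step size -> trigger a command", leaving the file bloated (1000 lines). Further down there is an even scarier useDragToScroll() that I could not bear to read closely: it tries to maintain, inside a single useEffect, a complex scheduler with stop(), tick(), start(), and check() driving a setInterval. The courage to try deserves credit, but please stop trying. Leave specialist work to specialist libraries; why solve this at the application layer? Besides, think about it: how could such a common need possibly still require an application-level implementation? Not only have many developers failed to think this through, even AI has a hard time grasping it. Or perhaps everyone lost to time: unable to find an existing solution quickly, they hoped to implement it themselves, but usually underestimated the difficulty of such a "simple problem". Doing this kind of deep simulation in a library environment that lacks pixel-level debugging makes a swarm of bugs inevitable.
There is also the 5000-plus-line REPL.tsx (made up of 300 lines of imports, 200 lines for the TranscriptModeFooter + TranscriptSearchBar + AnimatedTerminalTitle components, and a 4500-line REPL component), as well as the 2000-plus-line PromptInput.tsx and ManagePlugins.tsx (each a single component running two thousand lines in one go, each with over a hundred pieces of state), which I will save for later rants.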
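As a rough illustration of what a testable behavior model might look like, the sketch below rewrites only the native trackpad branch as a pure transition with its tuning passed in, then replays a list of timestamps through it. AccelTuning, AccelState, trackpadStep, and replay are hypothetical names, and the tuning numbers are placeholders (the 40 ms window and cap of 6 come from the quoted comments, the step value is invented).

```typescript
// Hypothetical sketch of the native trackpad branch as a pure transition.
// Field names and tuning values are placeholders, not the real constants.
interface AccelTuning {
  windowMs: number // events closer together than this keep ramping
  step: number // multiplier gained per event inside the window
  max: number // upper bound for the multiplier
}

interface AccelState {
  readonly base: number
  readonly mult: number
  readonly time: number
}

// No mutation: returns the next state together with the rows to scroll.
function trackpadStep(
  state: AccelState,
  now: number,
  tuning: AccelTuning,
): { state: AccelState; rows: number } {
  const gap = now - state.time
  const cap = Math.max(tuning.max, state.base * 2)
  const mult =
    gap > tuning.windowMs ? state.base : Math.min(cap, state.mult + tuning.step)
  return { state: { base: state.base, mult, time: now }, rows: Math.floor(mult) }
}

// Replays recorded event timestamps, which is what a table-driven test needs.
function replay(
  timestamps: number[],
  initial: AccelState,
  tuning: AccelTuning,
): number[] {
  let state = initial
  const rows: number[] = []
  for (const now of timestamps) {
    const result = trackpadStep(state, now, tuning)
    state = result.state
    rows.push(result.rows)
  }
  return rows
}

const tuning: AccelTuning = { windowMs: 40, step: 0.5, max: 6 }
const initial: AccelState = { base: 1, mult: 1, time: -1000 }
console.log(replay([0, 10, 20, 500], initial, tuning)) // [ 1, 1, 2, 1 ]
```

With the tuning injected, a trace recorded on a different mouse or driver becomes one more test case rather than another constant baked into the function.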
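The separation of duties described above can be sketched in a few lines: the handler only maps an input event to a command, and whatever computes the step size is passed in from elsewhere. ScrollCommand, StepPolicy, and createWheelHandler are hypothetical names for illustration, not anything found in the source.

```typescript
// Hypothetical sketch: the handler only turns an input event into a command.
type Direction = 1 | -1
type ScrollCommand = { type: 'scrollBy'; rows: number }
type StepPolicy = (dir: Direction, now: number) => number

function createWheelHandler(
  policy: StepPolicy,
  dispatch: (cmd: ScrollCommand) => void,
) {
  return (dir: Direction, now: number): void => {
    const rows = policy(dir, now)
    // A policy may return 0, for example while it defers a direction flip.
    if (rows > 0) dispatch({ type: 'scrollBy', rows: dir * rows })
  }
}

// Usage with a trivial stand-in policy: nothing before t=100, then 3 rows.
const dispatched: ScrollCommand[] = []
const fixedPolicy: StepPolicy = (_dir, now) => (now < 100 ? 0 : 3)
const onWheel = createWheelHandler(fixedPolicy, cmd => {
  dispatched.push(cmd)
})

onWheel(1, 50)
onWheel(-1, 200)
console.log(dispatched) // [ { type: 'scrollBy', rows: -3 } ]
```

The acceleration logic, whether hand-rolled or taken from a library, then lives behind StepPolicy and can change without touching the handler file.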