# Bookworm Routing and Disambiguation Pipeline Technical Reference

> Version: v6.6.0 · Last updated: 2026-04-25
> Corresponding code: hooks/prompt-dispatcher.js, hooks/route-interceptor-bundle.js, scripts/route-engine.js, scripts/route-analyzer.js, scripts/intent-classifier.js, scripts/disambiguation-tree.js, scripts/adaptive-disambiguator.js, scripts/bwr-builder.js, scripts/route-state.js

## 1. Overall pipeline

```mermaid
flowchart TD
    U[User prompt] --> PD["prompt-dispatcher.js<br/>UserPromptSubmit entry point"]
    PD --> SG["security-startup-guard<br/>synchronous, fail-open"]
    SG --> SUB["spawn route-interceptor-bundle<br/>child process, 2s hard timeout"]
    SUB --> RB[route-interceptor-bundle.js]
    RB --> BN["showActivationBanner<br/>first-message banner"]
    RB --> ES{"Escape hatch<br/>/force /checks /reset"}
    ES -->|hit| OUT0[Direct output]
    ES -->|miss| IF["Implicit feedback detection<br/>3min window · first 30 chars"]
    IF --> SC{"Explicit invocation<br/>/skill-name"}
    SC -->|hit| OUT0
    SC -->|miss| IC["intent-classifier<br/>three-tier triage"]
    IC --> LVL{complexity?}
    LVL -->|simple| INH{"Try inheritance<br/>5min window"}
    LVL -->|"medium, short CJK<br/>under 6 chars"| INH
    LVL -->|image query| INH
    LVL -->|medium/complex| RE[runRouteEngine]
    INH -->|inherited| WRS
    INH -->|"inheritance failed & CJK 3-14"| RE
    INH -->|pure simple| OUT0
    RE --> BM25[1.BM25 baseline]
    BM25 --> CTX[2.Context tracker]
    CTX --> PRJ[3.Project detector]
    PRJ --> WF[4.Workflow 30d bigram]
    WF --> DC["5.Domain classifier L1<br/>conf ≥ 0.3 narrows candidates"]
    DC --> SEM[6.Semantic TF-IDF]
    SEM --> FUS["7.Fusion weights<br/>bm25 0.40 sem 0.30 ctx 0.15 proj 0.10 wf 0.05"]
    FUS --> DL2["8.Domain L2 re-ranking<br/>out-of-domain ×0.2 / ×0.5"]
    DL2 --> EMB["9.Embedding tie-breaker<br/>top-2 gap < 10%"]
    EMB --> DIS["applyDisambiguation<br/>voting over 88 hard rules"]
    DIS --> ADA["adaptive-disambiguator<br/>Bayesian 0.3 + hard rules 0.7"]
    ADA --> CS[Cold-start boost]
    CS --> RR[rerank top-k]
    RR --> NORM["normalize + confidence cap<br/>short queries ≤3 tokens capped at 0.8"]
    NORM --> SM[session-memory boost]
    SM --> AB[ab-test exploration]
    AB --> WRS["writeRouteState<br/>route-state-current.json"]
    WRS --> BWR[buildBWRDirective]
    BWR --> OUT[additionalContext injection]
    OUT --> HK2["route-compliance-gate<br/>PreToolUse:Skill"]
    HK2 --> HK3["subagent-route-injector<br/>SubagentStart"]
    HK3 --> HK4["route-auditor<br/>Stop: end-of-session loop closure"]
```

## 2. Core data contracts

### 2.1 route-state-current.json (DEBUG_DIR)

```jsonc
{
  "traceId": "8d4fb84f",
  "ts": "2026-04-25T01:30:00Z",
  "promptHash": "abc123def456",
  "promptRaw": "<first 200 chars, redacted>",
  "intent": {
    "intents": ["data"],
    "modifiers": [],
    "entities": [],
    "complexity": "medium"
  },
  "routing": {
    "primary": "vue-expert",
    "candidates": [{ "name": "vue-expert", "confidence": 1.0 }, ...],
    "confidence": 1.0,
    "chain": [],
    "experiment": null,
    "domain": null
  },
  "recommendation": { "action": "route", "skill": "vue-expert" },
  "mustInvoke": true,
  "version": "v6.2",
  "sessionId": "..."
}
```

Consumers: route-compliance-gate / subagent-route-injector / route-auditor.

### 2.2 BWR directive structure (written to additionalContext)

Decision table for the final line (`scripts/bwr-builder.js:22-57`):

| Condition | Final line emitted |
|---|---|
| `simple && !inherited` | `[BWR:skip] 简单查询,直接回复` ("simple query, reply directly") |
| `complexity == complex` | `[MUST_INVOKE_SKILL: {primary}]` (forced) |
| intent ∈ EXEMPT (translate/explain/greeting/meta/remember/continue/select/confirm) | `执行: 使用 /{primary}` ("Execute: use /{primary}"; exempt) |
| `medium && conf>=0.5 && primary ≠ developer-expert/none` | `[MUST_INVOKE_SKILL: {primary}]` (medium-level forced) |
| otherwise | `执行: 使用 /{primary}` |

## 3. Three-tier intent triage (`intent-classifier.js`)

14 intent rules + 3 modifiers. Complexity decision tree:

```
modifiers.complex                                              → complex
modifiers._force_medium (V-01)                                 → medium
modifiers.simple                                               → simple
intents.length == 2 & non-adjacent pair                        → complex
intents.length >= 3 (excluding general)                        → complex
all in {explain,general,continue,select,confirm} & no entities → simple
otherwise                                                      → medium
```

**V-01 fix**: when a `confirm/continue/select` prefix is followed by more than 8 characters or by a contrastive word, the prefix label is removed and the prompt is reclassified.

## 4. Two-layer disambiguation

### 4.1 Hard-rule layer `scripts/disambiguation-rules.json` (v1.5.0, 88 rules)

Rule structure:

```jsonc
{
  "id": "R84",
  "note": "...",
  "trigger": "<regex>",
  "boost": "target-skill",
  "penalty": ["skill1", "skill2"],
  "weight": 0.55,
  "mutual_exclusion": { "with": "R05", "on_keyword": "...", "resolution": "..." },
  "agent": "self-auditor" // optional; marks an agent-type route
}
```

`applyDisambiguation` runs in three phases, plus an intermediate Phase 1.5 (`route-analyzer.js:745-828`):

1. **Phase 1**: collect the boost/penalty votes of every matching rule, with `effectiveWeight = weight × (0.5 + specificity × 0.5)`
2. **Phase 1.5**: resolve mutual_exclusion conflicts
3. **Phase 2**: apply scores. Boost: `score = base × (1 + boost)`; penalty: `score = base × (1 - penalty × 0.3)` (only for skills that were not boosted)
4. **Phase 3**: enforce ranking. A boosted skill must rank ahead of the rivals it penalizes

### 4.2 Adaptive layer `scripts/adaptive-disambiguator.js` (Bayesian Dirichlet)

- Maintains an α vector for each (skillA, skillB) pair
- A hard-rule boost gets a strong prior of α=10; everything else gets α=1
- Convergence: `totalSamples ≥ 30 && posterior ≥ 0.80`
- Fusion: `bayesian × 0.3 + hardRule × 0.7`
- C3_DIRICHLET_HARDENING: softmax-lite normalization to prevent saturation at ±1

## 5. Key resilience designs

| Mechanism | Location | Purpose |
|---|---|---|
| 2000ms hard timeout | route-interceptor-bundle L159 | On timeout, exits silently without blocking the prompt |
| Disk circuit breaker | route-compliance-gate L41-92 | Skips gating at `<100MB` |
| Atomic writes | route-state.js / fusion-weights | `tmp + rename` prevents partial writes |
| fail-open vs fail-close | routing layer open / gating layer close | Security components reject on error; routing lets the prompt through on error |
| 5min inheritance window | route-interceptor-bundle L301 | Decay ×0.7 + confidence threshold 0.5 |
| Short-query cap | route-engine L280-292 | Queries of ≤3 tokens have confidence capped at 0.8, guarding against BM25 overfitting |
| TraceId propagation | crypto.randomUUID().slice(0,8) | Correlates logs across the whole pipeline |

## 6. Known defects

### D1: `applyDisambiguation` degrades to penalty-only when the boost target has score=0

**Symptom**: The boost target of R81/R84/R85/R86, `self-auditor`, is an agent. If its keywords do not cover the query terms (for example "路由/消歧/管线", i.e. routing/disambiguation/pipeline), its raw BM25 score is 0. The lookup `results.find(r => r.name === rule.boost && r.score > 0)` at `route-analyzer.js:765` then returns undefined, and the whole rule degrades to penalty-only.

**Impact**: In the 2026-04-25 end-to-end test, the query "booworm路由和消歧模块技术梳理" ("technical walkthrough of the booworm routing and disambiguation modules") was ultimately routed to `industry-research-cn` rather than `self-auditor`. The penalty lowered vue-expert by only 4%, which was not enough for self-auditor to take the top spot.

**Three fix options**:

- (a) **Add keywords** (root-cause fix, small effort): add meta-terms such as "路由/消歧/管线/钩子/注入器/遥测" (routing/disambiguation/pipeline/hook/injector/telemetry) to the frontmatter of `agents/self-auditor.md`, then rerun `generate-skill-index.js`.
- (b) **Change applyDisambiguation** (symptomatic fix, wide impact): when firedRules contains a strong rule (weight ≥ 0.5) and the boost target has score=0, force-inject a score of top × 1.05 so the target takes the top spot.
- (c) **Strengthen the penalty**: raise the penalty damping factor from `0.3` to `0.6-0.8`, so that a pure penalty can also be decisive.

### D2: The `candidates:[]` blind spot in the routing logs

All 140 entries in `route-blind-spots.jsonl` have `confidence=0 & candidates=[]`; they are simple/continue-type queries whose inheritance failed and which fell through to the fallback. This is not a defect in the routing algorithm, but it masks the genuine misrouting cases. **Genuine misroutes should instead be found by scanning `route-YYYY-MM-DD.jsonl` for `topResult != developer-expert & confidence >= 0.9 & semantic mismatch`**.

## 7. Common tuning/audit commands

```bash
# Show disambiguation rule conflicts
node scripts/disambiguation-tree.js

# Debug intent classification
node scripts/intent-classifier.js "my query"

# End-to-end routing analysis
node scripts/route-analyzer.js --json "my query"

# Fusion weight state
cat debug/fusion-weights.json

# Today's routing statistics
cat debug/route-stats-daily-$(date +%F).jsonl | jq -s 'group_by(.skill)
  | map({skill: .[0].skill, n: length, avgConf: (map(.confidence) | add / length)})
  | sort_by(-.n)'

# Learning state of the adaptive disambiguator
node scripts/adaptive-disambiguator.js --state
```

## 8. Changelog

| Date | Version | Change |
|---|---|---|
| 2026-04-25 | v1.5.0 | R84-R88 Bookworm meta-term routing fix; D1 defect discovered |
| 2026-04-24 | v1.4.0 | R74-R83 added preferred_mcp activation route bindings + R81-R83 basic Bookworm meta-terms |
| 2026-04-19 | - | P0v2 stop-dispatcher 3-batch parallelism + P1 pre-agent-gate |
| 2026-04-16 | - | W1 weight decay + C1 atomic reset + C3 Dirichlet softmax hardening |
| 2026-04-14 | - | read-stdin double JSON.parse fix + stats drift sync |
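
The Phase 1 and Phase 2 formulas in section 4.1, together with the lookup guard behind defect D1, can be sketched as follows. This is a minimal illustration and not the code in `route-analyzer.js`: the names `applyRule` and `PENALTY_FACTOR`, the sample scores, the `vue-expert` penalty list, and the assumption that a single rule's effective weight is used directly as the boost or penalty amount are all hypothetical.

```javascript
// Hypothetical sketch of Phases 1-2 from section 4.1; not the real route-analyzer.js.
const PENALTY_FACTOR = 0.3; // damping factor quoted in section 4.1 and in D1 option (c)

// Phase 1 formula: effectiveWeight = weight × (0.5 + specificity × 0.5)
function effectiveWeight(rule, specificity) {
  return rule.weight * (0.5 + specificity * 0.5);
}

// Applies one fired rule to a list of { name, score } results.
// Assumption: the effective weight is used directly as the boost/penalty amount.
function applyRule(results, rule, specificity) {
  const w = effectiveWeight(rule, specificity);
  // Same guard as described in D1: a boost target whose score is 0 is not found.
  const target = results.find((r) => r.name === rule.boost && r.score > 0);
  return results.map((r) => {
    if (target && r.name === rule.boost) {
      return { ...r, score: r.score * (1 + w) }; // boost
    }
    if (rule.penalty.includes(r.name)) {
      return { ...r, score: r.score * (1 - w * PENALTY_FACTOR) }; // penalty
    }
    return r; // untouched by this rule
  });
}

// Illustrative rule and scores (made-up values, weight taken from the R84 example).
const rule = { id: "R84", boost: "self-auditor", penalty: ["vue-expert"], weight: 0.55 };

// Keywords cover the query: the boost target has a positive base score.
const covered = applyRule(
  [{ name: "self-auditor", score: 0.4 }, { name: "vue-expert", score: 1.0 }],
  rule,
  1
);

// D1 case: the boost target scored 0, so only the penalty takes effect.
const uncovered = applyRule(
  [{ name: "self-auditor", score: 0 }, { name: "vue-expert", score: 1.0 }],
  rule,
  1
);

console.log(covered, uncovered);
```

With these sample values, the covered case raises `self-auditor` to about 0.62 and lowers `vue-expert` to about 0.835. In the uncovered case `self-auditor` stays at 0 and only the penalty applies, which is the penalty-only degradation that D1 describes.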