Token Optimization

Originally published 2026-06-16 15:08:00


This content is licensed under CC 4.0 BY-SA.


rtk

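What it does

rtk condenses the output of certain commands, using the tiers described below.

Compiled Rust filters (high fidelity, semantic-level condensing)
These filters are hard-coded in Rust under src/cmds/ and have the highest priority. They are grouped by language/ecosystem: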

Ecosystem	Commands covered
Git	`git diff/log/status`, `gh`, `glab`, `gt`
Cloud	`aws`, `docker`, `curl`, `psql`, `wget`
JS/TS	`npm`, `pnpm`, `next`, `playwright`, `prettier`, `prisma`, `tsc`, `vitest`, `eslint`
JVM	`gradle`, `maven`
Python	`mypy`, `pip`, `pytest`, `ruff`
Ruby	`rake`, `rspec`, `rubocop`
Rust	`cargo`
.NET	`dotnet build/test/format`
Go	`go`, `golangci-lint`
System	`ls`, `find`, `grep`, `wc`, `env`, `tree`, `cat/read`, `jq`, and others

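These filters use Rust's type system, serde deserialization, and state machines to perform cross-line correlation, JSON parsing, structured table extraction, and similar operations, none of which pure regular expressions can handle.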

TOML DSL filters (declarative, regex-level condensing)

These are defined in src/filters/. There are currently 50+ TOML files, covering:
- Build tools: make, gradle, just, task, turbo, nx
- Package management: brew, poetry, composer, bundle, uv-sync, mise
- Linting/formatting: shellcheck, yamllint, hadolint, markdownlint, biome, oxlint
- Infrastructure: terraform plan, helm, tofu, ansible, docker (df/du), iptables, ssh, rsync
- Other: ping, ps, df, du, gcloud, jira, sops, xcodebuild, swift-build, and more
Every TOML filter runs through a deterministic 8-stage pipeline: strip_ansi → replace → match_output → strip/keep_lines → truncate → tail_lines → max_lines → on_empty. It is all regex matching plus text manipulation, with no LLM involved.
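The eight stages above can be pictured as a chain of plain text transforms. The Python sketch below is illustrative only: the stage names come from the pipeline listed above, but the rule fields (`replace`, `strip_lines`, `keep_lines`, `max_lines`, and so on), their exact semantics, and the helper `run_pipeline` are assumptions made for this example, not rtk's actual TOML schema or implementation.

```python
import re

# Matches common ANSI color/cursor escape sequences such as "\x1b[32m".
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def run_pipeline(text, rule):
    """Apply a hypothetical 8-stage filter; `rule` is a plain dict."""
    # 1. strip_ansi: drop terminal escape sequences.
    text = ANSI_RE.sub("", text)
    # 2. replace: regex substitutions, applied in order.
    for pattern, repl in rule.get("replace", []):
        text = re.sub(pattern, repl, text)
    # 3. match_output: if the output matches, return a fixed summary at once.
    for pattern, summary in rule.get("match_output", []):
        if re.search(pattern, text):
            return summary
    lines = text.splitlines()
    # 4. strip/keep_lines: drop lines matching any strip pattern,
    #    then keep only lines matching a keep pattern (if any are given).
    strip = [re.compile(p) for p in rule.get("strip_lines", [])]
    keep = [re.compile(p) for p in rule.get("keep_lines", [])]
    if strip:
        lines = [l for l in lines if not any(p.search(l) for p in strip)]
    if keep:
        lines = [l for l in lines if any(p.search(l) for p in keep)]
    # 5. truncate: cap the width of each line.
    width = rule.get("truncate")
    if width:
        lines = [l if len(l) <= width else l[:width] + "..." for l in lines]
    # 6. tail_lines: keep only the last N lines.
    tail = rule.get("tail_lines")
    if tail:
        lines = lines[-tail:]
    # 7. max_lines: cap the total and note how many lines were omitted.
    limit = rule.get("max_lines")
    if limit and len(lines) > limit:
        omitted = len(lines) - limit
        lines = lines[:limit] + [f"... ({omitted} more lines)"]
    # 8. on_empty: fall back to a fixed message when nothing is left.
    if not lines:
        return rule.get("on_empty", "")
    return "\n".join(lines)
```

With a rule such as `{"strip_lines": [r"^make\[\d+\]: (Entering|Leaving) directory"], "max_lines": 2}`, a four-line make log loses its "Entering directory" line and is then capped at two lines plus an omission note. Because every stage is a pure function of the text and the rule, the same input always produces the same output, which is the sense in which the pipeline is deterministic.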

Fallback strategy

If a command matches neither a Rust filter nor a TOML rule, it goes through Passthrough (the output is passed along unchanged, with no condensing).
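The priority order described here (Rust filter first, then TOML rule, then Passthrough) amounts to a simple lookup chain. The sketch below assumes two hypothetical registries, `RUST_FILTERS` and `TOML_FILTERS`, keyed by command name; the filter bodies are placeholders and do not reflect how rtk actually matches or condenses commands.

```python
from typing import Callable, Dict, List

Filter = Callable[[str], str]

# Hypothetical registries standing in for the two filter tiers.
RUST_FILTERS: Dict[str, Filter] = {
    # Placeholder "semantic" filter: drop blank lines.
    "git": lambda out: "\n".join(l for l in out.splitlines() if l.strip()),
}
TOML_FILTERS: Dict[str, Filter] = {
    # Placeholder "regex-level" filter: keep the last five lines.
    "make": lambda out: "\n".join(out.splitlines()[-5:]),
}


def select_tier(argv: List[str]) -> str:
    """Return which tier would handle the command: rust, toml, or passthrough."""
    name = argv[0]
    if name in RUST_FILTERS:
        return "rust"
    if name in TOML_FILTERS:
        return "toml"
    return "passthrough"


def condense(argv: List[str], output: str) -> str:
    """Run the output through the highest-priority tier that matches."""
    tier = select_tier(argv)
    if tier == "rust":
        return RUST_FILTERS[argv[0]](output)
    if tier == "toml":
        return TOML_FILTERS[argv[0]](output)
    return output  # Passthrough: returned unchanged.
```

In this sketch `condense(["htop"], text)` returns `text` untouched, because `htop` appears in neither registry.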

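Note, however, that rtk has no relevance filtering: unlike semantic search, it cannot drop output that is irrelevant to the task. If the Agent's initial command is too broad and matches many targets, rtk cannot help.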

Usage

Install to ~/.local/bin: `curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh`
- If the network is unreachable, save a copy of the script and change DOWNLOAD_URL and CHECKSUMS_URL to https://ghfast.top/https://github.com/...
Add it to PATH: `export PATH="$HOME/.local/bin:$PATH"`
Verify the installation: `rtk --version && rtk gain`
Add a prompt for the Agent: RTK is a token-optimized CLI proxy for shell commands. Always prefix shell commands with rtk. For example, use rtk git status or rtk cargo test instead of git status or cargo test.

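Custom filters

As with AGENTS.md, you can define global rules in ~/.config/rtk/filters.toml and project-level rules in .rtk/filters.toml.
These rules let you define custom output compression for commands that RTK does not support out of the box, without writing any Rust code.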

Notes

Codex reads skill files with sed. This used to be rendered by a dedicated UI that hid the sed call, but once rtk is added, the sed command becomes visible.

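Drawbacks

rtk defines condensing rules separately for each command, so it does not extend easily.
At present only Claude Code is well supported; support for other agents is poor.
Nor can you blindly have every Coding Agent prefix its Bash commands with rtk, because there are special cases: chained commands such as git add . && git commit -m "msg" && git push; the Coding Agent's own permission rules (for example, some forbidden commands are no longer blocked once rtk is prefixed); and constructs that cannot be rewritten automatically (such as cat <<EOF). All of these need dedicated adaptation, so extensibility is clearly limited. There is also no way to write your own Hook to make use of rtk.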

headroom

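What it does

Before tool output is sent to the LLM, headroom intercepts it, caches the original content in memory, compresses it with one of several compression algorithms, and sends the compressed content to the LLM instead. Compared with rtk, headroom can compress a wider range of commands, and its Agent wrapper mode integrates rtk directly.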

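Caching:
- TTL expiry: the default TTL is 300s
- Capacity eviction: at most 1000 entries are stored by default; beyond that, the oldest entries by creation time are evicted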
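The two rules above (expiry after a TTL, and eviction of the oldest entry by creation time once capacity is exceeded) can be sketched in a few lines. The class `TTLCache` and its method names are hypothetical and written for this example only; the defaults of 300 seconds and 1000 entries are the ones quoted above, and headroom's real cache may differ in its details.

```python
import time
from collections import OrderedDict


class TTLCache:
    """In-memory cache with TTL expiry and oldest-created-first eviction."""

    def __init__(self, ttl=300.0, max_entries=1000, clock=time.monotonic):
        self.ttl = ttl
        self.max_entries = max_entries
        self.clock = clock  # injectable so the behavior can be tested
        self._items = OrderedDict()  # key -> (created_at, value), creation order

    def put(self, key, value):
        # Re-inserting a key counts as a fresh creation.
        self._items.pop(key, None)
        self._items[key] = (self.clock(), value)
        # Capacity eviction: drop the oldest entries by creation time.
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        created_at, value = entry
        # TTL expiry: an entry older than `ttl` seconds is treated as gone.
        if self.clock() - created_at >= self.ttl:
            del self._items[key]
            return None
        return value
```

Reads do not refresh an entry's position, so eviction follows creation time rather than last access; this is what distinguishes the policy described above from an LRU cache.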

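Compression algorithms:

Algorithm	Content type	Strategy
SmartCrusher	JSON arrays, nested objects	Statistical row deduplication, key compression
CodeCompressor	Source code (Python, JS, Go, Rust, Java, C++)	AST-based; keeps imports, signatures, and types
Kompress-base	Plain text, general articles, documentation	ML token classification (Moc trained on agent trajectories)
SearchCompressor	grep / ripgrep results	Result deduplication, path shortening, score-based pruning
LogCompressor	Build output, test logs, CI traces	Error/exception extraction, redundancy removal
ImageCompressor	Images, screenshots	ML-based routing + OCR extraction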
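Since each algorithm targets a different content type, some step has to decide which one handles a given tool output. The sketch below shows one naive way to make that choice for text content with simple heuristics. The function `pick_compressor` and every rule in it are assumptions for illustration; only the compressor names come from the table, and headroom's actual routing is not described here and may work quite differently (images, for instance, are not covered).

```python
import json
import re

# Top-level declarations typical of source files (unindented lines only).
CODE_RE = re.compile(
    r"^(def |class |import |from \S+ import |func |fn |package |#include )", re.M
)
# Keywords that usually signal build or test logs.
LOG_RE = re.compile(r"\b(ERROR|FAILED|Traceback|Exception|WARNING)\b")
# grep/ripgrep style "path:line:text" result lines.
GREP_RE = re.compile(r"^[^\s:]+:\d+:")


def pick_compressor(content: str) -> str:
    """Guess which compressor from the table suits a piece of text."""
    stripped = content.strip()
    # Valid JSON arrays or objects go to the JSON compressor.
    if stripped[:1] in ("[", "{"):
        try:
            json.loads(stripped)
            return "SmartCrusher"
        except ValueError:
            pass  # looked like JSON but is not; keep checking
    lines = stripped.splitlines()
    # Mostly "path:line:text" lines: treat as search results.
    grep_like = sum(bool(GREP_RE.match(l)) for l in lines)
    if lines and grep_like / len(lines) > 0.8:
        return "SearchCompressor"
    # Check code before logs, since source code often mentions "Exception".
    if CODE_RE.search(stripped):
        return "CodeCompressor"
    if LOG_RE.search(stripped):
        return "LogCompressor"
    return "Kompress-base"  # default: plain text
```

The order of the checks matters: a log line such as `[INFO] build started` begins with `[` but fails JSON parsing, so it falls through to the later rules instead of being misrouted.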

What is sent to the LLM:

The following is appended to the end of the system prompt:

## Compressed Context Available

  Some tool outputs have been compressed to reduce context size. If you need
  the full uncompressed data, you can retrieve it using the `headroom_retrieve` tool.

  **How to retrieve:**
  - Call `headroom_retrieve(hash="<hash>")` to get all original items
  - Call `headroom_retrieve(hash="<hash>", query="search terms")` to search within

  **Available hashes:** a1b2c3d4e5f6a1b2c3d4e5f6, ...

  Look for markers like `[N items compressed to M. Retrieve more: hash=abc123]`
  in tool results to find the hash for each compressed output.

A new tool, ccr_tool, is added to the tool list:

{
  "type": "function",
  "function": {
    "name": "headroom_retrieve",
    "description": "Retrieve original uncompressed content that was compressed to save tokens. ...",
    "parameters": {
      "type": "object",
      "properties": {
        "hash": {
          "type": "string",
          "description": "Hash key from the compression marker (e.g., 'abc123' from hash=abc123)"
        },
        "query": {
          "type": "string",
          "description": "Optional search query to filter results. If omitted, returns all original items."
        }
      },
      "required": ["hash"]
    }
  }
}
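
Taken together, the prompt text and the tool definition describe a compress, cache, and retrieve loop. The sketch below mimics that loop for a list of JSON items. The marker format and the `hash` and `query` parameters are taken from the text above; everything else (the functions `compress_items` and `retrieve`, the module-level `_STORE`, the "keep the first few items" compression, and the 24-character SHA-256 prefix used as the hash) is an assumption for illustration and not headroom's implementation.

```python
import hashlib
import json
from typing import Dict, List, Optional

# Stand-in for the in-memory cache of original content.
_STORE: Dict[str, List[dict]] = {}


def compress_items(items: List[dict], keep: int = 3) -> str:
    """Keep the first `keep` items and append a retrieval marker."""
    kept = items[:keep]
    if len(kept) == len(items):
        return json.dumps(kept)  # nothing dropped, so no marker is needed
    # Assumed hashing scheme: a truncated SHA-256 of the original content.
    raw = json.dumps(items, sort_keys=True)
    key = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:24]
    _STORE[key] = items
    marker = (
        f"[{len(items)} items compressed to {len(kept)}. "
        f"Retrieve more: hash={key}]"
    )
    return json.dumps(kept) + "\n" + marker


def retrieve(hash_key: str, query: Optional[str] = None) -> List[dict]:
    """Handler for a headroom_retrieve-style call (its `hash` argument)."""
    items = _STORE.get(hash_key, [])
    if query is None:
        return list(items)  # no query: return all original items
    # Naive search: every query term must appear in the item's JSON text.
    terms = query.lower().split()
    return [
        item for item in items
        if all(term in json.dumps(item).lower() for term in terms)
    ]
```

The model sees only the kept items plus the marker line; when it needs more, it calls the retrieval tool with the hash from the marker, optionally narrowing the result with a query.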