30%
Observed average token reduction in current production traffic.
Use this page to verify the benchmark claim and copy the request shape.
49%
Highest reduction observed in the benchmark set.
90%
Highest reduction observed on an individual call.
API
Use your API key with a standard OpenAI-compatible client. Most integrations only need a new base URL.
https://api-infer.agentsey.ai/v1
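As a minimal sketch, the request described in this section can be built with Python's standard library alone. The helper names `build_chat_request` and `send_chat_request` are hypothetical and used only for illustration. The model name is whatever your account has configured, and the response is assumed to follow the usual OpenAI-compatible chat completion layout.

```python
import json
import urllib.request

# Base URL taken from this page; the path below is the chat completions route.
BASE_URL = "https://api-infer.agentsey.ai/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON body.

    Assumes the endpoint replies with JSON, as OpenAI-compatible servers do.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To call the API, pass the result of `build_chat_request("zr_your_api_key", "<configured-model>", "Hello")` to `send_chat_request`. If you already use the OpenAI Python SDK, passing `base_url` and `api_key` to the `OpenAI(...)` constructor should achieve the same thing without hand-built requests.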
POST https://api-infer.agentsey.ai/v1/chat/completions
Authorization: Bearer zr_your_api_key
Content-Type: application/json
{
"model": "<configured-model>",
"messages": [{ "role": "user", "content": "Hello" }]
}
Benchmarks
The latest eight-dataset Rust report shows a 22.04% overall token reduction. The best benchmark result is 49%.
Dataset                               Prompts  Reduction
Personal Claude Code Data               8,724     25.84%
Dataclaw                                2,372     39.80%
Agentic Code Dataset 22                17,611      5.93%
SWE-bench / SWE-smith trajectories  2,849,278     28.45%
Nebius SWE-agent trajectories       2,115,624     11.93%
OpenHands SFT trajectories              9,630     22.99%
CodeChat V2.0                       1,116,303     20.06%
Claude Multiround Chat 30k             96,510      0.97%