Summary
Tool search does not defer in-process tools created with createSdkMcpServer / tool() when used via query(), even though the docs state tool search "applies to all registered tools, whether they come from remote MCP servers or custom SDK MCP servers."
On the same machine, key, model, and SDK version, the raw Messages API (@anthropic-ai/sdk) with the identical tools marked defer_loading: true + tool_search_tool_bm25_20251119 does defer and search correctly. So the gap is specifically in the Agent SDK / CLI handling of in-process SDK MCP servers, not the model or the API.
Environment
- @anthropic-ai/claude-agent-sdk@0.3.186 (also reproduced on 0.3.178)
- Native CLI binary from @anthropic-ai/claude-agent-sdk-darwin-arm64
- Node 22, macOS (arm64)
- Model: claude-sonnet-4-6 (a tool-search-supported model, not Haiku)
Expected
With ENABLE_TOOL_SEARCH on (tested true, auto, auto:1), the 61 in-process tools should be deferred: the model should see only the tool-search tool plus any non-deferred tools, search on demand, and discover count_quokkas. Tool definitions should be withheld from context (a large token reduction), as they are with the raw API.
Actual
The in-process tools are all loaded upfront: tool_search_requests = 0 and count_quokkas is called directly (with no prior search), which is only possible if it was never deferred. No deferral occurs regardless of ENABLE_TOOL_SEARCH value, even with permissionMode: 'bypassPermissions' so the ToolSearchTool is available.
model=claude-sonnet-4-6 61 in-process tools
[Agent SDK] tool_search_requests=0 inputTokens=222217 targetCalled=true ← NOT deferred
[Raw API] searched=true targetDiscovered=true inputTokens=1571 ← deferred + searched
(targetCalled=true with tool_search_requests=0 is the key signal: if the tool had been deferred, the model could not have called it without a preceding tool_search.)
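The inference above can be sketched as a small classifier. This is only an illustration: classifyRun, RunSignal, and Verdict are hypothetical names, not part of either SDK, and the sketch assumes tool_search_requests counts every search the model issued during the run.

```typescript
// Hypothetical helper (not an SDK API): interprets the two counters that the
// repro prints for the Agent SDK run.
type RunSignal = {
  toolSearchRequests: number; // from usage.server_tool_use.tool_search_requests
  targetCalled: boolean;      // a tool_use block naming count_quokkas was seen
};

type Verdict = 'not-deferred' | 'consistent-with-deferral' | 'inconclusive';

function classifyRun(s: RunSignal): Verdict {
  // A deferred tool is invisible until a search returns it, so a call with
  // zero searches implies the definition was loaded upfront.
  if (s.targetCalled && s.toolSearchRequests === 0) return 'not-deferred';
  // A call that follows at least one search is what deferral should look like.
  if (s.targetCalled && s.toolSearchRequests > 0) return 'consistent-with-deferral';
  // With no call at all, the counters say nothing about deferral either way.
  return 'inconclusive';
}

// The Agent SDK line from the output above.
console.log(classifyRun({ toolSearchRequests: 0, targetCalled: true })); // not-deferred
```

The 'inconclusive' branch matters for anyone rerunning the repro: if the model never calls count_quokkas, a zero search count proves nothing, so the run should be repeated.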
Minimal repro
npm i @anthropic-ai/claude-agent-sdk @anthropic-ai/sdk zod
export ANTHROPIC_API_KEY=sk-ant-...
npx tsx repro.ts
// repro.ts
import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';
const MODEL = 'claude-sonnet-4-6';
const N = 60;
const tools = [
  ...Array.from({ length: N }, (_, i) =>
    tool(
      `manage_widget_${String(i).padStart(2, '0')}`,
      `Configure and inspect internal widget subsystem ${i}: status, rotation schedule, ledger reconcile. Widget ${i} only.`,
      { action: z.enum(['read', 'update']) },
      async () => ({ content: [{ type: 'text', text: 'ok' }] }),
    ),
  ),
  tool(
    'count_quokkas',
    'Returns the current number of live quokkas in the sanctuary enclosure.',
    { enclosure: z.string().optional() },
    async () => ({ content: [{ type: 'text', text: 'There are 73 quokkas.' }] }),
    { searchHint: 'quokka animal count enclosure sanctuary' },
  ),
];
// (1) Agent SDK — in-process tools are NOT deferred
async function agentSdk() {
  const server = createSdkMcpServer({ name: 'test', version: '1.0.0', tools });
  let searches = 0, input = 0, targetCalled = false;
  for await (const m of query({
    prompt: 'How many quokkas are in the enclosure? Use a tool.',
    options: {
      systemPrompt: 'You are a test assistant.',
      mcpServers: { test: server },
      permissionMode: 'bypassPermissions',
      model: MODEL,
      maxTurns: 6,
      env: { ...process.env, ENABLE_TOOL_SEARCH: 'auto:1' },
    },
  })) {
    if (m.type === 'assistant') {
      for (const b of m.message.content) {
        if (b.type === 'tool_use' && b.name?.includes('count_quokkas')) targetCalled = true;
      }
    }
    if (m.type === 'result') {
      const u = (m as any).usage ?? {};
      input = (u.input_tokens ?? 0) + (u.cache_read_input_tokens ?? 0) + (u.cache_creation_input_tokens ?? 0);
      searches = u.server_tool_use?.tool_search_requests ?? 0;
    }
  }
  console.log(`[Agent SDK] tool_search_requests=${searches} inputTokens=${input} targetCalled=${targetCalled}`);
}
// (2) Raw Messages API — identical tools, deferral + search works
async function rawApi() {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
  const apiTools = [
    { type: 'tool_search_tool_bm25_20251119', name: 'tool_search_tool_bm25' },
    ...tools.map((t: any) => ({
      name: t.name,
      description: t.description,
      input_schema: { type: 'object', properties: { action: { type: 'string' } } },
      defer_loading: true,
    })),
  ];
  const res = await client.messages.create({
    model: MODEL,
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'How many quokkas are in the enclosure? Use a tool.' }],
    tools: apiTools as any,
  });
  let searched = false, targetRef = false;
  for (const b of res.content as any[]) {
    if (b.type === 'server_tool_use' && String(b.name).includes('tool_search')) searched = true;
    if (
      b.type === 'tool_search_tool_result' &&
      b.content?.tool_references?.some((r: any) => r.tool_name === 'count_quokkas')
    ) {
      targetRef = true;
    }
  }
  console.log(`[Raw API] searched=${searched} targetDiscovered=${targetRef} inputTokens=${res.usage.input_tokens}`);
}
(async () => {
  console.log(`model=${MODEL} ${N + 1} in-process tools\n`);
  await agentSdk();
  await rawApi();
  process.exit(0);
})();
What I ruled out
- Model: claude-sonnet-4-6 (tool search is documented as unsupported on Haiku; not using Haiku here).
- ENABLE_TOOL_SEARCH value: tested true, auto, and auto:1. The bare value 1 is invalid and was excluded.
- ToolSearchTool availability: permissionMode: 'bypassPermissions' (so it isn't dropped by an explicit tools/disallowedTools allowlist).
- Account / network / region: the raw API call from the same process with the same key + model defers correctly.
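For anyone who wants to re-check the ENABLE_TOOL_SEARCH item, here is a minimal sketch of building one environment per tested value so that runs differ only in that variable. envMatrix and TESTED_VALUES are hypothetical names; in the repro, each resulting object would be passed as options.env to query().

```typescript
// The three settings this report says were tested.
const TESTED_VALUES = ['true', 'auto', 'auto:1'] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns one copy of the base environment per tested
// value, overriding only ENABLE_TOOL_SEARCH in each copy.
function envMatrix(baseEnv: Env): Env[] {
  return TESTED_VALUES.map((value) => ({ ...baseEnv, ENABLE_TOOL_SEARCH: value }));
}

// Example with a small stand-in environment instead of process.env.
const envs = envMatrix({ PATH: '/usr/bin', ENABLE_TOOL_SEARCH: '1' });
console.log(envs.map((e) => e.ENABLE_TOOL_SEARCH)); // [ 'true', 'auto', 'auto:1' ]
```

Spreading the base environment first means an inherited ENABLE_TOOL_SEARCH value, such as the invalid 1, is always overwritten rather than leaking into a run.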
Impact
Large in-process tool catalogs (the common case for SDK apps that register many tool()s) can't get the documented ~85% context reduction / tool-selection-accuracy benefit of tool search, because their definitions are always loaded upfront.
Possibly related to in-process SDK-MCP handling discussed in anthropics/claude-code#7279.
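For a rough sense of scale next to the documented ~85% figure, the two inputTokens values from the Actual section can be compared directly. Treat this as arithmetic only, under a loud assumption: the Agent SDK number is summed over the whole multi-turn run (including cache reads and writes) while the raw API number is a single request, so the two are not strictly like-for-like. reductionPercent is a hypothetical helper.

```typescript
// Figures copied from the output in the Actual section of this report.
const agentSdkInputTokens = 222217;
const rawApiInputTokens = 1571;

// Percentage of input tokens saved relative to a baseline, rounded to one
// decimal place.
function reductionPercent(baseline: number, reduced: number): number {
  if (baseline <= 0) throw new RangeError('baseline must be positive');
  return Math.round((1 - reduced / baseline) * 1000) / 10;
}

console.log(reductionPercent(agentSdkInputTokens, rawApiInputTokens)); // 99.3
```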