Tool search does not defer in-process createSdkMcpServer tools via query() (raw Messages API does) #356

@IDLEcreative

Summary

Tool search does not defer in-process tools created with createSdkMcpServer / tool() when used via query(), even though the docs state tool search "applies to all registered tools, whether they come from remote MCP servers or custom SDK MCP servers."

On the same machine, key, model, and SDK version, the raw Messages API (@anthropic-ai/sdk) with the identical tools marked defer_loading: true + tool_search_tool_bm25_20251119 does defer and search correctly. So the gap is specifically in the Agent SDK / CLI handling of in-process SDK MCP servers, not the model or the API.

Environment

  • @anthropic-ai/claude-agent-sdk@0.3.186 (also reproduced on 0.3.178)
  • Native CLI binary from @anthropic-ai/claude-agent-sdk-darwin-arm64
  • Node 22, macOS (arm64)
  • Model: claude-sonnet-4-6 (a tool-search-supported model — not Haiku)

Expected

With ENABLE_TOOL_SEARCH on (tested true, auto, auto:1), the ~60 in-process tools should be deferred: the model should see only the tool-search tool plus any non-deferred tools, search on demand, and discover count_quokkas. Tool definitions should be withheld from context (a large token reduction), as they are with the raw API.

Actual

The in-process tools are all loaded upfront: tool_search_requests = 0 and count_quokkas is called directly (with no prior search), which is only possible if it was never deferred. No deferral occurs regardless of ENABLE_TOOL_SEARCH value, even with permissionMode: 'bypassPermissions' so the ToolSearchTool is available.

model=claude-sonnet-4-6  61 in-process tools

[Agent SDK]  tool_search_requests=0  inputTokens=222217  targetCalled=true   ← NOT deferred
[Raw API]    searched=true  targetDiscovered=true  inputTokens=1571          ← deferred + searched

(targetCalled=true with tool_search_requests=0 is the key signal: if the tool had been deferred, the model could not have called it without a preceding tool_search.)
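The reasoning above can be written down as a small check. This is a minimal sketch, not part of the Agent SDK: `Observation`, `Verdict`, and `classifyDeferral` are hypothetical names, and the only inputs are the two values the repro already prints.

```typescript
// Hypothetical helper (not an SDK API): interprets the two signals the repro prints.
type Observation = { toolSearchRequests: number; targetCalled: boolean };
type Verdict = 'not-deferred' | 'consistent-with-deferral' | 'inconclusive';

function classifyDeferral(o: Observation): Verdict {
  // A deferred tool cannot be called before a search has surfaced it,
  // so a direct call with zero searches means it was loaded upfront.
  if (o.targetCalled && o.toolSearchRequests === 0) return 'not-deferred';
  // A search followed by the call is what deferral should look like
  // (it does not prove deferral on its own).
  if (o.targetCalled && o.toolSearchRequests > 0) return 'consistent-with-deferral';
  // The target was never called: nothing can be concluded from this run.
  return 'inconclusive';
}

// The Agent SDK line from the output above.
console.log(classifyDeferral({ toolSearchRequests: 0, targetCalled: true })); // not-deferred
```

A run where the target is never called is reported as inconclusive rather than as a pass, since the model may simply have answered without using a tool.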

Minimal repro

npm i @anthropic-ai/claude-agent-sdk @anthropic-ai/sdk zod
export ANTHROPIC_API_KEY=sk-ant-...
npx tsx repro.ts

// repro.ts
import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const MODEL = 'claude-sonnet-4-6';
const N = 60;

const tools = [
  ...Array.from({ length: N }, (_, i) =>
    tool(
      `manage_widget_${String(i).padStart(2, '0')}`,
      `Configure and inspect internal widget subsystem ${i}: status, rotation schedule, ledger reconcile. Widget ${i} only.`,
      { action: z.enum(['read', 'update']) },
      async () => ({ content: [{ type: 'text', text: 'ok' }] }),
    ),
  ),
  tool(
    'count_quokkas',
    'Returns the current number of live quokkas in the sanctuary enclosure.',
    { enclosure: z.string().optional() },
    async () => ({ content: [{ type: 'text', text: 'There are 73 quokkas.' }] }),
    { searchHint: 'quokka animal count enclosure sanctuary' },
  ),
];

// (1) Agent SDK — in-process tools are NOT deferred
async function agentSdk() {
  const server = createSdkMcpServer({ name: 'test', version: '1.0.0', tools });
  let searches = 0, input = 0, targetCalled = false;
  for await (const m of query({
    prompt: 'How many quokkas are in the enclosure? Use a tool.',
    options: {
      systemPrompt: 'You are a test assistant.',
      mcpServers: { test: server },
      permissionMode: 'bypassPermissions',
      model: MODEL,
      maxTurns: 6,
      env: { ...process.env, ENABLE_TOOL_SEARCH: 'auto:1' },
    },
  })) {
    if (m.type === 'assistant') for (const b of m.message.content)
      if (b.type === 'tool_use' && b.name?.includes('count_quokkas')) targetCalled = true;
    if (m.type === 'result') {
      const u = (m as any).usage ?? {};
      input = (u.input_tokens ?? 0) + (u.cache_read_input_tokens ?? 0) + (u.cache_creation_input_tokens ?? 0);
      searches = u.server_tool_use?.tool_search_requests ?? 0;
    }
  }
  console.log(`[Agent SDK]  tool_search_requests=${searches}  inputTokens=${input}  targetCalled=${targetCalled}`);
}

// (2) Raw Messages API — identical tools, deferral + search works
async function rawApi() {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
  const apiTools = [
    { type: 'tool_search_tool_bm25_20251119', name: 'tool_search_tool_bm25' },
    ...tools.map((t: any) => ({
      name: t.name, description: t.description,
      input_schema: { type: 'object', properties: { action: { type: 'string' } } },
      defer_loading: true,
    })),
  ];
  const res = await client.messages.create({
    model: MODEL, max_tokens: 1024,
    messages: [{ role: 'user', content: 'How many quokkas are in the enclosure? Use a tool.' }],
    tools: apiTools as any,
  });
  let searched = false, targetRef = false;
  for (const b of res.content as any[]) {
    if (b.type === 'server_tool_use' && String(b.name).includes('tool_search')) searched = true;
    if (b.type === 'tool_search_tool_result' && b.content?.tool_references?.some((r: any) => r.tool_name === 'count_quokkas')) targetRef = true;
  }
  console.log(`[Raw API]    searched=${searched}  targetDiscovered=${targetRef}  inputTokens=${res.usage.input_tokens}`);
}

(async () => { console.log(`model=${MODEL}  ${N + 1} in-process tools\n`); await agentSdk(); await rawApi(); process.exit(0); })();

What I ruled out

  • Model: claude-sonnet-4-6 (tool search is documented as unsupported on Haiku; not using Haiku here).
  • ENABLE_TOOL_SEARCH value: tested true, auto, and auto:1; the bare value 1 (invalid) was excluded.
  • ToolSearchTool availability: permissionMode: 'bypassPermissions' (so it isn't dropped by an explicit tools/disallowedTools allowlist).
  • Account / network / region: the raw API call from the same process with the same key + model defers correctly.
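For anyone re-checking the ENABLE_TOOL_SEARCH item, the sweep can be sketched as below. This is an illustration under assumptions: `Runner` is a hypothetical callback that would wrap the `agentSdk()` function from repro.ts with the env value made a parameter, and `sweepToolSearch` / `neverSearched` are not SDK functions.

```typescript
// Hypothetical sweep harness; the runner is injected so the loop itself
// has no dependency on the SDK or on network access.
type RunResult = { toolSearchRequests: number; targetCalled: boolean };
type Runner = (enableToolSearch: string) => Promise<RunResult>;

async function sweepToolSearch(
  values: string[],
  run: Runner,
): Promise<Map<string, RunResult>> {
  const results = new Map<string, RunResult>();
  for (const value of values) {
    // Run sequentially so each setting gets its own fresh query.
    results.set(value, await run(value));
  }
  return results;
}

// True when no setting produced a single tool search request.
function neverSearched(results: Map<string, RunResult>): boolean {
  return Array.from(results.values()).every((r) => r.toolSearchRequests === 0);
}
```

With a runner that sets `env: { ...process.env, ENABLE_TOOL_SEARCH: value }` in the `query()` options, calling `sweepToolSearch(['true', 'auto', 'auto:1'], runner)` and then `neverSearched(...)` would reproduce the observation that no tested value triggers deferral.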

Impact

Large in-process tool catalogs (the common case for SDK apps that register many tool()s) can't get the documented ~85% context reduction / tool-selection-accuracy benefit of tool search, because their definitions are always loaded upfront.
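As rough arithmetic on the two token figures reported under Actual, the gap in this single run can be computed as follows. The function name `contextReduction` is illustrative only. The two numbers are not measured the same way (the Agent SDK figure sums input and cache tokens from the result message, the raw API figure is `input_tokens` of one response), so treat the ratio as indicative, not as a benchmark.

```typescript
// Illustrative only: fraction of input tokens saved relative to a baseline.
function contextReduction(baselineTokens: number, deferredTokens: number): number {
  if (baselineTokens <= 0) throw new RangeError('baselineTokens must be positive');
  return 1 - deferredTokens / baselineTokens;
}

// Figures from the run above: Agent SDK (not deferred) vs. raw API (deferred).
const observed = contextReduction(222217, 1571);
console.log(`${(observed * 100).toFixed(1)}%`); // 99.3%
```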

Possibly related to in-process SDK-MCP handling discussed in anthropics/claude-code#7279.

Labels: bug (Something isn't working)
