Tool search does not defer in-process createSdkMcpServer tools via query() (raw Messages API does)

## Summary

Tool search does **not** defer **in-process** tools created with `createSdkMcpServer` / `tool()` when used via `query()`, even though [the docs](https://code.claude.com/docs/en/agent-sdk/tool-search) state tool search *"applies to all registered tools, whether they come from remote MCP servers or custom SDK MCP servers."*

On the **same machine, key, model, and SDK version**, the **raw Messages API** (`@anthropic-ai/sdk`) with the identical tools marked `defer_loading: true` + `tool_search_tool_bm25_20251119` **does** defer and search correctly. So the gap is specifically in the Agent SDK / CLI handling of in-process SDK MCP servers, not the model or the API.

## Environment

- `@anthropic-ai/claude-agent-sdk@0.3.186` (also reproduced on `0.3.178`)
- Native CLI binary from `@anthropic-ai/claude-agent-sdk-darwin-arm64`
- Node 22, macOS (arm64)
- Model: `claude-sonnet-4-6` (a tool-search-supported model — **not** Haiku)

## Expected

With `ENABLE_TOOL_SEARCH` on (tested `true`, `auto`, `auto:1`), the ~60 in-process tools should be deferred — the model should see only the tool-search tool + non-deferred tools, search on demand, and discover `count_quokkas`. Tool definitions should be withheld from context (large token reduction), as they are at the raw API.

## Actual

The in-process tools are **all loaded upfront**: `tool_search_requests = 0` and `count_quokkas` is called **directly** (with no prior search), which is only possible if it was never deferred. No deferral occurs regardless of `ENABLE_TOOL_SEARCH` value, even with `permissionMode: 'bypassPermissions'` so the `ToolSearchTool` is available.

```
model=claude-sonnet-4-6  61 in-process tools

[Agent SDK]  tool_search_requests=0  inputTokens=222217  targetCalled=true   ← NOT deferred
[Raw API]    searched=true  targetDiscovered=true  inputTokens=1571          ← deferred + searched
```

(`targetCalled=true` with `tool_search_requests=0` is the key signal: if the tool had been deferred, the model could not have called it without a preceding `tool_search`.)

## Minimal repro

```bash
npm i @anthropic-ai/claude-agent-sdk @anthropic-ai/sdk zod
export ANTHROPIC_API_KEY=sk-ant-...
npx tsx repro.ts
```

```ts
// repro.ts
import { query, tool, createSdkMcpServer } from '@anthropic-ai/claude-agent-sdk';
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const MODEL = 'claude-sonnet-4-6';
const N = 60;

const tools = [
  ...Array.from({ length: N }, (_, i) =>
    tool(
      `manage_widget_${String(i).padStart(2, '0')}`,
      `Configure and inspect internal widget subsystem ${i}: status, rotation schedule, ledger reconcile. Widget ${i} only.`,
      { action: z.enum(['read', 'update']) },
      async () => ({ content: [{ type: 'text', text: 'ok' }] }),
    ),
  ),
  tool(
    'count_quokkas',
    'Returns the current number of live quokkas in the sanctuary enclosure.',
    { enclosure: z.string().optional() },
    async () => ({ content: [{ type: 'text', text: 'There are 73 quokkas.' }] }),
    { searchHint: 'quokka animal count enclosure sanctuary' },
  ),
];

// (1) Agent SDK — in-process tools are NOT deferred
async function agentSdk() {
  const server = createSdkMcpServer({ name: 'test', version: '1.0.0', tools });
  let searches = 0, input = 0, targetCalled = false;
  for await (const m of query({
    prompt: 'How many quokkas are in the enclosure? Use a tool.',
    options: {
      systemPrompt: 'You are a test assistant.',
      mcpServers: { test: server },
      permissionMode: 'bypassPermissions',
      model: MODEL,
      maxTurns: 6,
      env: { ...process.env, ENABLE_TOOL_SEARCH: 'auto:1' },
    },
  })) {
    if (m.type === 'assistant') for (const b of m.message.content)
      if (b.type === 'tool_use' && b.name?.includes('count_quokkas')) targetCalled = true;
    if (m.type === 'result') {
      const u = (m as any).usage ?? {};
      input = (u.input_tokens ?? 0) + (u.cache_read_input_tokens ?? 0) + (u.cache_creation_input_tokens ?? 0);
      searches = u.server_tool_use?.tool_search_requests ?? 0;
    }
  }
  console.log(`[Agent SDK]  tool_search_requests=${searches}  inputTokens=${input}  targetCalled=${targetCalled}`);
}

// (2) Raw Messages API — identical tools, deferral + search works
async function rawApi() {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
  const apiTools = [
    { type: 'tool_search_tool_bm25_20251119', name: 'tool_search_tool_bm25' },
    ...tools.map((t: any) => ({
      name: t.name, description: t.description,
      input_schema: { type: 'object', properties: { action: { type: 'string' } } },
      defer_loading: true,
    })),
  ];
  const res = await client.messages.create({
    model: MODEL, max_tokens: 1024,
    messages: [{ role: 'user', content: 'How many quokkas are in the enclosure? Use a tool.' }],
    tools: apiTools as any,
  });
  let searched = false, targetRef = false;
  for (const b of res.content as any[]) {
    if (b.type === 'server_tool_use' && String(b.name).includes('tool_search')) searched = true;
    if (b.type === 'tool_search_tool_result' && b.content?.tool_references?.some((r: any) => r.tool_name === 'count_quokkas')) targetRef = true;
  }
  console.log(`[Raw API]    searched=${searched}  targetDiscovered=${targetRef}  inputTokens=${res.usage.input_tokens}`);
}

(async () => { console.log(`model=${MODEL}  ${N + 1} in-process tools\n`); await agentSdk(); await rawApi(); process.exit(0); })();
```

## What I ruled out

- **Model:** `claude-sonnet-4-6` (tool search is documented as unsupported on Haiku; not using Haiku here).
- **`ENABLE_TOOL_SEARCH` value:** tested `true`, `auto`, and `auto:1` — `1` (invalid) excluded.
- **`ToolSearchTool` availability:** `permissionMode: 'bypassPermissions'` (so it isn't dropped by an explicit `tools`/`disallowedTools` allowlist).
- **Account / network / region:** the raw API call from the same process with the same key + model defers correctly.

## Impact

Large in-process tool catalogs (the common case for SDK apps that register many `tool()`s) can't get the documented ~85% context reduction / tool-selection-accuracy benefit of tool search, because their definitions are always loaded upfront.

Possibly related to in-process SDK-MCP handling discussed in anthropics/claude-code#7279.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Tool search does not defer in-process createSdkMcpServer tools via query() (raw Messages API does) #356

Summary

Environment

Expected

Actual

Minimal repro

What I ruled out

Impact

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Tool search does not defer in-process createSdkMcpServer tools via query() (raw Messages API does) #356

Description

Summary

Environment

Expected

Actual

Minimal repro

What I ruled out

Impact

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions