Skip to content

Python: Fix: coalesce reasoning deltas into single block when content.id is None#6804

Open
PratikWayase wants to merge 1 commit into
microsoft:mainfrom
PratikWayase:fix/ag-ui-reasoning-stream-id
Open

Python: Fix: coalesce reasoning deltas into single block when content.id is None#6804
PratikWayase wants to merge 1 commit into
microsoft:mainfrom
PratikWayase:fix/ag-ui-reasoning-stream-id

Conversation

@PratikWayase

Copy link
Copy Markdown

Motivation & Context

  1. Why is this change required? When a provider (like Ollama) streams model reasoning ("thinking"), it often does not provide a stable content.id per chunk, resulting in content.id=None for every delta.
  2. What problem does it solve? The AG-UI reasoning emitter (_emit_text_reasoning) was calling generate_event_id() on every single delta when content.id was None. This caused the stream to emit a REASONING_START -> CONTENT -> REASONING_END lifecycle per token, instead of one lifecycle per turn.
  3. What scenario does it contribute to? Any AG-UI frontend (e.g., CopilotKit) connected to a "thinking" model (like DeepSeek-R1 via Ollama) was rendering dozens of fragmented "Thought for a few seconds" accordions, with sentences chopped mid-word across them.

Description & Review Guide

  • What are the major changes?
    Changed the message_id assignment logic in _emit_text_reasoning() (_run_common.py). It now reuses the existing flow.reasoning_message_id when content.id is None, falling back to generate_event_id() only on the very first delta of a turn. This brings it into parity with how _emit_text() already handles text deltas via flow.message_id. Added 3 regression tests.
  • What is the impact of these changes?
    Frontends will now correctly receive a single REASONING_START -> (many) REASONING_MESSAGE_CONTENT -> REASONING_END block per turn. This is a non-breaking change; providers that do supply a content.id continue to use it exactly as before.
  • What do you want reviewers to focus on?
    Please verify the 5-line logic swap in _emit_text_reasoning and the 3 new tests in TestReasoningCoalescing to ensure the coalescing behavior correctly mirrors _emit_text.

Remaining Work (Out of Scope for this PR)

The original issue report contained a second, distinct bug regarding parallel tool calls. When Ollama issues the same tool twice in one turn, agent-framework-ollama's _parse_tool_calls_from_ollama reuses the function name as the call_id, causing a collision. That bug is located in agent-framework-ollama and will be addressed in a separate PR.

Related Issue

Partially addresses #6788

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).

Copilot AI review requested due to automatic review settings June 29, 2026 17:56
@moonbox3 moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 29, 2026
@github-actions github-actions Bot changed the title Fix: coalesce reasoning deltas into single block when content.id is None Python: Fix: coalesce reasoning deltas into single block when content.id is None Jun 29, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes AG-UI reasoning streaming so that providers which emit reasoning deltas with content.id=None (e.g., Ollama “thinking”) produce a single REASONING_START → (many) REASONING_MESSAGE_CONTENT → REASONING_END lifecycle per turn, instead of fragmenting into a new lifecycle per delta.

Changes:

  • Updated _emit_text_reasoning() to reuse flow.reasoning_message_id when content.id is absent, only generating a new ID on the first delta of a reasoning block.
  • Added regression tests validating coalescing behavior, explicit content.id behavior, and “new turn/new id” behavior after _close_reasoning_block().

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
python/packages/ag-ui/agent_framework_ag_ui/_run_common.py Reuses flow.reasoning_message_id for content.id=None reasoning deltas to prevent per-delta reasoning block fragmentation.
python/packages/ag-ui/tests/ag_ui/test_run_common.py Adds regression coverage ensuring reasoning deltas coalesce into a single block/message_id per turn and reset correctly across turns.

@antsok

antsok commented Jun 30, 2026

Copy link
Copy Markdown

Fixes #6787

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Usage: [Issues, PRs], Target: Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants