[fix][broker] Don't let a stuck or aborted topic policies cache init make a namespace's topics unloadable by lhotari · Pull Request #26025 · apache/pulsar

lhotari · 2026-06-13T16:32:46Z

Motivation

When a topic is loaded, the broker first waits for the namespace's topic policies cache to be
initialized. Initialization (SystemTopicBasedTopicPoliciesService#initPolicesCache) reads the
namespace's __change_events system topic to the end and completes a shared, per-namespace
future (policyCacheInitMap) that every topic load in the namespace awaits.

That shared future could be left pending forever, leaving every topic in the namespace stuck and
unloadable until the broker was restarted (issue #25294), in two distinct ways:

No timeout on the read loop. If the __change_events reader reconnects but then stops making
progress — e.g. after __change_events is unloaded/moved and the reconnected reader gets stuck
(see the compacted-read stuck-reader bug fixed in [fix][broker] Fix compacted read could be stuck forever or message loss due to cursor mark delete #25998) — the read loop never finishes and the
future stays pending. The 60s topicLoadTimeoutSeconds only fails the individual topic-load
future; it does not clear the poisoned policyCacheInitMap entry or close the stuck reader, so the
namespace stays poisoned and every later load times out the same way.
Cleanup paths removed the future without completing it. Several paths drop the entry from
policyCacheInitMap but never complete the future — most importantly the namespace-bundle unload
path (removeOwnedNamespaceBundleAsync). They relied on the reader being closed and the init chain
failing to complete the future indirectly; if that didn't happen, awaiting topic loads hung. This
matches the "futures accumulate until the broker is restarted" symptom in the report.

This is defense-in-depth that is complementary to #25998 (which fixes one concrete stuck-reader
trigger on the broker/cursor side): it guarantees the per-namespace init future is always completed,
so a single stuck/aborted initialization can no longer take a whole namespace's topics down until
restart.

Modifications

Add topicPoliciesCacheInitTimeoutSeconds (default 60, dynamic). It bounds topic policies cache
initialization for a namespace. Set to 0/negative to disable (previous unbounded behavior).
prepareInitPoliciesCacheAsync schedules a timeout that fails the init future. On timeout, an
identity-guarded cleanup (cleanupAfterPolicyCacheInitTimeout) clears the cached state and
closes the stuck reader only when the timed-out future is still the current one — it captures
the reader before the gate and uses policyCacheInitMap.remove(ns, future) /
readerCaches.remove(ns, reader), so a concurrent retry (or an unload that already replaced the
entry) is never clobbered. A new pulsar.broker.topic.policies.cache.init.timeout.count
OpenTelemetry counter records these events. The timeout task is cancelled as soon as initialization
completes, so it adds no overhead on the normal path.
cleanPoliciesCacheInitMap and close() now complete any pending init future they drop
(exceptionally), so awaiting topic loads fail fast and retry with a fresh reader instead of hanging.
The completion is done outside the ConcurrentHashMap#compute remapping function, because
completing the future can run the awaiting topic-load callbacks synchronously and doing that while
holding the bin lock risks a recursive map update / deadlock (the hazard addressed in [Bug] [broker] Concurrent error in SystemTopicBasedTopicPoliciesService#prepareInitPoliciesCacheAsync #24977).

Verifying this change

This change added tests and can be verified as follows:

...TopicPoliciesServiceTest#testPrepareInitPoliciesCacheAsyncTimesOutWhenReaderStuck: spies the
__change_events reader so it reports more events but never delivers one (a stuck reader), then
asserts prepareInitPoliciesCacheAsync fails with a TimeoutException instead of hanging, that the
poisoned policyCacheInitMap entry is cleared, and that the stuck reader is closed. Verified red
without the fix and green with it.
...TopicPoliciesServiceTest#testCleanPoliciesCacheInitMapCompletesPendingInitFuture: asserts that
dropping a pending init future (both the reader-close and non-reader-close branches) completes it
exceptionally and removes it from the map, and that an already-completed future is left untouched.
The full SystemTopicBasedTopicPoliciesServiceTest suite passes, including the existing
init/cleanup tests (cleanup call counts and behavior unchanged on the normal path).

Does this pull request potentially affect one of the following parts:

The default values of configurations (new topicPoliciesCacheInitTimeoutSeconds, default 60s; topic policies cache initialization is now bounded by default instead of unbounded)
The metrics (new counter pulsar.broker.topic.policies.cache.init.timeout.count)

…make a namespace's topics unloadable Topic loading waits for the namespace's topic policies cache to be initialized by reading the __change_events system topic to the end (SystemTopicBasedTopicPoliciesService#initPolicesCache), which completes a shared per-namespace future in policyCacheInitMap that every topic load awaits. That future could be left pending forever, leaving every topic in the namespace stuck and unloadable until the broker was restarted (issue apache#25294), in two ways: 1. The read loop had no timeout: a system-topic reader that reconnected but stopped making progress pinned the future indefinitely. 2. Several cleanup paths removed the future from policyCacheInitMap without ever completing it (most importantly the namespace-bundle unload path, removeOwnedNamespaceBundleAsync), relying on the reader being closed and the init chain failing to complete it indirectly. Modifications: - Add topicPoliciesCacheInitTimeoutSeconds (default 60s, dynamic). prepareInitPoliciesCacheAsync now schedules a timeout that fails the init future, and via an identity-guarded cleanup (cleanupAfterPolicyCacheInitTimeout) clears the cached state and closes the stuck reader only when the timed-out future is still the current one, so a concurrent retry/unload is never clobbered. A new metric pulsar.broker.topic.policies.cache.init.timeout.count counts these events. - cleanPoliciesCacheInitMap and close() now complete any pending init future they drop (exceptionally), so awaiting topic loads fail fast and retry instead of hanging. Completion happens outside the ConcurrentHashMap compute() remapping to avoid a recursive update / deadlock (apache#24977). Assisted-by: Claude Code (Opus 4.8)

void-ptr974

I found a possible race here. For example, init A times out and cleanupAfterPolicyCacheInitTimeout removes A by identity. Then a later topic load starts init B for the same namespace and installs a new future/reader. If reader A later fails or is closed, this old callback calls cleanPoliciesCacheInitMap(namespace, ...) by namespace, so it may remove/close init B's newer future/reader.

Maybe we can add an identity check before cleanup here to avoid cleaning up state from a newer init attempt.

…er failure can't clobber a retry Addresses review feedback on apache#26025. The timeout cleanup was identity-guarded, but the reader-chain exceptionally cleanup in prepareInitPoliciesCacheAsync still removed/closed state by namespace key. If init A timed out (its future already dropped by the identity-guarded timeout cleanup and its reader closed) and a retry installed a fresh future/reader B for the same namespace, A's late reader-failure callback called cleanPoliciesCacheInitMap(namespace, ...) by key and dropped B's future and closed B's reader. When B had already finished initializing and was tail-following __change_events, that left the namespace with no reader and no init entry, so topic policy updates stopped until init was re-triggered. Generalize the timeout's identity-guarded cleanup into cleanupFailedPolicyCacheInit(namespace, initFuture, closeReader) and route both the timeout path and both reader-chain exceptionally branches through it, so a callback that no longer owns the namespace is a no-op. closeReader=false preserves the previous transient-read-error semantics (drop the init future only, keep the reader cached for the retry to reuse). cleanPoliciesCacheInitMap (by namespace) is still used by the unload and closed paths. The dropped future is completed outside any ConcurrentHashMap remapping function, preserving the apache#24977 deadlock guard. Add testCleanupFailedPolicyCacheInitIsIdentityGuarded covering both the stale-future no-op and the owner-future teardown. Assisted-by: Claude Code (Opus 4.8)

lhotari · 2026-06-14T09:59:06Z

thanks @void-ptr974, pushed another commit to fix that

… the identity-guarded cleanup The two existing exception-path tests in SystemTopicBasedTopicPoliciesServiceTest counted cleanPoliciesCacheInitMap invocations. Routing the reader-chain exceptionally cleanup through the identity-guarded cleanupFailedPolicyCacheInit moved those calls off cleanPoliciesCacheInitMap, so the counts changed even though the cleanup still runs exactly once per failure trigger: - testPrepareInitPoliciesCacheAsyncThrowExceptionAfterCreateReader: cleanPoliciesCacheInitMap runs once (readMorePoliciesAsync on the closed reader) and cleanupFailedPolicyCacheInit once (the second prepareInitPoliciesCacheAsync failing in initPolicesCache) — was 2x cleanPoliciesCacheInitMap. - testPrepareInitPoliciesCacheAsyncThrowExceptionInCreateReader: reader creation fails, so cleanup runs once via cleanupFailedPolicyCacheInit and never via cleanPoliciesCacheInitMap — was 1x cleanPoliciesCacheInitMap. Assert the moved counts so the tests still pin "cleanup runs exactly once per trigger, no recursion". Assisted-by: Claude Code (Opus 4.8)

…ap-timeout-fix

void-ptr974

LGTM. I rechecked the cleanup path and the related tests pass locally. Thanks for the update.

…make a namespace's topics unloadable (#26025) (cherry picked from commit 24f2270)

lhotari · 2026-07-01T19:19:07Z

Follow up PR #26132 to address a remaining gap that came to the surface after this PR.

lhotari requested review from Technoboy-, dao-jun, merlimat and nodece June 13, 2026 16:32

lhotari force-pushed the lh-policyCacheInitMap-timeout-fix branch 2 times, most recently from c94b39b to 93fa3ba Compare June 13, 2026 17:03

lhotari changed the title ~~[fix][broker] Time out topic policies cache initialization when the __change_events reader is stuck~~ [fix][broker] Don't let a stuck or aborted topic policies cache init make a namespace's topics unloadable Jun 13, 2026

lhotari force-pushed the lh-policyCacheInitMap-timeout-fix branch from 93fa3ba to 33f4ddd Compare June 13, 2026 17:23

lhotari requested a review from BewareMyPower June 13, 2026 22:12

merlimat approved these changes Jun 14, 2026

View reviewed changes

void-ptr974 reviewed Jun 14, 2026

View reviewed changes

lhotari added 2 commits June 14, 2026 13:58

Merge remote-tracking branch 'origin/master' into lh-policyCacheInitM…

b4fd2ce

…ap-timeout-fix

void-ptr974 approved these changes Jun 14, 2026

View reviewed changes

lhotari merged commit 24f2270 into apache:master Jun 14, 2026
43 checks passed

lhotari added release/4.2.3 release/4.0.12 labels Jun 14, 2026

lhotari added a commit that referenced this pull request Jun 22, 2026

[fix][broker] Don't let a stuck or aborted topic policies cache init …

e2f14fd

…make a namespace's topics unloadable (#26025) (cherry picked from commit 24f2270)

lhotari added the cherry-picked/branch-4.2 label Jun 22, 2026

lhotari added a commit that referenced this pull request Jun 22, 2026

[fix][broker] Don't let a stuck or aborted topic policies cache init …

d42ac7a

…make a namespace's topics unloadable (#26025) (cherry picked from commit 24f2270)

lhotari added the cherry-picked/branch-4.0 label Jun 22, 2026

Technoboy- assigned lhotari Jun 30, 2026

This was referenced Jun 30, 2026

[Bug] [broker] topic policy fail to initialize result into unavailability of the entire cluster. #25116

Closed

[fix][broker] Don't let a closing topic-policies reader abort a concurrent cache-init reload #26132

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[fix][broker] Don't let a stuck or aborted topic policies cache init make a namespace's topics unloadable#26025

[fix][broker] Don't let a stuck or aborted topic policies cache init make a namespace's topics unloadable#26025
lhotari merged 4 commits into
apache:masterfrom
lhotari:lh-policyCacheInitMap-timeout-fix

lhotari commented Jun 13, 2026 •

edited

Loading

Uh oh!

void-ptr974 left a comment

Uh oh!

lhotari commented Jun 14, 2026

Uh oh!

void-ptr974 left a comment

Uh oh!

Uh oh!

lhotari commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

lhotari commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Modifications

Verifying this change

Does this pull request potentially affect one of the following parts:

Uh oh!

void-ptr974 left a comment

Choose a reason for hiding this comment

Uh oh!

lhotari commented Jun 14, 2026

Uh oh!

void-ptr974 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

lhotari commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lhotari commented Jun 13, 2026 •

edited

Loading