⚠️ fix goroutine leak in cache#13487
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
This PR is currently missing an area label, which is used to identify the modified component when generating release notes. Area labels can be added by org members by writing Please see the labels list for possible areas. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
45abe5e to
50b15e6
Compare
Signed-off-by: sivchari <shibuuuu5@gmail.com>
50b15e6 to
568b11e
Compare
|
@sivchari: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
The proposed fix seems to be correct, but I wanted to raise a concern about the approach: this changes the signatures of several public APIs ( An alternative that avoids all of the public API changes would be to add a type Cache[E Entry] interface {
Add(entry E)
Has(key string) (E, bool)
Len() int
DeleteAll()
Close()
}
func New[E Entry](ttl time.Duration) Cache[E] {
done := make(chan struct{})
r := &cache[E]{
Store: kcache.NewTTLStore(func(obj any) (string, error) {
return obj.(E).Key(), nil
}, ttl),
done: done,
}
go func() {
for {
r.List()
select {
case <-done:
return
case <-time.After(expirationInterval):
}
}
}()
return r
}
type cache[E Entry] struct {
kcache.Store
done chan struct{}
closed sync.Once
}
func (r *cache[E]) Close() {
r.closed.Do(func() { close(r.done) })
}For the func (blder *Builder) Build(r reconcile.TypedReconciler[reconcile.Request]) (Controller, error) {
// ... existing setup ...
reconcileCache := cache.New[reconcileCacheEntry](cache.DefaultTTL)
// Clean up the cache goroutine when the manager stops.
_ = blder.mgr.Add(manager.RunnableFunc(func(ctx context.Context) error {
<-ctx.Done()
reconcileCache.Close()
return nil
}))
// ... rest of Build unchanged ...
}For controllers that create caches in func (r *Reconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager, options controller.Options) error {
// ... existing builder chain ...
c, err := /* ... */.Build(r)
if err != nil {
return errors.Wrap(err, "failed setting up with a controller manager")
}
r.hookCache = cache.New[cache.HookEntry](cache.HookCacheDefaultTTL)
// Clean up when the manager stops.
_ = mgr.Add(manager.RunnableFunc(func(ctx context.Context) error {
<-ctx.Done()
r.hookCache.Close()
return nil
}))
// ... rest unchanged ...
}The same approach could work for One downside is that callers need to remember to call And it's potentially breaking for implementors of But maybe there are reasons to prefer the current PR to this approach. What do you think? |
|
Thanks for the detailed suggestion. The apidiff failure is expected, as the API change is intentional. Regarding the alternative Close() approach, adding Close() to the Cache interface would also be a breaking change for any external implementors, so neither approach avoids breaking compatibility on that front. Furthermore, with the ctx-based approach, goroutine cleanup happens automatically via context cancellation. In contrast, with Close(), every caller must remember to call it or wire up mgr.Add, and forgetting to do so would silently re-introduce the leak. For these reasons, I believe tying the goroutine lifecycle to the context is the safer design. |
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Fixes #