Skip to content

Conversation

@csviri
Copy link
Collaborator

@csviri csviri commented Oct 10, 2025

At this point it is safe to say that we can compare resource versions, that is simplifies lots of our algortihms in JOSDK. This PR simplifies the dependent resource related event skipping and caching parts.

Will clearly describe this in release notes, along with the removed config properties.

csviri and others added 15 commits October 7, 2025 12:47
this was added to fabric8 client meanwhile

Signed-off-by: Attila Mészáros <[email protected]>
)

* feature: add AggregatedMetrics to support multiple Metrics implementations


Signed-off-by: David Sondermann <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Co-authored-by: Martin Stefanko <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
@csviri csviri linked an issue Oct 10, 2025 that may be closed by this pull request
3 tasks
@csviri csviri changed the title Comparing resource version dr Event skipping and temp resource caching based on resourceVersion comparison Oct 10, 2025
> Long.parseLong(cachedResource.getMetadata().getResourceVersion());
} catch (NumberFormatException e) {
log.debug(
log.warn(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this even be an error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, maybe even throw an exception at this point?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the only problematic behavior could be poorly written mock servers that are providing invalid resourceVersions.

var res = temporaryResourceCache.getResourceFromCache(resourceID);
if (res.isEmpty()) {
return isEventKnownFromAnnotation(newObject, oldObject);
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handling doesn't seem directly related to resourceVersion comparability. It's to handle the case where the event comes in before the create / update puts the result in the cache.

Do you want to not worry about this case now, or is there some other locking or other logic not in this pr handling this case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this alone is not enough, but we might simplify it anyways, and just react on event that our last updated resource event.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this a bit more? I'm not sure at the point in time if we have enough information to know if the incoming event is from our reconciliation, or another actor.

Copy link
Collaborator Author

@csviri csviri Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an outdated version but the current one:
this implementation assumes that we want to trigger reconiliation only on newer version of resources, that came after our update. So basically we store the latest updated version even if we don't cache the resource in the temporal cache in case resource versions are equal. So new resources can be compared with this one. Essentially dropping events for our update and the updates happened before.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify here's the timing sequence that the code is guarding against:

  • The reconciliation starts to do an update of a resource at v1.
  • Before the kubernetes client receives that result, the informer receives the event for the transition from v1 -> v2
  • That event is later than the lastest known event to the TemporaryResourceCache because it has yet to transition to v2, so canSkipEvent will return false.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, yes, thank you pointing this out, might separate the PR and see if It is possible to handle that without the annotation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember the only two approaches are locking or the annotation. Locking is more intrusive in other parts of the code, whereas the current annotation approach is prone to reconciliation loops if SSA change detection isn't perfect. We could probably come up with a way to reason over a stable hash, but that starts seeming expensive for this problem.

@csviri csviri marked this pull request as draft October 10, 2025 12:47
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 10, 2025
Signed-off-by: Attila Mészáros <[email protected]>
if (resource.isPresent()) {
return isLaterResourceVersion(newResource, resource.get());
}
var latestUpdated = latestUpdatedVersion.get(resourceID);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strategy can also be used instead of tombstones. If we attempt to put something that already has been deleted, the delete event (and the latest in general) is guarenteed to have been based upon a resourceVersion that is newer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have the time pls add a pr for that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have the time pls add a pr for that.

Since that would be based upon the latest tracking, could you introduce / merge that (which will effectively shadow the latest tracked by the informer, but I'm not sure if you have anyone trying to use that directly) without the changes to the previous annotation? Or do you think you can complete that work in the near term?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think this can just be something like #2989

Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 16, 2025
@openshift-merge-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@csviri csviri force-pushed the next branch 2 times, most recently from 3075f11 to 4972764 Compare October 23, 2025 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Comparable Resource Versions in Kubernetes

6 participants