-
Notifications
You must be signed in to change notification settings - Fork 232
Event skipping and temp resource caching based on resourceVersion comparison #2987
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: next
Are you sure you want to change the base?
Conversation
Signed-off-by: Attila Mészáros <[email protected]>
this was added to fabric8 client meanwhile Signed-off-by: Attila Mészáros <[email protected]>
) * feature: add AggregatedMetrics to support multiple Metrics implementations Signed-off-by: David Sondermann <[email protected]>
) Co-authored-by: Chris Laprun <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Chris Laprun <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]> Co-authored-by: Martin Stefanko <[email protected]>
closes: #2984 Signed-off-by: Steve Hawkins <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
| > Long.parseLong(cachedResource.getMetadata().getResourceVersion()); | ||
| } catch (NumberFormatException e) { | ||
| log.debug( | ||
| log.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this even be an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, maybe even throw an exception at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the only problematic behavior could be poorly written mock servers that are providing invalid resourceVersions.
| var res = temporaryResourceCache.getResourceFromCache(resourceID); | ||
| if (res.isEmpty()) { | ||
| return isEventKnownFromAnnotation(newObject, oldObject); | ||
| return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handling doesn't seem directly related to resourceVersion comparability. It's to handle the case where the event comes in before the create / update puts the result in the cache.
Do you want to not worry about this case now, or is there some other locking or other logic not in this pr handling this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this alone is not enough, but we might simplify it anyways, and just react on event that our last updated resource event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this a bit more? I'm not sure at the point in time if we have enough information to know if the incoming event is from our reconciliation, or another actor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an outdated version but the current one:
this implementation assumes that we want to trigger reconiliation only on newer version of resources, that came after our update. So basically we store the latest updated version even if we don't cache the resource in the temporal cache in case resource versions are equal. So new resources can be compared with this one. Essentially dropping events for our update and the updates happened before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify here's the timing sequence that the code is guarding against:
- The reconciliation starts to do an update of a resource at v1.
- Before the kubernetes client receives that result, the informer receives the event for the transition from v1 -> v2
- That event is later than the lastest known event to the TemporaryResourceCache because it has yet to transition to v2, so canSkipEvent will return false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, yes, thank you pointing this out, might separate the PR and see if It is possible to handle that without the annotation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I remember the only two approaches are locking or the annotation. Locking is more intrusive in other parts of the code, whereas the current annotation approach is prone to reconciliation loops if SSA change detection isn't perfect. We could probably come up with a way to reason over a stable hash, but that starts seeming expensive for this problem.
Signed-off-by: Attila Mészáros <[email protected]>
| if (resource.isPresent()) { | ||
| return isLaterResourceVersion(newResource, resource.get()); | ||
| } | ||
| var latestUpdated = latestUpdatedVersion.get(resourceID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This strategy can also be used instead of tombstones. If we attempt to put something that already has been deleted, the delete event (and the latest in general) is guarenteed to have been based upon a resourceVersion that is newer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have the time pls add a pr for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have the time pls add a pr for that.
Since that would be based upon the latest tracking, could you introduce / merge that (which will effectively shadow the latest tracked by the informer, but I'm not sure if you have anyone trying to use that directly) without the changes to the previous annotation? Or do you think you can complete that work in the near term?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think this can just be something like #2989
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
Signed-off-by: Attila Mészáros <[email protected]>
|
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
3075f11 to
4972764
Compare
At this point it is safe to say that we can compare resource versions, that is simplifies lots of our algortihms in JOSDK. This PR simplifies the dependent resource related event skipping and caching parts.
Will clearly describe this in release notes, along with the removed config properties.