✨[Feature] Duplicate Subgraph TRTEngine Detection, Caching, and Reuse #2674

Closed
Tracked by #2684
gs-olive opened this issue Mar 6, 2024 · 1 comment
Labels: feature request

Comments

@gs-olive
Collaborator

gs-olive commented Mar 6, 2024

Context

In models with segmentation and fallback, the resulting subgraphs often share the same structure, input shapes, and data types. In such cases, there is no need to rebuild and recompile the TRTEngine from scratch. We can instead cache the compiled engine and reuse it for each equivalent subgraph.

Proposal

Automatically detect duplicate subgraphs, cache serialized engines from prior compilations in the same session, then clone and reuse them in subsequent compilations. Weight refitting is likely necessary here, since the key difference between otherwise identical subgraphs will be the weight values themselves.
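
As a rough illustration of the flow proposed above, here is a minimal sketch in plain Python. It is not the Torch-TensorRT implementation: `Subgraph`, `structural_key`, `EngineCache`, and `fake_build` are hypothetical names made up for this example, the "engine" is just a byte string, and the refit step is simulated by attaching the subgraph's weights to a copy of the cached engine. The point is only that the cache key covers structure, input shapes, and dtypes while deliberately excluding weight values.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Subgraph:
    """Toy stand-in for a partitioned subgraph (hypothetical, for illustration)."""
    nodes: tuple          # topologically ordered (op_name, producer_indices) pairs
    input_shapes: tuple   # one shape tuple per input
    input_dtypes: tuple   # one dtype string per input
    weights: dict = field(default_factory=dict)


def structural_key(sg: Subgraph) -> str:
    """Hash structure, shapes, and dtypes. Weight values are deliberately excluded."""
    payload = repr((sg.nodes, sg.input_shapes, sg.input_dtypes)).encode()
    return hashlib.sha256(payload).hexdigest()


def fake_build(sg: Subgraph) -> bytes:
    """Placeholder for the expensive engine build; returns a fake serialized engine."""
    return ("engine:" + structural_key(sg)).encode()


class EngineCache:
    """Session-local cache of serialized engines keyed by subgraph structure."""

    def __init__(self, build_fn):
        self._build_fn = build_fn
        self._engines = {}
        self.builds = 0  # number of times the expensive build actually ran

    def get(self, sg: Subgraph) -> dict:
        key = structural_key(sg)
        if key not in self._engines:
            self._engines[key] = self._build_fn(sg)
            self.builds += 1
        # Clone the cached engine, then attach this subgraph's own weights.
        # A real implementation would refit the cloned engine instead.
        return {"engine": bytes(self._engines[key]), "weights": dict(sg.weights)}


nodes = (("conv2d", ()), ("relu", (0,)))
sg_a = Subgraph(nodes, ((1, 3, 224, 224),), ("float32",), {"conv.weight": [0.1, 0.2]})
sg_b = Subgraph(nodes, ((1, 3, 224, 224),), ("float32",), {"conv.weight": [0.7, 0.9]})
sg_c = Subgraph(nodes, ((1, 3, 112, 112),), ("float32",), {"conv.weight": [0.1, 0.2]})

cache = EngineCache(fake_build)
e_a, e_b, e_c = cache.get(sg_a), cache.get(sg_b), cache.get(sg_c)
print(cache.builds)
```

Here `sg_a` and `sg_b` differ only in weight values, so they share one build, while `sg_c` has a different input shape and triggers a second build.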

@narendasan
Collaborator

Implemented in 2.5
