✨[Feature] Duplicate Subgraph TRTEngine Detection, Caching, and Reuse
#2674
Labels: feature request
Context
In models with segmentation and fallback, the resulting subgraphs often share the same substructure, input shapes, and data types. In such cases, it is not necessary to rebuild and recompile the TRTEngine from scratch. We can instead cache and reuse existing compiled engines for the same subgraph.
Proposal
Automatically detect duplicate subgraphs, cache serialized engines from prior compilations in the same session, then clone and reuse them in subsequent compilations. Weight refitting is likely necessary here, since the key difference between otherwise identical subgraphs will be the weight values themselves.
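To make the idea concrete, here is a minimal sketch of the detect-and-cache step in plain Python. It is not Torch-TensorRT code: `SubgraphSpec`, `structural_key`, `EngineCache`, and `fake_build` are hypothetical names introduced for illustration, and the engine build is replaced by a stand-in that returns placeholder bytes. The sketch assumes a subgraph can be summarized by its operator sequence, input shapes, and input dtypes, and it deliberately leaves weights out of the cache key so that subgraphs differing only in weights map to the same entry.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class SubgraphSpec:
    """Hypothetical, simplified description of one accelerated subgraph."""
    ops: Tuple[str, ...]                        # operator names in topological order
    input_shapes: Tuple[Tuple[int, ...], ...]   # one shape per input
    input_dtypes: Tuple[str, ...]               # one dtype name per input


def structural_key(spec: SubgraphSpec) -> str:
    """Hash structure, shapes, and dtypes; weights are intentionally excluded."""
    payload = repr((spec.ops, spec.input_shapes, spec.input_dtypes)).encode()
    return hashlib.sha256(payload).hexdigest()


class EngineCache:
    """Session-local cache of serialized engines keyed by structural hash."""

    def __init__(self, build_fn: Callable[[SubgraphSpec], bytes]) -> None:
        self._build_fn = build_fn          # assumed to return serialized engine bytes
        self._engines: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get_or_build(self, spec: SubgraphSpec) -> bytes:
        key = structural_key(spec)
        if key in self._engines:
            self.hits += 1                 # duplicate subgraph: skip the build
        else:
            self.misses += 1               # first occurrence: build and store
            self._engines[key] = self._build_fn(spec)
        # bytes are immutable, so each caller can deserialize its own engine
        # from the shared buffer and refit its own weights afterwards.
        return self._engines[key]


def fake_build(spec: SubgraphSpec) -> bytes:
    # Stand-in for the expensive engine build; returns placeholder bytes only.
    return b"engine:" + structural_key(spec).encode()


cache = EngineCache(fake_build)
block = SubgraphSpec(("conv", "relu"), ((1, 3, 224, 224),), ("float32",))
other = SubgraphSpec(("linear",), ((1, 512),), ("float32",))

# Two structurally identical subgraphs followed by a different one.
engines = [cache.get_or_build(s) for s in (block, block, other)]
print(cache.hits, cache.misses)  # prints "1 2"
```

In a real implementation, the step after a cache hit would be to deserialize the cached engine and refit it with the weights of the current subgraph. That step is not modeled here, and the choice of what belongs in the key (for example, build settings or dynamic shape ranges) is an open design question for the proposal.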