Enable parallel instance loading backend attribute #284
As far as I can tell, the Python backend is completely thread-safe on calls to TRITONBACKEND_ModelInstanceInitialize.
All use of model_state (the problem area for instance initialization in current backends) appears to be read-only. It is simply passed to the ModelInstanceState constructor.
L0_backend_python has been hanging on the BLS model load tests, but this is now happening on main as well, so I don't believe it is due to any of the parallel instance changes. Corresponding tests: triton-inference-server/server#6126