Add documentation for decoupled support #150
Conversation
Could you please also add a link to the decoupled examples in the examples section?
README.md (outdated):

> - [Usage](#usage)
> - [`initialize`](#initialize)
> - [`execute`](#execute)
> - [Non-Decoupled mode](#non-decoupled-mode)
How about renaming Non-Decoupled mode to Default Mode?
README.md (outdated):

> function. The mode you choose should depend on your use case. That is whether
> or not you want to return decoupled responses from this model or not.
>
> #### Non-Decoupled mode
Default Mode
README.md (outdated):

> InferenceRequest objects passed to the function are deleted, and so
> InferenceRequest objects should not be retained by the Python model.
>
> In case one of the inputs has an error, you can use the `TritonError` object
inputs -> requests
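For context, the README excerpt above describes the default (non-decoupled) mode, where `execute` returns exactly one response per request and a per-request error is reported by attaching a `TritonError` to that request's response. A minimal sketch of that pattern, using small stand-in classes in place of the real `triton_python_backend_utils` (`pb_utils`) module — the stand-ins and the `process` helper are hypothetical, not the backend's actual implementation:

```python
# Stand-ins for pb_utils.TritonError and pb_utils.InferenceResponse;
# the real classes carry tensors and richer metadata.
class TritonError:
    def __init__(self, message):
        self.message = message

class InferenceResponse:
    def __init__(self, output_tensors=None, error=None):
        self.output_tensors = output_tensors or []
        self.error = error

def process(request):
    """Hypothetical per-request work; raises on a bad request."""
    if request.get("bad"):
        raise ValueError("malformed input tensor")
    return [request["x"] * 2]

def execute(requests):
    """Default mode: return exactly one response per request, in order."""
    responses = []
    for request in requests:
        try:
            responses.append(InferenceResponse(output_tensors=process(request)))
        except ValueError as exc:
            # Attach the error to this request's response rather than
            # failing the whole batch.
            responses.append(InferenceResponse(error=TritonError(str(exc))))
    return responses
```

The point of the pattern is that one bad request does not abort the batch: every request still receives its own response, with or without an error.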
README.md (outdated):

> object, use InferenceResponseSender.send() to send response with the
> error back to the user.
>
> ##### Special Cases
Special Cases -> Example Use Cases
README.md (outdated):

> #### Decoupled mode \[Beta\]
Mention that in order to use this mode they should set the transaction policy to decoupled.
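For reference, the transaction policy is set in the model's `config.pbtxt`. A minimal sketch of such a configuration — the model name is a placeholder, and a real config would also declare inputs and outputs:

```
name: "my_decoupled_model"
backend: "python"

model_transaction_policy {
  decoupled: True
}
```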
README.md (outdated):

> ##### Special Cases
>
> The decoupled mode is powerful and supports various special cases:
special cases -> use cases
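To illustrate the decoupled flow under discussion: each request exposes a response sender, `execute` returns `None`, and the model may send zero, one, or many responses per request before closing the stream with a final flag. A minimal sketch with stand-in classes in place of the real `pb_utils` API (the real calls are `request.get_response_sender()`, `InferenceResponseSender.send()`, and the `pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL` flag; the dict-based requests here are hypothetical):

```python
TRITONSERVER_RESPONSE_COMPLETE_FINAL = 1  # stand-in for the real flag constant

class ResponseSender:
    """Stand-in for the sender returned by request.get_response_sender()."""
    def __init__(self):
        self.sent = []
        self.closed = False

    def send(self, response=None, flags=0):
        if response is not None:
            self.sent.append(response)
        if flags & TRITONSERVER_RESPONSE_COMPLETE_FINAL:
            self.closed = True

def execute(requests):
    """Decoupled mode: stream responses through senders; return None."""
    for request in requests:
        sender = request["sender"]  # real API: request.get_response_sender()
        # One request may produce any number of responses (here: n of them).
        for i in range(request["n"]):
            sender.send(response={"value": i})
        # Signal that no more responses will follow for this request.
        sender.send(flags=TRITONSERVER_RESPONSE_COMPLETE_FINAL)
    return None
```

This is what decouples response count from request count: a request may legitimately produce zero responses, as long as the final flag is still sent to close its stream.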
README.md (outdated):

> The support for decoupled models is still in beta and suffers
> from below known issues:
>
> * The decoupled mode does not support [GPU tensors](#interoperability-and-gpu-support).
"The decoupled mode doesn't support FORCE_CPU_ONLY_INPUT_TENSORS parameter to be turned off. This means that the input tensors will always be in CPU."
On a separate bullet point:
"Currently, the InferenceResponseSender.send method only supports inference_response objects that contain only CPU tensors."
README.md (outdated):

> from below known issues:
>
> * The decoupled mode does not support [GPU tensors](#interoperability-and-gpu-support).
> * Inferences on a decoupled model can not be run within [Business Logic Scripting](#business-logic-scripting).
Can we mention this limitation in the BLS limitations? It might confuse the users that they can't run BLS requests in the decoupled API mode.
I think we should mention this limitation in both places. I will try to clarify here.
You are right. Logically it is more of a BLS limitation.
@tanmayv25, @Tabrizian could you please wrap it up?
* Add documentation for decoupled support
* Improve the documentation