
Conversation

@kthui (Contributor) commented on May 22, 2024

Previous PR: triton-inference-server/server#7254
Related PR: triton-inference-server/server#7257

Modify the non-decoupled inference request to use the decoupled data pipeline, and add response sender support to non-decoupled models.

Next PR: #361
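The change described above lets a non-decoupled Python model send its single response through a response sender (as decoupled models already do) and return `None` from `execute`, instead of returning a list of responses. The sketch below illustrates that pattern; the `Mock*` classes are stand-ins so it runs standalone, and the flag value is an assumption — the real model would use `pb_utils.InferenceResponse`, `request.get_response_sender()`, and `pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL` from `triton_python_backend_utils`.

```python
# Hedged sketch of response-sender usage in a non-decoupled model.
# Mock classes below stand in for the Triton Python backend objects
# so the example is self-contained; names mirror the real API.

TRITONSERVER_RESPONSE_COMPLETE_FINAL = 1  # placeholder flag value


class MockResponseSender:
    """Stand-in for the response sender handed out per request."""

    def __init__(self):
        self.sent = []

    def send(self, response=None, flags=0):
        self.sent.append((response, flags))


class MockRequest:
    """Stand-in for pb_utils.InferenceRequest."""

    def __init__(self, value):
        self.value = value
        self._sender = MockResponseSender()

    def get_response_sender(self):
        return self._sender


class TritonPythonModel:
    def execute(self, requests):
        # Non-decoupled contract with response-sender support: send exactly
        # one response per request, marked FINAL, and return None.
        for request in requests:
            sender = request.get_response_sender()
            response = {"OUTPUT0": request.value * 2}  # stand-in for InferenceResponse
            sender.send(response, flags=TRITONSERVER_RESPONSE_COMPLETE_FINAL)
        return None


if __name__ == "__main__":
    requests = [MockRequest(3), MockRequest(5)]
    TritonPythonModel().execute(requests)
    for request in requests:
        print(request.get_response_sender().sent)
```

A model that sends more than one response, or omits the FINAL flag, would be misusing the non-decoupled contract — the follow-up PR #363 in the commit list below adds exactly that check.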

@Tabrizian (Member) left a comment


Minor comments. Nice refactor 🚀

@kthui kthui marked this pull request as ready for review May 23, 2024 05:39
@kthui kthui requested a review from Tabrizian May 23, 2024 05:48
@kthui kthui merged commit 01ba273 into jacky-res-sender-main May 31, 2024
@kthui kthui deleted the jacky-res-sender-unify branch May 31, 2024 22:11
kthui added a commit that referenced this pull request Jun 6, 2024
* Add response sender to non-decoupled models and unify data pipelines (#360)

* Add response sender to non-decoupled model and unify data pipelines

* Rename variable and class name

* Fix decoupled batch statistics to account for implicit batch size (#361)

* Fix decoupled gpu output error handling (#362)

* Fix decoupled gpu output error handling

* Return full error string upon exception from model

* Response sender to check for improper non-decoupled model usage (#363)

* Response sender to check for improper non-decoupled model usage

* Force close response sender on exception

* Rename functions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
