Add decoupled support for BLS #203
Conversation
Force-pushed 79dda74 to b2c3ac9.
Force-pushed b2c3ac9 to 89bc459 (… async_stream_exec()).
@Tabrizian The above comments are addressed; please review. For the shm memory leak that I mentioned before, I'm still looking into it and will update here once it's fixed. Thanks!
README.md
Outdated

    forever.

    - Currently, BLS can not run inference on a decoupled model.
    - BLS can not run inference on a decoupled model using functions
Let's remove these two since they are not limitations of BLS.
Just to clarify: the limitations that should be removed do not include the one stating that BLS can not run inference on a decoupled model in *async* mode, is this correct?
Sorry, we just need to remove the current bullet point. Perhaps we can reword the second bullet point to "Async BLS is not supported when running a Python model in decoupled mode".
Updated the limitation.
In this PR, ~~two APIs, `InferenceRequest.stream_exec()` and `InferenceRequest.async_stream_exec()`, are added for BLS decoupled support~~ two arguments, `decoupled` and `execution_timeout`, are added to the original `exec()` and `async_exec()` functions for BLS decoupled support. Here is the design doc for reference.

Under the hood, chained futures are used for retrieving responses from decoupled models (please refer to the design here). ~~Hence, instead of using a generator, the futures will gather the responses and return them as a list.~~ A generator that contains all the responses will be returned.

Update: currently, all the responses are retrieved first and then transferred to the user's Python model. In the next release, we plan to fix this and send the responses as they are being received.
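For illustration, here is a minimal sketch of how a BLS call into a decoupled model might look with the new `decoupled` argument. The model name, tensor names, and error handling below are placeholders rather than anything taken from this PR, and `execution_timeout` is left out since its exact units and semantics are not spelled out here:

```python
import triton_python_backend_utils as pb_utils


def call_decoupled_model(input_array):
    # Build a BLS request against a (placeholder) decoupled model.
    infer_request = pb_utils.InferenceRequest(
        model_name="decoupled_model",          # placeholder model name
        requested_output_names=["OUTPUT"],     # placeholder output name
        inputs=[pb_utils.Tensor("INPUT", input_array)])

    # With decoupled=True, exec() is expected to return a generator of
    # responses rather than a single response (sketch based on this PR's
    # description, not a verified signature).
    responses = infer_request.exec(decoupled=True)

    outputs = []
    for response in responses:
        if response.has_error():
            raise pb_utils.TritonModelException(response.error().message())
        out = pb_utils.get_output_tensor_by_name(response, "OUTPUT")
        # The final response from a decoupled model may carry no output
        # tensor, so skip empty responses.
        if out is not None:
            outputs.append(out.as_numpy())
    return outputs
```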
For the timeout parameter mentioned in the design doc, a timeout between two consecutive responses from the generator will not be needed, since the chained-futures implementation handles the case where a missing `TRITONSERVER_RESPONSE_COMPLETE_FINAL` flag would otherwise lead to an infinite loop.

Testing: triton-inference-server/server#5245
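For the async path, here is a hedged sketch of what calling `async_exec(decoupled=True)` from a (non-decoupled) Python model might look like; per the reworded limitation above, async BLS is not supported when the calling model itself runs in decoupled mode, and all names below are placeholders:

```python
import triton_python_backend_utils as pb_utils


# Assumes the calling model's execute() is declared as a coroutine
# (async def execute) so that BLS requests can be awaited.
async def call_decoupled_model_async(input_array):
    infer_request = pb_utils.InferenceRequest(
        model_name="decoupled_model",        # placeholder model name
        requested_output_names=["OUTPUT"],   # placeholder output name
        inputs=[pb_utils.Tensor("INPUT", input_array)])

    # Sketch: async_exec(decoupled=True) is assumed to resolve to the same
    # generator of responses that exec(decoupled=True) returns.
    responses = await infer_request.async_exec(decoupled=True)

    outputs = []
    for response in responses:
        if response.has_error():
            raise pb_utils.TritonModelException(response.error().message())
        out = pb_utils.get_output_tensor_by_name(response, "OUTPUT")
        if out is not None:
            outputs.append(out.as_numpy())
    return outputs
```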