Migrate to Neuron Runtime 2.X in model.py #93
Conversation
inferentia/README.md
Outdated
> to/from the model in correct order irrespective of how they are
> specified in `config.pbtxt`. The indices should be consecutive
> integers starting from 0. Additionally, `--neuron-core-range`
> provides the neuron cores to be used while serving these models.
specifies the neuron ...
@tanmayv25 how does the user specify non-contiguous neuron cores?
We don't support that because I believe it is not important. The data-parallel operations need to be contiguous to derive the best performance from caching.
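For illustration, a contiguous range such as `0-3` can be parsed into a core list in a few lines. This is a hypothetical sketch only: the function name and the `start-end` string format are assumptions, not the PR's actual `--neuron-core-range` handling.

```python
def parse_core_range(spec: str) -> list:
    """Parse a contiguous core range like '0-3' into [0, 1, 2, 3].

    Hypothetical sketch; the real --neuron-core-range parsing in
    model.py may differ.
    """
    start, _, end = spec.partition("-")
    cores = list(range(int(start), int(end) + 1))
    if not cores:
        raise ValueError(f"empty or reversed core range: {spec!r}")
    return cores
```

Because the range is a single `start-end` pair, non-contiguous sets cannot be expressed, which matches the behavior described above.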
```python
dt, shape = self.output_dict[name]
output_tensor = pb_utils.Tensor(name,
    merged_result.astype(pb_utils.triton_string_to_numpy(dt)))
for i in range(len(self.output_dict)):
```
There is an edge case where the outputs in the model config can be `output__0` and `output__2` (1 is skipped). In this case `range` would not be a good idea.
Why not just use `for i in self.output_dict`? Similarly for input.
We do not care about the order in which the outputs are read as long as they use the right index.
We weed out this edge case in the validation logic in `__validate_input_dict`/`validate_output_dict`. Basically, we don't support skipping indices.

> We do not care about the order in which the outputs are read as long as they use the right index.

I think that is true. I will fix it.
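The failure mode discussed above can be shown with a toy dict (the values here are illustrative stand-ins, not the PR's actual data): `range(len(...))` assumes the keys are consecutive, while iterating the dict itself does not.

```python
# Toy stand-in for self.output_dict when OUTPUT__1 is skipped in the
# model config: only indices 0 and 2 are present.
output_dict = {0: "TYPE_FP32", 2: "TYPE_FP32"}

# Buggy pattern: range(len(output_dict)) == range(2), so it probes
# key 1, which does not exist.
missing = []
for i in range(len(output_dict)):
    if i not in output_dict:
        missing.append(i)

# Safe pattern: iterate only the keys that actually exist.
seen = [i for i in output_dict]
```

With the skipped index, `missing` contains `1` under the `range`-based loop, while the direct iteration visits exactly the declared indices `0` and `2`.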
> Basically, we don't support skipping of indices.

Why do we want to skip it? The user should be allowed to ask for a subset of the outputs from the model, like with all other backends.
We cannot do this for inputs, as the order does matter for placing the input in the list.
```python
type=int,
default=4,
help='The number of available neuron cores')
parser.add_argument('--neuron_core_range',
```
We should set a default core range. Maybe just the first core?
I would want the user to be aware of this option and not silently introduce limited execution.
```python
config_input['name'], config_input['data_type'],
config_input['dims']
]
expected_input_count = expected_input_count + 1
```
`+=`?
```python
self.input_dict[config_input['name']] = [
    config_input['data_type'], config_input['dims']
index = self._validate_and_get_index(config_input['name'])
self.input_dict[index] = [
```
Do you need a dict if index is used as key? List seems to be sufficient.
We need to handle cases like `INPUT__0` and `INPUT__2`, where `INPUT__1` is missing. Using a dict helps detect missing indices with minimal iteration over the input lists.
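A dict-based check for skipped indices might look like the following. This is a hypothetical sketch of the validation described above, not the PR's actual `__validate_input_dict` implementation; the function name and error message are assumptions.

```python
def validate_consecutive_indices(index_dict, kind="input"):
    """Raise if the dict's integer keys are not exactly 0..N-1.

    Hypothetical sketch; the real validation logic in model.py
    may differ.
    """
    missing = set(range(len(index_dict))) - set(index_dict)
    if missing:
        raise ValueError(
            f"{kind} indices must be consecutive starting from 0; "
            f"missing {sorted(missing)}")
```

Because the keys are the indices themselves, a single set difference finds every gap without scanning the config lists repeatedly, e.g. `{0: ..., 2: ...}` fails with `missing [1]`.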
inferentia/README.md
Outdated
> Additionally, `--neuron-core-range` specifies the neuron cores to
> be used while serving these models. Currently, only
> `torch.neuron.DataParallel()` mode is supported.
can you specify what that means? Something like:
> Currently, the core used is specified using `torch.neuron.DataParallel()`
LGTM. Small nit.