Commit c48d114

Clarify FORCE_CPU_ONLY_INPUT_TENSORS documentation (triton-inference-server#87)
* Clarify FORCE_CPU_ONLY_INPUT_TENSORS documentation
* Review edit
1 parent 18c37ad commit c48d114

1 file changed, +11 -10 lines changed

README.md

Lines changed: 11 additions & 10 deletions
@@ -698,16 +698,17 @@ is not C-order contiguous an exception will be raised.
 
 This function can be used to check whether a tensor is placed in CPU or not.
 
-## Controlling Input Tensor Device Placement
-
-By default Python backend moves all the input tensors to CPU. Starting from
-21.09 release, you can control whether you want to move input tensors to CPU or
-let Triton decide the placement of the input tensors. If you let Triton decide
-the placement of input tensors, your Python model must be able to handle tensors
-that are in CPU or GPU. You can control this using the
-`FORCE_CPU_ONLY_INPUT_TENSORS` setting in your Python model configuration. The
-default value for this parameter is "yes". By adding the line below to your
-model config, you are letting Triton decide the placement of input Tensors:
+## Input Tensor Device Placement
+
+By default, the Python backend moves all input tensors to CPU before providing
+them to the Python model. Starting from 21.09, you can change this default
+behavior. By setting `FORCE_CPU_ONLY_INPUT_TENSORS` to "no", Triton will not
+move input tensors to CPU for the Python model. Instead, Triton will provide the
+input tensors to the Python model in either CPU or GPU memory, depending on how
+those tensors were last used. You cannot predict which memory will be used for
+each input tensor so your Python model must be able to handle tensors in both
+CPU and GPU memory. To enable this setting, you need to add this setting to the
+`parameters` section of model configuration:
 
 ```
 parameters: { key: "FORCE_CPU_ONLY_INPUT_TENSORS" value: {string_value:"no"}}
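
The new wording says a model must handle input tensors in both CPU and GPU memory once `FORCE_CPU_ONLY_INPUT_TENSORS` is set to "no". A minimal sketch of what that could look like inside a Python model follows. The helper name `input_to_host` is hypothetical, and using PyTorch for the GPU path is an assumption; any DLPack-capable framework should work similarly. The `tensor` argument is expected to behave like a `pb_utils.Tensor`, exposing `is_cpu()`, `as_numpy()`, and `to_dlpack()`.

```python
def input_to_host(tensor):
    """Return the tensor's data in host memory, wherever Triton placed it.

    `tensor` is assumed to expose the pb_utils.Tensor methods
    is_cpu(), as_numpy(), and to_dlpack(). This helper is illustrative
    and is not part of the Python backend.
    """
    if tensor.is_cpu():
        # CPU tensors can be read directly as a NumPy array.
        return tensor.as_numpy()

    # GPU tensors are handed over through DLPack instead.
    # PyTorch is an assumption here; it is imported lazily so that
    # CPU-only deployments do not need it installed.
    from torch.utils.dlpack import from_dlpack

    gpu_tensor = from_dlpack(tensor.to_dlpack())
    # Copy to host memory and expose the result as a NumPy array.
    return gpu_tensor.cpu().numpy()
```

A model that wants to keep GPU inputs on the device would skip the final copy and work with the DLPack-converted tensor directly; copying to host is shown only because it gives both branches the same return type.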
