
Commit 2489c65

Add testing setup script and some more notes to README (triton-inference-server#96)
- Add test.sh setup and run script to the inferentia/qa folder. Corresponding server PR: Add tests to inferentia (server#3586)
- Add to README: 1) pointers to how to compile a model, 2) instructions on how to run the test
1 parent d59c01e commit 2489c65

File tree: 5 files changed, +252 −12 lines

inferentia/README.md

Lines changed: 49 additions & 7 deletions
@@ -32,6 +32,17 @@ Starting from 21.11 release, Triton supports
 [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/)
 and the [Neuron Runtime](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-intro/get-started.html).

+## Table of Contents
+
+- [Using Triton with Inferentia](#using-triton-with-inferentia)
+  - [Table of Contents](#table-of-contents)
+  - [Inferentia setup](#inferentia-setup)
+  - [Setting up the Inferentia model](#setting-up-the-inferentia-model)
+    - [PyTorch](#pytorch)
+    - [TensorFlow](#tensorflow)
+  - [Serving Inferentia model in Triton](#serving-inferentia-model-in-triton)
+  - [Testing Inferentia Setup for Accuracy](#testing-inferentia-setup-for-accuracy)
+
 ## Inferentia setup

 First step of running Triton with Inferentia is to create an AWS Inferentia
@@ -50,17 +61,18 @@ Clone this repo with Github to home repo `/home/ubuntu`.
 Ensure that the neuron runtime 1.0 daemon (neuron-rtd) is not running and set up
 and install neuron 2.X runtime builds with
 ```
-sudo ./python_backend/setup-pre-container.sh
+$ chmod 777 /home/ubuntu/python_backend/inferentia/scripts/setup-pre-container.sh
+$ sudo /home/ubuntu/python_backend/inferentia/scripts/setup-pre-container.sh
 ```

 Then, start the Triton instance with:
 ```
-docker run --device /dev/neuron0 <more neuron devices> -v /home/ubuntu/python_backend:/home/ubuntu/python_backend -v /lib/udev:/mylib/udev --shm-size=1g -e "AWS_NEURON_VISIBLE_DEVICES=ALL" --ulimit memlock=-1 -p 8000:8000 -p 8001:8001 -p 8002:8002 --ulimit stack=67108864 -ti nvcr.io/nvidia/tritonserver:<xx.yy>-py3
+$ docker run --device /dev/neuron0 <more neuron devices> -v /home/ubuntu/python_backend:/home/ubuntu/python_backend -v /lib/udev:/mylib/udev --shm-size=1g --ulimit memlock=-1 -p 8000:8000 -p 8001:8001 -p 8002:8002 --ulimit stack=67108864 -ti nvcr.io/nvidia/tritonserver:<xx.yy>-py3
 ```
 Note 1: The user needs to list every neuron device to be used during container initialization.
 For example, to use 4 neuron devices on an instance, the user would need to run with:
 ```
-docker run --device /dev/neuron0 --device /dev/neuron1 --device /dev/neuron2 --device /dev/neuron3 ...`
+$ docker run --device /dev/neuron0 --device /dev/neuron1 --device /dev/neuron2 --device /dev/neuron3 ...
 ```
 Note 2: `/mylib/udev` is used for Neuron parameter passing.

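As a convenience (a sketch, not from this commit), every `/dev/neuron*` node on the host can be expanded into a `--device` flag instead of listing devices by hand; the `<xx.yy>` tag placeholder from above still needs to be substituted:

```
# Sketch: build one --device flag per Neuron device visible on the host
DEVICE_ARGS=""
for dev in /dev/neuron*; do
  DEVICE_ARGS="${DEVICE_ARGS} --device ${dev}"
done
docker run ${DEVICE_ARGS} \
  -v /home/ubuntu/python_backend:/home/ubuntu/python_backend \
  -v /lib/udev:/mylib/udev --shm-size=1g --ulimit memlock=-1 \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 --ulimit stack=67108864 \
  -ti nvcr.io/nvidia/tritonserver:<xx.yy>-py3
```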
@@ -70,7 +82,7 @@ Note 3: For Triton container version xx.yy, please refer to

 After starting the Triton container, go into the `python_backend` folder and run the setup script.
 ```
-source /home/ubuntu/python_backend/inferentia/scripts/setup .sh
+$ source /home/ubuntu/python_backend/inferentia/scripts/setup.sh
 ```
 This script will:
 1. Setup miniconda environment
@@ -84,7 +96,7 @@ There are user configurable options available for the script as well.
 For example, to set the python version for the environment to 3.6,
 you can run:
 ```
-source /home/ubuntu/python_backend/inferentia/scripts/setup.sh -v 3.6
+$ source /home/ubuntu/python_backend/inferentia/scripts/setup.sh -v 3.6
 ```
 Please use the `-h` or `--help` options to learn about more configurable options.

@@ -94,6 +106,15 @@ Currently, we only support [PyTorch](https://awsdocs-neuron.readthedocs-hosted.c
 and [TensorFlow](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/tensorflow-neuron/index.html)
 workflows for execution on inferentia.

+The user is required to create their own `*.pt` (for PyTorch) or `*.savedmodels` (for TensorFlow) models. This is
+a critical step since Inferentia will need the underlying `.NEFF` graph to execute
+the inference request. Please refer to:
+- [Neuron compiler CLI Reference Guide](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-cc/command-line-reference.html)
+- [PyTorch-Neuron trace python API](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/api-compilation-python-api.html)
+- [PyTorch Tutorials](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/tutorials/index.html)
+- [TensorFlow Tutorials](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/tensorflow-neuron/tutorials/index.html)
+
+for guidance on how to compile models.
 ### PyTorch

 For PyTorch, we support models traced by [PyTorch-Neuron trace python API](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/pytorch-neuron/api-compilation-python-api.html)
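For a concrete starting point, a minimal compilation sketch (not from this commit; assumes `torch-neuron` and `torchvision` are installed in the active environment, and the model choice, input shape, and output filename are placeholders):

```
# Sketch: trace a torchvision ResNet-50 into a TorchScript file carrying the NEFF graph
python - <<'EOF'
import torch
import torch_neuron  # registers the torch.neuron namespace
from torchvision import models

model = models.resnet50(pretrained=True).eval()
example = torch.rand(1, 3, 224, 224)                  # placeholder input shape
traced = torch.neuron.trace(model, example_inputs=[example])
traced.save("model.pt")                               # place this in the model repository
EOF
```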
@@ -195,11 +216,32 @@ a valid torchscript file or tensorflow savedmodel.
 Now, the server can be launched with the model as below:

 ```
-tritonserver --model-repository <path_to_model_repository>
+$ tritonserver --model-repository <path_to_model_repository>
 ```

 Note:
 1. The `config.pbtxt` and `model.py` should be treated as
 a starting point. The users can customize these files as per
 their need.
-2. Triton Inferentia is currently tested with a **single** model.
+2. Triton Inferentia is currently tested with a **single** model.
+
+## Testing Inferentia Setup for Accuracy
+The [qa folder](https://github.com/triton-inference-server/python_backend/tree/main/inferentia/qa)
+contains the necessary files to set up testing with a simple add_sub model. The test
+requires an instance with more than 8 inferentia cores, e.g. `inf1.6xlarge`.
+To start the test, run
+```
+$ source <triton path>/python_backend/inferentia/qa/setup_test_enviroment_and_test.sh
+```
+where `<triton path>` is usually `/home/ubuntu`.
+This script will pull the [server repo](https://github.com/triton-inference-server/server)
+that contains the tests for inferentia. It will then build the most recent
+Triton Server and Triton SDK.
+
+Note: If you need to change some of the tests in the server repo,
+you would need to run
+```
+$ export TRITON_SERVER_BRANCH_NAME=<your branch name>
+```
+before running the script.
+
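Once the server is launched, a quick smoke check over the HTTP endpoint can confirm the setup (a sketch, not from this commit; `add_sub` is the sample model name used by the qa setup):

```
# Readiness probe (KServe v2 HTTP API on the default port 8000)
curl -sf localhost:8000/v2/health/ready && echo "server ready"
# Metadata for the sample model
curl -s localhost:8000/v2/models/add_sub
```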

inferentia/qa/Dockerfile.QA

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# Copyright 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#  * Redistributions of source code must retain the above copyright
+#    notice, this list of conditions and the following disclaimer.
+#  * Redistributions in binary form must reproduce the above copyright
+#    notice, this list of conditions and the following disclaimer in the
+#    documentation and/or other materials provided with the distribution.
+#  * Neither the name of NVIDIA CORPORATION nor the names of its
+#    contributors may be used to endorse or promote products derived
+#    from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# Multistage build.
+#
+ARG BASE_IMAGE=tritonserver
+ARG BUILD_IMAGE=tritonserver_build
+ARG SDK_IMAGE=tritonserver_sdk
+ARG TRITON_PATH=/home/ubuntu
+
+FROM ${SDK_IMAGE} AS sdk
+FROM $BASE_IMAGE
+# Ensure apt-get won't prompt for selecting options
+ENV DEBIAN_FRONTEND=noninteractive
+# install platform specific packages
+RUN if [ $(cat /etc/os-release | grep 'VERSION_ID="20.04"' | wc -l) -ne 0 ]; then \
+        apt-get update && \
+        apt-get install -y --no-install-recommends \
+            libpng-dev; \
+    elif [ $(cat /etc/os-release | grep 'VERSION_ID="18.04"' | wc -l) -ne 0 ]; then \
+        apt-get update && \
+        apt-get install -y --no-install-recommends \
+            libpng-dev; \
+    else \
+        echo "Ubuntu version must be either 18.04 or 20.04" && \
+        exit 1; \
+    fi
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+        python3-dev \
+        python3-pip \
+        build-essential \
+        wget && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN rm -f /usr/bin/python && \
+    ln -s /usr/bin/python3 /usr/bin/python
+
+RUN pip3 install --upgrade wheel setuptools && \
+    pip3 install --upgrade numpy pillow attrdict future grpcio requests gsutil awscli six grpcio-channelz
+
+WORKDIR /opt/tritonserver
+# Copy the entire qa folder to /opt/tritonserver/qa
+COPY --from=tritonserver_build /workspace/qa qa
+COPY --chown=1000:1000 --from=sdk /workspace/install client_tmp
+RUN mkdir -p qa/clients && mkdir -p qa/pkgs && \
+    cp -a client_tmp/bin/* qa/clients/. && \
+    cp client_tmp/lib/libgrpcclient.so qa/clients/. && \
+    cp client_tmp/lib/libhttpclient.so qa/clients/. && \
+    cp client_tmp/python/*.py qa/clients/. && \
+    cp client_tmp/python/triton*.whl qa/pkgs/. && \
+    cp client_tmp/java/examples/*.jar qa/clients/. && \
+    rm -rf client_tmp
+# Create mount paths for lib
+RUN mkdir /mylib && mkdir /home/ubuntu
+
+ENV TRITON_PATH ${TRITON_PATH}
+ENV LD_LIBRARY_PATH /opt/tritonserver/qa/clients:${LD_LIBRARY_PATH}

inferentia/qa/setup_test_enviroment_and_test.sh

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
+#!/bin/bash
+# Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#  * Redistributions of source code must retain the above copyright
+#    notice, this list of conditions and the following disclaimer.
+#  * Redistributions in binary form must reproduce the above copyright
+#    notice, this list of conditions and the following disclaimer in the
+#    documentation and/or other materials provided with the distribution.
+#  * Neither the name of NVIDIA CORPORATION nor the names of its
+#    contributors may be used to endorse or promote products derived
+#    from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+export TRITON_PATH="/home/ubuntu"
+export DEFAULT_REPO_TAG="main"
+export TRITON_COMMON_REPO_TAG=${DEFAULT_REPO_TAG}
+export TRITON_CORE_REPO_TAG=${DEFAULT_REPO_TAG}
+export TRITON_BACKEND_REPO_TAG=${DEFAULT_REPO_TAG}
+export TRITON_THIRD_PARTY_REPO_TAG=${DEFAULT_REPO_TAG}
+export IDENTITY_BACKEND_REPO_TAG=${DEFAULT_REPO_TAG}
+export PYTHON_BACKEND_REPO_TAG=${DEFAULT_REPO_TAG}
+export CHECKSUM_REPOAGENT_REPO_TAG=${DEFAULT_REPO_TAG}
+export TRITON_SERVER_BRANCH_NAME=${TRITON_SERVER_BRANCH_NAME:=${DEFAULT_REPO_TAG}}
+export TRITON_CLIENT_REPO_TAG=${TRITON_CLIENT_REPO_TAG:=${DEFAULT_REPO_TAG}}
+export TRITON_VERSION="2.17.0dev"
+export TRITON_CONTAINER_VERSION="21.12dev"
+export TRITON_UPSTREAM_CONTAINER_VERSION="21.10"
+export BASE_IMAGE=tritonserver
+export SDK_IMAGE=tritonserver_sdk
+export BUILD_IMAGE=tritonserver_build
+export QA_IMAGE=tritonserver_qa
+
+cd ${TRITON_PATH}
+# Clone necessary branches
+rm -rf ${TRITON_PATH}/server
+git clone --single-branch --depth=1 -b ${TRITON_SERVER_BRANCH_NAME} \
+    https://github.com/triton-inference-server/server.git
+echo ${TRITON_VERSION} > server/TRITON_VERSION
+cd ${TRITON_PATH}/server
+git clone --single-branch --depth=1 -b ${TRITON_CLIENT_REPO_TAG} \
+    https://github.com/triton-inference-server/client.git clientrepo
+
+# First set up inferentia and run in detached mode
+cd ${TRITON_PATH}/python_backend
+chmod 777 ${TRITON_PATH}/python_backend/inferentia/scripts/setup-pre-container.sh
+sudo ${TRITON_PATH}/python_backend/inferentia/scripts/setup-pre-container.sh
+
+# Build container with only python backend
+cd ${TRITON_PATH}/server
+pip3 install docker
+./build.py --build-dir=/tmp/tritonbuild \
+    --cmake-dir=${TRITON_PATH}/server/build \
+    --version=${TRITON_VERSION} \
+    --container-version=${TRITON_CONTAINER_VERSION} \
+    --enable-logging --enable-stats --enable-tracing \
+    --enable-metrics --enable-gpu-metrics --enable-gpu \
+    --filesystem=gcs --filesystem=azure_storage --filesystem=s3 \
+    --endpoint=http --endpoint=grpc \
+    --repo-tag=common:${TRITON_COMMON_REPO_TAG} \
+    --repo-tag=core:${TRITON_CORE_REPO_TAG} \
+    --repo-tag=backend:${TRITON_BACKEND_REPO_TAG} \
+    --repo-tag=thirdparty:${TRITON_THIRD_PARTY_REPO_TAG} \
+    --backend=identity:${IDENTITY_BACKEND_REPO_TAG} \
+    --backend=python:${PYTHON_BACKEND_REPO_TAG} \
+    --repoagent=checksum:${CHECKSUM_REPOAGENT_REPO_TAG}
+docker tag tritonserver_build "${BUILD_IMAGE}"
+docker tag tritonserver "${BASE_IMAGE}"
+
+# Build docker container for SDK
+docker build -t ${SDK_IMAGE} \
+    -f ${TRITON_PATH}/server/Dockerfile.sdk \
+    --build-arg "TRITON_CLIENT_REPO_SUBDIR=clientrepo" .
+
+# Build QA container
+docker build -t ${QA_IMAGE} \
+    -f ${TRITON_PATH}/python_backend/inferentia/qa/Dockerfile.QA \
+    --build-arg "TRITON_PATH=${TRITON_PATH}" \
+    --build-arg "BASE_IMAGE=${BASE_IMAGE}" \
+    --build-arg "BUILD_IMAGE=${BUILD_IMAGE}" \
+    --build-arg "SDK_IMAGE=${SDK_IMAGE}" .
+
+export TEST_JSON_REPO=/opt/tritonserver/qa/common/inferentia_perf_analyzer_input_data_json
+export TEST_REPO=/opt/tritonserver/qa/L0_inferentia_perf_analyzer
+export TEST_SCRIPT="test.sh"
+
+# Run single instance test
+CONTAINER_NAME="qa_container"
+docker stop ${CONTAINER_NAME} && docker rm ${CONTAINER_NAME}
+docker create --name ${CONTAINER_NAME} \
+    --device /dev/neuron0 \
+    --device /dev/neuron1 \
+    --shm-size=1g --ulimit memlock=-1 \
+    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
+    --ulimit stack=67108864 \
+    -e TEST_REPO=${TEST_REPO} \
+    -e TEST_JSON_REPO=${TEST_JSON_REPO} \
+    -e TRITON_PATH=${TRITON_PATH} \
+    --net host -ti ${QA_IMAGE} \
+    /bin/bash -c "bash -ex ${TEST_REPO}/${TEST_SCRIPT}" && \
+docker cp /lib/udev ${CONTAINER_NAME}:/mylib/udev && \
+docker cp /home/ubuntu/python_backend ${CONTAINER_NAME}:${TRITON_PATH}/python_backend && \
+docker start -a ${CONTAINER_NAME} || RV=$?;
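The script pins `TRITON_PATH` to `/home/ubuntu` but honors `TRITON_SERVER_BRANCH_NAME` and `TRITON_CLIENT_REPO_TAG` if they are already exported (see the `:=` defaults above), so a custom-branch run looks like:

```
# Test against a custom server branch and a pinned client tag
export TRITON_SERVER_BRANCH_NAME=<your branch name>
export TRITON_CLIENT_REPO_TAG=main
source /home/ubuntu/python_backend/inferentia/qa/setup_test_enviroment_and_test.sh
```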

inferentia/scripts/setup-pre-container.sh

Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
-#/bin/bash
+#!/bin/bash
 # Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
@@ -28,16 +28,16 @@ cd /home/ubuntu

 # First stop and remove old neuron 1.X runtime
 sudo systemctl stop neuron-rtd
-sudo apt remove aws-neuron-runtime
+sudo apt remove aws-neuron-runtime -y

 # Then install new neuron libraries
 . /etc/os-release
 sudo tee /etc/apt/sources.list.d/neuron.list > /dev/null <<EOF
 deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main
 EOF
 sudo wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | apt-key add -
-sudo apt-get update
-sudo apt-get install -y \
+sudo apt-get update && \
+sudo apt-get install -y \
     linux-headers-$(uname -r) \
     aws-neuron-dkms \
     aws-neuron-tools
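A quick way to confirm the install succeeded (assuming the default install location of `aws-neuron-tools`) is to list the visible Neuron devices:

```
# Verify the driver loaded and the devices are visible
/opt/aws/neuron/bin/neuron-ls
```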

inferentia/scripts/setup.sh

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ Sets up python execution environment for AWS Neuron SDK for execution on Inferen
 OPTS=$(getopt -o hb:v:i:tp --long help,python-backend-path:,python-version:,inferentia-path:,use-tensorflow,use-pytorch -- "$@")


-export INFRENTIA_PATH="/home/ubuntu"
+export INFRENTIA_PATH=${TRITON_PATH:="/home/ubuntu"}
 export PYTHON_BACKEND_PATH="/home/ubuntu/python_backend"
 export PYTHON_VERSION=3.7
 export USE_PYTORCH=0
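With this change, `INFRENTIA_PATH` falls back to `/home/ubuntu` only when `TRITON_PATH` is unset, so a non-default location can be chosen up front (a sketch; the path is arbitrary, and `PYTHON_BACKEND_PATH` can likewise be overridden via the script's `-b`/`--python-backend-path` option):

```
# Point the setup at a custom Triton path before sourcing
export TRITON_PATH=/opt/triton
source ${TRITON_PATH}/python_backend/inferentia/scripts/setup.sh
```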
