VMware Modern Apps Community
DCasota (Expert)

NVidia Triton on Photon OS

Hi everyone,

 

How it started
I'm trying to get some traction on AI inference use cases on Photon OS.
One idea was to bake a super easy .ova that provides the capability to stream e.g. from a file or to livestream an iPhone camera source, and to view the visualized inference video stream on a web URL. This could include vehicle and people detection and tracking.

 

How it goes
The deepstream tool of the inference server Triton, an NVidia product assembled from open-source components, does not start successfully on Photon OS, and I'm not convinced that it is a Photon OS issue.

Clarification from the community would be very welcome.

 

Here is the issue description.

 

Starting deepstream-app -c deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt ends with

INFO: <bus_callback:194>: Pipeline ready
Error String : Feature not supported on this GPU
Error Code : 801
ERROR from nvv4l2decoder0: Failed to process frame.
Debug info: gstv4l2videodec.c(1747): gst_v4l2_video_dec_handle_frame (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/nvv4l2decoder:nvv4l2decoder0:
Maybe be due to not enough memory or failing driver
ERROR from qtdemux0: Internal data stream error.
Debug info: qtdemux.c(6605): gst_qtdemux_loop (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/GstQTDemux:qtdemux0:
streaming stopped, reason error (-5)
Quitting
[NvMultiObjectTracker] De-initialized
App run failed

 

Are the errors "Error String : Feature not supported on this GPU, Error Code : 801" and "ERROR from nvv4l2decoder0: Failed to process frame" related to Photon OS (hardware + headless)?

 

Here's the setup description.

 

Step 1: Hardware / Software

  • Don't blame me, it is an Azure Standard_NV6 virtual machine (6 vCPUs, 56 GB RAM, with 1x NVidia Tesla M60 GPU) because I cannot yet afford a new on-premises vSphere 8 homelab. Photon OS runs best on vSphere, no doubt.
  • 64 GB disk space. The docker container consumes a lot of disk space.
  • VMware Photon OS 3 rev 2 as operating system. Photon OS runs headless (no X window); the system name is NVidia01.
  • Installed the NVidia Container Toolkit and pulled the deepstream:6.1-triton docker container (see Step 3 and following).

 

Step 2: Base provisioning (one reboot)
# repo url: point the package repos to packages.vmware.com if they still reference dl.bintray.com
if ! grep -q "packages.vmware.com/photon" /etc/yum.repos.d/photon.repo; then
  cd /etc/yum.repos.d/
  sudo sed -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo
fi

# update components with impact on the nvidia components
tdnf update -y docker

# install kernel api headers and devel packages
tdnf install -y build-essential wget tar
# On vSphere, comment the following line ...
tdnf install -y linux-devel
# ... and uncomment this one instead
# tdnf install -y linux-esx-devel

reboot

# after the reboot: download and install the Tesla driver against the running kernel
wget https://us.download.nvidia.com/tesla/470.141.03/NVIDIA-Linux-x86_64-470.141.03.run
chmod a+x ./NVIDIA-Linux-x86_64-470.141.03.run
./NVIDIA-Linux-x86_64-470.141.03.run --kernel-source-path=/usr/lib/modules/$(uname -r)/build --ui=none --no-questions --accept-license

# Check GPU driver
root@NVidia01 [ / ]# nvidia-smi
Wed Aug 24 07:02:55 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla M60           Off  | 00003130:00:00.0 Off |                  Off |
| N/A   34C    P0    36W / 150W |      0MiB /  8129MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
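For scripted checks, nvidia-smi can also emit machine-readable output (a small sketch using standard nvidia-smi query options):

# print gpu name, driver version and total memory as CSV, e.g. for provisioning scripts
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader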

 

Step 3: install NVidia Container toolkit to run NVidia docker container on Photon OS
tdnf install -y gpg
cd /etc/pki/rpm-gpg/
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /etc/pki/rpm-gpg/nvidia-container-toolkit-keyring.gpg

# temporary repo definition for the NVidia container toolkit (the centos7 packages work here)
cat << EOF >>/etc/yum.repos.d/nvidia-container-toolkit.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/centos7/x86_64
gpgcheck=0
enabled=1
EOF

tdnf makecache
tdnf install -y nvidia-container-toolkit

# restart docker so it picks up the nvidia runtime
systemctl restart docker

# remove the temporary repo definition again
rm /etc/yum.repos.d/nvidia-container-toolkit.repo
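Optionally, verify that containers can see the GPU before pulling the large deepstream image (a quick sanity check; assumes the public nvidia/cuda base image tag is still available on Docker Hub):

docker run --rm --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi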

 

Step 4: Open some ports
# open the ports for the RTSP/UDP stream (5400) and the Triton HTTP, gRPC and metrics endpoints (8000-8002)
iptables -A INPUT -i eth0 -p udp --dport 5400 -j ACCEPT
iptables -A OUTPUT -p udp --dport 5400 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 8000 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8000 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 8001 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8001 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 8002 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8002 -j ACCEPT
# persist the rules across reboots (Photon OS loads this file at boot)
iptables-save >/etc/systemd/scripts/ip4save

 

Step 5: Pull the docker container and start it
With --net=host the container shares the host's network stack, so no -p port mappings are needed (docker would treat options placed after the image name as container arguments anyway).

docker run --gpus all -it --rm --net=host nvcr.io/nvidia/deepstream:6.1-triton
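If you prefer docker's default bridge networking instead of --net=host, a sketch with explicit port mappings (matching the ports opened in Step 4):

docker run --gpus all -it --rm \
  -p 8000:8000/tcp -p 8001:8001/tcp -p 8002:8002/tcp -p 5400:5400/udp \
  nvcr.io/nvidia/deepstream:6.1-triton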

 

Step 6: Inside the docker container: download configuration files and models
# the cd paths below assume the repo is cloned into /opt/nvidia/deepstream/deepstream-6.1
cd /opt/nvidia/deepstream/deepstream-6.1
git clone https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps.git
cd /opt/nvidia/deepstream/deepstream-6.1/deepstream_reference_apps/deepstream_app_tao_configs
cp -a * /opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models/
apt-get install -y wget zip
cd /opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models/
./download_models.sh

 

Step 7: Configure the file deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt
In deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt the group [sink0] has been deactivated (enable=0), and [sink2] has been activated (enable=1). Additionally, in [sink2], codec=2 (2=h265) has been set to match [source0] uri=file://../../streams/sample_1080p_h265.mp4, as sketched below.
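A minimal sketch of the two sink groups after this change (assuming the stock sample layout, in which [sink2] is the RTSP streaming sink):

[sink0]
enable=0
# remaining properties unchanged

[sink2]
enable=1
#1=h264 2=h265; h265 matches the h265 sample source
codec=2
# remaining properties unchanged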

 

Step 8: Start the deepstream app
deepstream-app -c deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt

 

2 Replies
DCasota (Expert) · Accepted solution

With help from the NVidia user forum, it now works on Photon OS.

https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new lists each GPU with its codec capabilities. To use the hardware decoder of the M60 GPU, the stream or file must be in h264 format; h265 decoding is not supported on this GPU, which explains the error "Feature not supported on this GPU" (code 801).
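If only h265 material is at hand, it can be transcoded to h264 beforehand (a sketch; assumes ffmpeg with libx264 is installed, and the file names are placeholders):

# re-encode the video track to h264, copy the audio track unchanged
ffmpeg -i input_h265.mp4 -c:v libx264 -c:a copy output_h264.mp4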

 

Hence, in Step 7 the sample file deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt needs modifications.

 

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
num-sources=1
uri=file://../../streams/sample_1080p_h264.mp4
gpu-id=0

[sink2] # renamed from sink0
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=0

[sink0] # renamed from sink2
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=4000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400

[tests]
file-loop=1
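With the RTSP sink enabled, the annotated stream can be watched from a remote machine. deepstream-app prints the exact URL at startup; /ds-test is the usual default mount point (a sketch, assuming ffplay is available on the client and rtsp-port=8554 as configured above):

# on the host: open the RTSP control port, analogous to Step 4
iptables -A INPUT -i eth0 -p tcp --dport 8554 -j ACCEPT
iptables-save >/etc/systemd/scripts/ip4save

# on a remote machine (replace <vm-ip> with the VM's public IP)
ffplay rtsp://<vm-ip>:8554/ds-test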

 

As soon as vehicles or people are detected, they receive a tracking ID; that is what the deepstream app does. Beyond that, one may want to process such events further, for example to feed them into a monitoring system. For this, deepstream supports several IoT functions: the so-called Kafka protocol adapter allows bidirectional messaging, and recording triggered by an anomaly is supported as well, as sketched below.
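As an illustration, such a message broker can be attached as an additional sink group in the same config file (a sketch only; the connection string and topic are placeholders, and the library path assumes the DeepStream 6.1 container layout):

[sink3]
enable=1
# Type 6 = message broker: sends detection metadata to Kafka instead of video
type=6
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_kafka_proto.so
# placeholder connection string, format: <kafka-host>;<port>;<topic>
msg-broker-conn-str=kafka-host;9092;ds-events
topic=ds-events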

 

Happy inferencing and tanzu'ifying.

DCasota (Expert)

Besides Triton, the NVidia tao toolkit might be interesting as well. Here is some provisioning info.

 

Step 1: Packages installation

Execute the following commands.

tdnf install -y wget unzip python3-pip
pip3 install virtualenvwrapper
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/bin/virtualenvwrapper.sh
# create and activate a dedicated virtualenv for the tao launcher
mkvirtualenv -p /usr/bin/python3 launcher
pip3 install jupyterlab
pip3 install nvidia-tao

 

Install the NVidia GPU Cloud (NGC) cli as well.

# NGC installation with md5 check output
wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc
find ngc-cli/ -type f -exec md5sum {} + | LC_ALL=C sort | md5sum -c ngc-cli.md5
echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile

 

Step 2: Download the tao samples

Now we download the tao samples with pretrained ML models.

wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/cv_samples/versions/v1.2.0/zip -O cv_samples_v1.2.0.zip
unzip -u cv_samples_v1.2.0.zip -d ./cv_samples_v1.2.0
cd ./cv_samples_v1.2.0
# create the data directory inside the unpacked samples (we are already in cv_samples_v1.2.0)
mkdir -p ./detectnet_v2/data

For later use, we create two directories as well.

mkdir -p /workspace/tao-experiments/detectnet_v2
mkdir -p /workspace/tao-experiments/data/training

 

The tao samples contain jupyter notebook files which need data object images and label data. The two required files can be obtained from cvlibs.net:

http://www.cvlibs.net/download.php?file=data_object_image_2.zip (zip file size: 11.7 GB)
http://www.cvlibs.net/download.php?file=data_object_label_2.zip (zip file size: 5.5 MB)

Simply copy the zip files, e.g. via WinSCP, to the Photon OS VM into the directory ./cv_samples_v1.2.0/detectnet_v2/data .
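From a Linux or macOS workstation, scp works as well (a sketch; adjust the user, <vm-ip> and the target path to wherever you unpacked the samples):

scp data_object_image_2.zip data_object_label_2.zip root@<vm-ip>:~/cv_samples_v1.2.0/detectnet_v2/data/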

 

Step 3: Docker login

docker login nvcr.io

If the docker daemon is not started, execute systemctl start docker. For nvcr.io, the username is $oauthtoken and the password is your NGC API key.

The registry login is used later in a jupyter notebook.

Step 4: Jupyter notebook configuration

The jupyter notebook will be published on port 8888. Hence we have to open that port.

iptables -A INPUT -i eth0 -p tcp --dport 8888 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8888 -j ACCEPT
iptables-save >/etc/systemd/scripts/ip4save

 

Start the jupyter notebook.

cd ./cv_samples_v1.2.0/
/root/.virtualenvs/launcher/bin/jupyter notebook --allow-root --ip 0.0.0.0 --no-browser

On the screen you will get output similar to the one below.

[W 2022-08-16 14:57:33.899 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-08-16 14:57:33.906 LabApp] 'allow_root' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-08-16 14:57:33.911 LabApp] 'allow_root' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2022-08-16 14:57:33.925 LabApp] JupyterLab extension loaded from /root/.virtualenvs/launcher/lib/python3.10/site-packages/jupyterlab
[I 2022-08-16 14:57:33.929 LabApp] JupyterLab application directory is /root/.virtualenvs/launcher/share/jupyter/lab
[I 14:57:33.937 NotebookApp] Serving notebooks from local directory: /
[I 14:57:33.939 NotebookApp] Jupyter Notebook 6.4.12 is running at:
[I 14:57:33.942 NotebookApp] http://ph01:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
[I 14:57:33.945 NotebookApp]  or http://127.0.0.1:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
[I 14:57:33.948 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

    To access the notebook, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/nbserver-873-open.html
    Or copy and paste one of these URLs:
        http://ph01:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
     or http://127.0.0.1:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

 

Open a web browser and insert the VM's public IP and the token, for example:

http://20.208.40.128:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b

 

Browse to the directory cv_samples_v1.2.0 > detectnet_v2 and open the jupyter notebook file detectnet_v2.ipynb.

Edit the section with the environment variables and set the LOCAL_PROJECT_DIR path.

Because the download URL for images.zip and labels.zip is missing, the jupyter notebook shows an error at that step. However, since the .zip files were already copied beforehand, the error can be ignored. Unpacking the zip files takes a while.

You can now step through the notebook section by section and adapt it for your own purposes. Using the pretrained data, here are the label classes and the number of annotated objects:

b'car': 4129
b'dontcare': 1574
b'truck': 145
b'cyclist': 226
b'misc': 118
b'pedestrian': 638
b'tram': 67
b'van': 377
b'person_sitting': 23

 

The dev setup with root privileges is not intended to be used outside a lab environment. So far, all the NVidia Triton and tao material works flawlessly on Photon OS.

0 Kudos