Hi everyone,
How it started
I'm trying to get some traction on AI inference use cases on Photon OS.
An idea was to build a super easy .ova that provides the capability to stream, e.g. from a file or a live iPhone camera source, and to watch the visualized inference video stream on a web URL. This could include vehicle and people detection and tracking.
How it goes
The deepstream-app tool from the inference server container, an NVidia product assembled from open-source components called Triton, does not start successfully on Photon OS, and I'm not convinced that it is a Photon OS issue.
Clarification from the community is most welcome.
Here is the issue description.
Starting deepstream-app -c deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt ends with
**** INFO: <bus_callback:194>: Pipeline ready**
**Error String : Feature not supported on this GPUError Code : 801**
**ERROR from nvv4l2decoder0: Failed to process frame.**
Debug info: gstv4l2videodec.c(1747): gst_v4l2_video_dec_handle_frame (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/nvv4l2decoder:nvv4l2decoder0:
**Maybe be due to not enough memory or failing driver**
**ERROR from qtdemux0: Internal data stream error.**
Debug info: qtdemux.c(6605): gst_qtdemux_loop (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstBin:src_sub_bin0/GstURIDecodeBin:src_elem/GstDecodeBin:decodebin0/GstQTDemux:qtdemux0:
**streaming stopped, reason error (-5)**
Quitting
[NvMultiObjectTracker] De-initialized
App run failed
Are the errors "Error String : Feature not supported on this GPUError Code : 801" and "ERROR from nvv4l2decoder0: Failed to process frame" related to Photon OS (hardware + headless)?
Here's the setup description.
Step 1: Hardware / Software
Step 2: Base provisioning (one reboot)
# repo url
if ! grep -q "packages.vmware.com/photon" /etc/yum.repos.d/photon.repo; then
cd /etc/yum.repos.d/
sudo sed -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo
fi
# update components with impact to nvidia components
tdnf update -y docker
# install kernel api headers and devel
tdnf install -y build-essential wget tar
# On vSphere comment the following line
tdnf install -y linux-devel
# On vSphere uncomment the following line
# tdnf install -y linux-esx-devel
reboot
wget https://us.download.nvidia.com/tesla/470.141.03/NVIDIA-Linux-x86_64-470.141.03.run
chmod a+x ./NVIDIA-Linux-x86_64-470.141.03.run
./NVIDIA-Linux-x86_64-470.141.03.run --kernel-source-path=/usr/lib/modules/`uname -r`/build --ui=none --no-questions --accept-license
# Check GPU driver
root@NVidia01 [ / ]# nvidia-smi
Wed Aug 24 07:02:55 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03 Driver Version: 470.141.03 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla M60 Off | 00003130:00:00.0 Off | Off |
| N/A 34C P0 36W / 150W | 0MiB / 8129MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Step 3: install NVidia Container toolkit to run NVidia docker container on Photon OS
tdnf install -y gpg
cd /etc/pki/rpm-gpg/
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /etc/pki/rpm-gpg/nvidia-container-toolkit-keyring.gpg
cat << EOF >>/etc/yum.repos.d/nvidia-container-toolkit.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/centos7/x86_64
gpgcheck=0
enabled=1
EOF
tdnf makecache
tdnf install -y nvidia-container-toolkit
systemctl restart docker
rm /etc/yum.repos.d/nvidia-container-toolkit.repo
Step 4: Open some ports
iptables -A INPUT -i eth0 -p udp --dport 5400 -j ACCEPT
iptables -A OUTPUT -p udp --dport 5400 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 8000 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8000 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 8001 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8001 -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --dport 8002 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8002 -j ACCEPT
iptables-save >/etc/systemd/scripts/ip4save
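The repeated rules above can also be generated with a small helper that prints the iptables commands as a dry run (a sketch; `emit_rules` is a hypothetical helper, ports and interface as above):

```shell
#!/bin/sh
# Print the iptables commands for the Triton/DeepStream ports instead of
# applying them directly; pipe the output to sh to apply.
# 5400/udp is the RTSP UDP port, 8000-8002/tcp are the Triton HTTP/gRPC/metrics ports.
emit_rules() {
    for rule in "udp 5400" "tcp 8000" "tcp 8001" "tcp 8002"; do
        set -- $rule
        echo "iptables -A INPUT -i eth0 -p $1 --dport $2 -j ACCEPT"
        echo "iptables -A OUTPUT -p $1 --dport $2 -j ACCEPT"
    done
}
emit_rules
```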
Step 5: Pull docker container and start it
# with --net=host the container shares the host network stack, so no -p port mappings are needed
docker run --gpus all -it --rm --net=host nvcr.io/nvidia/deepstream:6.1-triton
Step 6: Inside docker container: Download configuration files and models
git clone https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps.git
cd /opt/nvidia/deepstream/deepstream-6.1/deepstream_reference_apps/deepstream_app_tao_configs
cp -a * /opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models/
apt-get install -y wget zip
cd /opt/nvidia/deepstream/deepstream-6.1/samples/configs/tao_pretrained_models/
./download_models.sh
Step 7: Configure file deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt
In deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt the group [sink0] has been deactivated (enable=0), and [sink2] has been activated (enable=1). Additionally, in [sink2] codec=2 (h265) has been set because [source0] uses uri=file://../../streams/sample_1080p_h265.mp4.
Step 8: Start the deepstream app
deepstream-app -c deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt
With help from the NVidia user forum it now works on Photon OS.
https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new lists each GPU with its capabilities. To use the hardware decoding of the M60 GPU, a stream or a file in h264 format must be provided; h265 is not supported.
Hence, in Step 7 the sample file deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt needs the following modifications.
[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
num-sources=1
uri=file://../../streams/sample_1080p_h264.mp4
gpu-id=0
[sink2] # renamed from sink0
enable=0
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=1
source-id=0
gpu-id=0
[sink0] # renamed from sink2
enable=1
#Type - 1=FakeSink 2=EglSink 3=File 4=RTSPStreaming 5=Overlay
type=4
#1=h264 2=h265
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=4000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
# set below properties in case of RTSPStreaming
rtsp-port=8554
udp-port=5400
[tests]
file-loop=1
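The key settings can be sanity-checked before starting the app; here is a minimal shell sketch (the config filename is the one used above; `check_sink0` is a hypothetical helper and expects the group header to be exactly "[sink0]" on its own line):

```shell
#!/bin/sh
# Verify that [sink0] is configured as an RTSP sink (type=4) with the
# M60-compatible h264 codec (codec=1) before launching deepstream-app.
check_sink0() {
    # extract only the lines of the [sink0] group (up to the next group header)
    group=$(awk '/^\[/{insec=($0=="[sink0]")} insec' "$1")
    echo "$group" | grep -q '^type=4' && echo "$group" | grep -q '^codec=1'
}
```

Usage: `check_sink0 deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt && echo "sink0 OK"`.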
As soon as objects or people are detected, they receive a tracking ID; this is what the deepstream app provides. In addition, one may want to further process such events, for example to connect them to a monitoring system. For this, deepstream supports several IoT functions: the so-called Kafka protocol adapter allows bidirectional messaging, and recording triggered by an anomaly is supported as well.
Happy inferencing and tanzu'ifying.
Beside Triton, the NVidia tao toolkit might be interesting as well. Here is some provisioning info.
Step 1: packages installation
Execute the following commands.
tdnf install -y wget unzip python3-pip
pip3 install virtualenvwrapper
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/bin/virtualenvwrapper.sh
mkvirtualenv -p /usr/bin/python3 launcher
pip3 install jupyterlab
pip3 install nvidia-tao
Install the NVidia GPU Cloud (NGC) CLI as well.
# NGC installation with md5 check output
wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc
find ngc-cli/ -type f -exec md5sum {} + | LC_ALL=C sort | md5sum -c ngc-cli.md5
echo "export PATH=\"\$PATH:$(pwd)/ngc-cli\"" >> ~/.bash_profile && source ~/.bash_profile
Step 2: download the tao samples
Now we download the tao samples with pretrained ML models.
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/cv_samples/versions/v1.2.0/zip -O cv_samples_v1.2.0.zip
unzip -u cv_samples_v1.2.0.zip -d ./cv_samples_v1.2.0
mkdir -p ./cv_samples_v1.2.0/detectnet_v2/data
cd ./cv_samples_v1.2.0
For later use, we create two directories as well.
mkdir -p /workspace/tao-experiments/detectnet_v2
mkdir -p /workspace/tao-experiments/data/training
The tao samples contain jupyter notebook files which need object images and label data. The two required files can be obtained from
http://www.cvlibs.net/download.php?file=data_object_image_2.zip (zip file size 11.7GB)
and
http://www.cvlibs.net/download.php?file=data_object_label_2.zip (zip file size 5.5MB).
Simply copy the zip files, e.g. via WinSCP, to the Photon OS VM into the directory ./cv_samples_v1.2.0/detectnet_v2/data .
Step 3: docker login
docker login nvcr.io
If the docker daemon is not started, execute systemctl start docker.
The nvcr.io login is used in a jupyter notebook.
Step 4: jupyter notebook configuration
The jupyter notebook will be published on port 8888. Hence we have to open that port.
iptables -A INPUT -i eth0 -p tcp --dport 8888 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 8888 -j ACCEPT
iptables-save >/etc/systemd/scripts/ip4save
Start the jupyter notebook.
cd ./cv_samples_v1.2.0/
/root/.virtualenvs/launcher/bin/jupyter notebook --allow-root --ip 0.0.0.0 --no-browser
On the screen you will get a similar output as below.
[W 2022-08-16 14:57:33.899 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-08-16 14:57:33.906 LabApp] 'allow_root' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-08-16 14:57:33.911 LabApp] 'allow_root' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2022-08-16 14:57:33.925 LabApp] JupyterLab extension loaded from /root/.virtualenvs/launcher/lib/python3.10/site-packages/jupyterlab
[I 2022-08-16 14:57:33.929 LabApp] JupyterLab application directory is /root/.virtualenvs/launcher/share/jupyter/lab
[I 14:57:33.937 NotebookApp] Serving notebooks from local directory: /
[I 14:57:33.939 NotebookApp] Jupyter Notebook 6.4.12 is running at:
[I 14:57:33.942 NotebookApp] http://ph01:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
[I 14:57:33.945 NotebookApp] or http://127.0.0.1:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
[I 14:57:33.948 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
To access the notebook, open this file in a browser:
file:///root/.local/share/jupyter/runtime/nbserver-873-open.html
Or copy and paste one of these URLs:
http://ph01:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
or http://127.0.0.1:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
Open a web browser and insert the IP and token, for example:
http://20.208.40.128:8888/?token=1471fe6c3147473cbb94f59b8071e1caccf80a655aeca50b
Browse to the directory cv_samples_v1.2.0 > detectnet_v2 .
Open the jupyter notebook file detectnet_v2.ipynb.
Edit the section with the environment variables for the LOCAL_PROJECT_DIR path.
Because the download URL for the images.zip and labels.zip is missing, the jupyter notebook shows an error. However, the .zip files were already copied beforehand, so the error can be ignored. Unpacking the zip files takes a while.
You can step through the sections now and adapt the jupyter notebook for your own purposes. Using the pretrained data, here are the class assignments and counts:
b'car': 4129
b'dontcare': 1574
b'truck': 145
b'cyclist': 226
b'misc': 118
b'pedestrian': 638
b'tram': 67
b'van': 377
b'person_sitting': 23
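The counts above can be reproduced directly from the KITTI label files with a short shell sketch (assuming the KITTI convention that the first whitespace-separated field of each label line is the class name; the example path is hypothetical):

```shell
#!/bin/sh
# Count object classes across KITTI-format label files in a directory:
# the first field of each line is the class name, lowercased for grouping.
count_classes() {
    awk '{ print tolower($1) }' "$1"/*.txt | sort | uniq -c | sort -rn
}
# Example (hypothetical path):
# count_classes /workspace/tao-experiments/data/training/label_2
```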
The dev setup with root privileges is not intended to be used outside a lab environment. So far, all the NVidia Triton and tao material works flawlessly on Photon OS.