MicroShift on a Jetson Nano with Ubuntu 20.04
Introduction
MicroShift is a research project that is exploring how the OpenShift/OKD Kubernetes distribution can be optimized for small form factor devices and edge computing. In Parts 2 and 3 of this series, we set up the Jetson Nano with Ubuntu 18.04 and built and deployed MicroShift on a Jetson Nano Developer Kit. In Parts 4, 5 and 6 we worked with MicroShift on a Raspberry Pi 4. In this Part 7, we switch back to the Jetson Nano. Specifically, we will deploy MicroShift on Ubuntu 20.04 on the Jetson Nano. The Jetson Software Roadmap shows that JetPack 5.0 Developer Preview, with Ubuntu 20.04, is planned for 1Q-2022. Meanwhile, we can follow the instructions from Q-engineering for the standard release-upgrade mechanism, or download Q-engineering's complete pre-built Jetson Nano image (10.3 GB) with Ubuntu 20.04, OpenCV, TensorFlow and PyTorch.
Setting up the Jetson Nano with Ubuntu 20.04 (64 bit)
For this blog, we download the image from the Q-engineering GitHub site and write it to a microSDXC card.
- Flash the image using balenaEtcher or the Raspberry Pi Imager
- Connect a keyboard, monitor and Ethernet cable to the Jetson Nano
- Insert the microSDXC card into the Jetson Nano and power it on
- Log in with jetson as both the user and the password
- Get the Jetson Nano's IP address so that we can ssh to it from the laptop
ip a
Let’s install the latest updates, set the hostname with an FQDN, configure the timezone, and set the locale to en_US.
ssh jetson@jetsonnano-ipaddress
sudo su -
apt-get update
apt-get -y upgrade
hostnamectl set-hostname nano.example.com
dpkg-reconfigure tzdata
sed -i "s/nl_NL/en_US/g" /etc/default/locale
#locale-gen "en_US.UTF-8"
locale-gen en_US en_US.UTF-8
dpkg-reconfigure locales
The default image is sized for a 32GB card. If you installed to a larger microSDXC card, fix the partition; when parted warns that not all of the space is being used and asks to fix the GPT, answer F (Fix):
parted -l
F
apt-get install -y cloud-guest-utils
growpart /dev/mmcblk0 1
resize2fs /dev/mmcblk0p1
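To confirm that the root filesystem now spans the full card, check the reported size (a quick sanity check):
df -h / # the Size column should now show the full capacity of the card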
Check the JetPack release: /etc/nv_tegra_release shows release R32, revision 6.1, which gives L4T 32.6.1. The latest JetPack 4.6 includes L4T 32.6.1. The BOARD parameter t210ref indicates a Jetson Nano Developer Kit; the platform is thus t210 for NVIDIA® Jetson Nano™ devices.
root@nano:~# cat /etc/nv_tegra_release
# R32 (release), REVISION: 6.1, GCID: 27863751, BOARD: t210ref, EABI: aarch64, DATE: Mon Jul 26 19:20:30 UTC 2021
Let’s remove the ubuntu-desktop packages to free additional memory:
cat << EOF > removedesktop.sh
sudo apt-get -y purge ubuntu-desktop
sudo apt-get -y purge unity gnome-shell lightdm
sudo apt-get -y remove ubuntu-desktop
sudo apt purge ubuntu-desktop -y && sudo apt autoremove -y && sudo apt autoclean
sudo apt-get -y clean
sudo apt-get -y autoremove
sudo apt-get -f install
sudo reboot
EOF
chmod +x removedesktop.sh
./removedesktop.sh # System will reboot
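Once the system is back up, you can verify that the desktop removal freed memory (exact numbers will vary):
free -h # compare the used memory with the value before the purge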
Testing the Jupyter Lab container in Docker
After the reboot, ssh to the Jetson Nano with jetson/jetson and install nvidia-docker2 to avoid the “error adding seccomp filter rule for syscall clone3: permission denied: unknown”
sudo su -
apt-get install -y curl
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey|sudo apt-key add -
apt-get update
apt-get install -y nvidia-docker2 # This will also install docker.io
systemctl restart docker
Try out the container for the Deep Learning Institute (DLI) course "Getting Started with AI on Jetson Nano" with a USB camera attached.
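Before starting the container, you can optionally confirm that the USB camera is visible on the host (v4l2-ctl comes from the v4l-utils package):
apt-get install -y v4l-utils
v4l2-ctl --list-devices # the USB camera should appear with a /dev/video0 node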
docker run --runtime nvidia -it --rm --network host --volume ~/nvdli-data:/nvdli-nano/data --device /dev/video0 nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.6.1
Connect to your Jetson Nano's IP address from your laptop with the URL shown and log in with the password dlinano: http://jetsonnano-ipaddress:8888/lab?
We can run the notebook /hello_camera/usb_camera.ipynb to test the camera. After testing, release the camera resource, shut down the kernel, and exit the container.
Output
root@nano:~# docker run --runtime nvidia -it --rm --network host --volume ~/nvdli-data:/nvdli-nano/data --device /dev/video0 nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.6.1
allow 10 sec for JupyterLab to start @ http://192.168.1.208:8888 (password dlinano)
JupterLab logging location: /var/log/jupyter.log (inside the container)
root@nano:/nvdli-nano# exit
exit
Additional details about attaching a fan and using a CSI-2 IMX219 camera were provided in Part 2. In this blog, we will use a USB Camera.
Installing MicroShift
Clone the MicroShift GitHub repo and run the install.sh script:
git clone https://github.com/thinkahead/microshift.git
cd microshift
./install.sh
You will get the error:
Error: COMMAND_FAILED: '/usr/sbin/ip6tables-restore -w -n' failed: ip6tables-restore v1.8.4 (legacy): Couldn't load match `rpfilter':No such file or directory
To fix this, set IPv6_rpfilter=no in /etc/firewalld/firewalld.conf:
sed -i "s|^IPv6_rpfilter=yes|IPv6_rpfilter=no|" /etc/firewalld/firewalld.conf
systemctl restart firewalld
Add the ssh port 22 so that we are not locked out:
firewall-cmd --zone=public --permanent --add-port=22/tcp
firewall-cmd --reload
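Verify that the port is now open and that the rules reloaded without the rpfilter error:
firewall-cmd --zone=public --list-ports # should include 22/tcp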
NVML (and therefore nvidia-smi) is not currently supported on Jetson. The k8s-device-plugin does not work with Jetson and the nvidia/k8s-device-plugin container image is not available for arm64. So, let’s set up CRI-O to directly use the NVIDIA container runtime hook.
mkdir -p /usr/share/containers/oci/hooks.d/
cat << EOF > /usr/share/containers/oci/hooks.d/nvidia.json
{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/bin/nvidia-container-runtime-hook",
        "args": ["nvidia-container-runtime-hook", "prestart"],
        "env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
        ]
    },
    "when": {
        "always": true,
        "commands": [".*"]
    },
    "stages": ["prestart"]
}
EOF
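The hook relies on the nvidia-container-runtime-hook binary installed by nvidia-docker2; a quick sanity check that it exists at the path referenced above:
ls -l /usr/bin/nvidia-container-runtime-hook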
The metacopy=on mount option is not supported here. Update /etc/containers/storage.conf and restart CRI-O.
#mountopt = "nodev,metacopy=on"
sed -i "s/,metacopy=on//" /etc/containers/storage.conf
systemctl restart crio
./install.sh # run install.sh again after the fixes above
Install the oc client. We can download the required version of the oc client for arm64:
wget https://mirror.openshift.com/pub/openshift-v4/arm64/clients/ocp/candidate/openshift-client-linux.tar.gz
mkdir tmp;cd tmp
tar -zxvf ../openshift-client-linux.tar.gz
mv -f oc /usr/local/bin
cd ..;rm -rf tmp
rm -f openshift-client-linux.tar.gz
It will take around 3 minutes for all pods to start. Check the status of the node and pods using the kubectl or oc client.
export KUBECONFIG=/var/lib/microshift/resources/kubeadmin/kubeconfig
watch "kubectl get nodes;kubectl get pods -A;crictl pods;crictl images"
#watch "oc get nodes;oc get pods -A;crictl pods;crictl images"
That completes the installation of MicroShift. You can skip the next section if you want to use this installation of MicroShift. If you want to build your own microshift binary, let's push on.
Build the MicroShift binary for arm64 on Ubuntu 20.04 (64 bit)
We can replace the microshift binary that was downloaded by the install.sh script with our own. Let’s build the microshift binary from scratch: clone the microshift repository from GitHub, install golang, run make, and finally move the microshift binary to /usr/local/bin.
sudo su -
apt -y install build-essential curl libgpgme-dev pkg-config libseccomp-dev
# Install golang
wget https://golang.org/dl/go1.17.2.linux-arm64.tar.gz
rm -rf /usr/local/go && tar -C /usr/local -xzf go1.17.2.linux-arm64.tar.gz
rm -f go1.17.2.linux-arm64.tar.gz
export PATH=$PATH:/usr/local/go/bin
export GOPATH=/root/go
# Quote 'EOF' so that $PATH is expanded at login time, not while writing the file
cat << 'EOF' >> /root/.bashrc
export PATH=$PATH:/usr/local/go/bin
export GOPATH=/root/go
EOF
mkdir -p $GOPATH
git clone https://github.com/thinkahead/microshift.git
cd microshift
make
./microshift version
ls -las microshift # binary in current directory /root/microshift
mv microshift /usr/local/bin/microshift
systemctl restart microshift
Alternatively, we may download the latest prebuilt microshift binary from GitHub (the one install.sh downloaded) as follows:
ARCH=arm64
export VERSION=$(curl -s https://api.github.com/repos/redhat-et/microshift/releases | grep tag_name | head -n 1 | cut -d '"' -f 4) && \
curl -LO https://github.com/redhat-et/microshift/releases/download/$VERSION/microshift-linux-${ARCH}
chmod +x microshift-linux-${ARCH}
ls -las microshift-linux*
mv microshift-linux-${ARCH} /usr/local/bin/microshift
systemctl restart microshift
Samples to run on MicroShift
We will run a few samples that demonstrate the use of a persistent volume, the GPU, and the USB camera.
1. InfluxDB/Telegraf/Grafana
We reuse the influxdb sample from Part 6 on the Raspberry Pi 4. You can also follow the line-by-line instructions from that part as a reference.
cd ~
git clone https://github.com/thinkahead/microshift.git
cd microshift/raspberry-pi/influxdb
We can install all the components with the single script runall.sh
./runall.sh
Alternatively, run the steps separately and check details at each step
Create a new project influxdb
oc new-project influxdb
oc project influxdb # if it already exists
Install InfluxDB
oc create configmap influxdb-config --from-file=influxdb.conf
oc get configmap influxdb-config -o yaml
oc apply -f influxdb-secrets.yaml
oc describe secret influxdb-secrets
mkdir /var/hpvolumes/influxdb
oc apply -f influxdb-pv.yaml
oc apply -f influxdb-data.yaml
oc apply -f influxdb-deployment.yaml
oc get -f influxdb-deployment.yaml # check that the Deployment is created and ready
oc wait -f influxdb-deployment.yaml --for condition=available
oc logs deployment/influxdb-deployment -f
oc apply -f influxdb-service.yaml
oc rsh deployment/influxdb-deployment # connect to InfluxDB and display the databases
Output
root@nano:~/microshift/raspberry-pi/influxdb# oc rsh deployment/influxdb-deployment
# influx --username admin --password admin
Connected to http://localhost:8086 version 1.7.4
InfluxDB shell version: 1.7.4
Enter an InfluxQL query
> show databases
name: databases
name
----
test
_internal
> exit
# exit
Install Telegraf and check the measurements for the telegraf database in InfluxDB
oc apply -f telegraf-config.yaml
oc apply -f telegraf-secrets.yaml
oc apply -f telegraf-deployment.yaml
oc wait -f telegraf-deployment.yaml --for condition=available
Output
root@ubuntu:~/microshift/raspberry-pi/influxdb# oc rsh deployment/influxdb-deployment
# influx --username admin --password admin
Connected to http://localhost:8086 version 1.7.4
InfluxDB shell version: 1.7.4
Enter an InfluxQL query
> show databases
name: databases
name
----
test
_internal
telegraf
> use telegraf
Using database telegraf
> show measurements
name: measurements
name
----
cpu
disk
diskio
kernel
mem
net
netstat
processes
swap
system
> select * from cpu;
...
> exit
# exit
Install Grafana
cd grafana
mkdir /var/hpvolumes/grafana
cp -r config/* /var/hpvolumes/grafana/.
oc apply -f grafana-pv.yaml
oc apply -f grafana-data.yaml
oc apply -f grafana-deployment.yaml
oc apply -f grafana-service.yaml
oc expose svc grafana-service # Create the route
oc wait -f grafana-deployment.yaml --for condition=available
oc get route grafana-service
Add the " jetsonnano-ipaddress grafana-service-influxdb.cluster.local" to /etc/hosts on your laptop and login to http://grafana-service-influxdb.cluster.local (or to http://grafana-service-default.cluster.local if you deployed to default namespace) using admin/admin. You will need to change the password on first login. Go to the Dashboards list (left menu > Dashboards > Manage). The Analysis Server dashboard should be visible. Open it to display monitoring information for MicroShift.
Finally, after you are done working with this sample, delete the Grafana, Telegraf, and InfluxDB resources.
oc delete route grafana-service
oc delete -f grafana-data.yaml -f grafana-deployment.yaml -f grafana-pv.yaml -f grafana-service.yaml
cd ..
oc delete -f telegraf-config.yaml -f telegraf-secrets.yaml -f telegraf-deployment.yaml
oc delete -f influxdb-data.yaml -f influxdb-pv.yaml -f influxdb-service.yaml -f influxdb-deployment.yaml -f influxdb-secrets.yaml
oc project default
oc delete project influxdb
rm -rf /var/hpvolumes/grafana
rm -rf /var/hpvolumes/influxdb
2. Devicequery
Create the devicequery.yaml as shown below. The Dockerfile used to create the devicequery:arm64-jetsonnano image was shown earlier in the Part 2 CRI-O samples.
cat << EOF > devicequery.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: devicequery-job
spec:
  parallelism: 1
  completions: 1
  activeDeadlineSeconds: 1800
  backoffLimit: 6
  template:
    metadata:
      labels:
        app: devicequery
    spec:
      containers:
      - name: devicequery
        image: docker.io/karve/devicequery:arm64-jetsonnano
      restartPolicy: OnFailure
EOF
oc apply -f devicequery.yaml
oc get job/devicequery-job
Wait for the job to complete. The output shows that the CUDA device was detected within the container:
oc logs job/devicequery-job
Output
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA Tegra X1"
CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 5.3
Total amount of global memory: 3956 MBytes (4148273152 bytes)
( 1) Multiprocessors, (128) CUDA Cores/MP: 128 CUDA Cores
GPU Max Clock rate: 922 MHz (0.92 GHz)
Memory Clock rate: 13 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 262144 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
Delete the devicequery job
oc delete -f devicequery.yaml
3. VectorAdd
Create the vectoradd.yaml as shown below. The Dockerfile for the vector-add-sample:arm64-jetsonnano image was shown earlier in the Part 2 CRI-O samples.
cat << EOF > vectoradd.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: vectoradd-job
spec:
  parallelism: 1
  completions: 1
  activeDeadlineSeconds: 1800
  backoffLimit: 6
  template:
    metadata:
      labels:
        app: vectoradd
    spec:
      containers:
      - name: vectoradd
        image: docker.io/karve/vector-add-sample:arm64-jetsonnano
      restartPolicy: OnFailure
EOF
oc apply -f vectoradd.yaml
oc get job/vectoradd-job
Wait for the job to complete. The output shows the vector addition of 50000 elements on the CUDA device:
oc logs job/vectoradd-job
Output
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
Delete the vectoradd job
oc delete -f vectoradd.yaml
4. Jupyter Lab to access USB camera on /dev/video0
Create the following jupyter.yaml; then create the deployment, service and route:
cat << EOF > jupyter.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jupyter-deployment
spec:
  selector:
    matchLabels:
      app: jupyter
  replicas: 1
  template:
    metadata:
      labels:
        app: jupyter
    spec:
      containers:
      - name: jupyter
        image: nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.6.1
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash", "-c", "jupyter lab --LabApp.token='' --LabApp.password='' --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log && sleep infinity"]
        securityContext:
          privileged: true
          #allowPrivilegeEscalation: false
          #capabilities:
          #  drop: ["ALL"]
        ports:
        - containerPort: 8888
        # resource required for hpa
        resources:
          requests:
            memory: 128M
            cpu: 125m
          limits:
            memory: 2048M
            cpu: 1000m
        volumeMounts:
        - name: dev-video0
          mountPath: /dev/video0
      volumes:
      - name: dev-video0
        hostPath:
          path: /dev/video0
---
apiVersion: v1
kind: Service
metadata:
  name: jupyter-svc
  labels:
    app: jupyter
spec:
  type: NodePort
  ports:
  - port: 8888
    nodePort: 30080
  selector:
    app: jupyter
EOF
oc apply -f jupyter.yaml
oc expose svc jupyter-svc
Now we can add a line with the IP address of the Jetson Nano and jupyter-svc-default.cluster.local to /etc/hosts on our laptop/MacBook Pro and access JupyterLab at http://jupyter-svc-default.cluster.local/lab?
Navigate to the hello_camera/usb_camera.ipynb and run the notebook.
We can delete the JupyterLab deployment, service and route with:
oc delete route jupyter-svc
oc delete -f jupyter.yaml
5. Install Metrics Server
This will enable us to run the “kubectl top” and “oc adm top” commands.
kubectl apply -f https://raw.githubusercontent.com/thinkahead/microshift/main/jetson-nano/tests/metrics/metrics-components.yaml
If the metrics-server keeps restarting and the pod logs show the "no route to host" error below, you may need to add hostNetwork: true. When a pod is configured with hostNetwork: true, the applications running in it can directly see the network interfaces of the host machine where the pod was started.
E1220 19:36:20.224466 1 server.go:132] unable to fully scrape metrics: unable to fully scrape metrics from node nano.example.com: unable to fetch metrics from node nano.example.com: Get "https://192.168.1.208:10250/stats/summary?only_cpu_and_memory=true": dial tcp 192.168.1.208:10250: connect: no route to host
Edit the deployment and add the line "hostNetwork: true" within spec.template.spec:
oc edit deployments -n kube-system metrics-server
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-use-node-status-port
        - --v=6
        image: k8s.gcr.io/metrics-server/metrics-server:v0.4.0
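Alternatively, the same change can be applied non-interactively with a strategic merge patch:
kubectl patch deployment metrics-server -n kube-system -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}'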
# Wait for the metrics-server to start in the kube-system namespace
kubectl get deployment metrics-server -n kube-system
kubectl get events -n kube-system
kubectl logs deployment/metrics-server -n kube-system -f # Wait until the “metric-storage-ready failed: not metrics to serve” error stops
# Wait for a couple of minutes for metrics to be collected
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
apt-get install -y jq
kubectl get --raw /api/v1/nodes/$(kubectl get nodes -o json | jq -r '.items[0].metadata.name')/proxy/stats/summary
# Wait for a couple of minutes for metrics to be collected
kubectl top nodes;kubectl top pods -A
oc adm top nodes;oc adm top pods -A
watch "kubectl top nodes;kubectl top pods -A"
Output
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
nano.example.com 902m 22% 2220Mi 57%
NAMESPACE NAME CPU(cores) MEMORY(bytes)
kube-system metrics-server-dbf765b9b-8p6wr 15m 17Mi
kubevirt-hostpath-provisioner kubevirt-hostpath-provisioner-fsmkm 2m 9Mi
openshift-dns dns-default-lqktl 10m 25Mi
openshift-dns node-resolver-d95pz 0m 7Mi
openshift-ingress router-default-85bcfdd948-khkpd 6m 38Mi
openshift-service-ca service-ca-76674bfb58-rkcm5 16m 43Mi
6. Object Detection demo with GPU to send pictures and web socket messages to Node Red
The Object Detection sample detects objects in the camera feed. When a person is detected, it sends a Web Socket message with the bounding box information and a picture to Node Red.
Let’s install Node Red on IBM Cloud. We will use Node Red to show pictures and chat messages sent from the Jetson Nano. Alternatively, we can use the Node Red that we deployed as an application in MicroShift on the MacBook Pro in VirtualBox in Part 1.
- Create an IBM Cloud free tier account at https://www.ibm.com/cloud/free and login to Console (top right).
- Create an API Key and save it, Manage->Access->IAM->API Key->Create an IBM Cloud API Key
- Click on Catalog and Search for "Node-Red App", select it and click on "Get Started"
- Give a unique App name, for example xxxxx-node-red and select the region nearest to you
- Select the Pricing Plan Lite, if you already have an existing instance of Cloudant, you may select it in Pricing Plan
- Click Create
- Under Deployment Automation -> Configure Continuous Delivery, click on "Deploy your app"
- Select the deployment target Cloud Foundry that provides a Free-Tier of 256 MB cost-free or Code Engine. The latter has monthly limits and takes more time to deploy. [ Note: Cloud Foundry is deprecated, use the IBM Cloud Code Engine. Any IBM Cloud Foundry application runtime instances running IBM Cloud Foundry applications will be permanently disabled and deprovisioned ]
- Enter the IBM Cloud API Key from Step 2, or click on "New" to create one
- The rest of the fields Region, Organization, Space will automatically get filled up. Use the default 256MB Memory and click "Next"
- In "Configure the DevOps toolchain", click Create
- Wait for 10 minutes for the Node Red instance to start
- Click on the "Visit App URL"
- On the Node Red page, create a new userid and password
- In Manage Palette, install the node-red-contrib-image-tools, node-red-contrib-image-output, and node-red-node-base64
- Import the Chat flow and the Picture (Image) display flow. On the Chat flow, you will need to edit the template node line 35 to use wss:// (on IBM Cloud) instead of ws:// (on your Laptop)
- On another browser tab, start the https://mynodered.mybluemix.net/chat (Replace mynodered with your IBM Cloud Node Red URL)
- On the Image flow, click on the square box to the right of image preview or viewer to Deactivate and Activate the Node. You will be able to see the picture when you Activate the Node
cd ~
git clone https://github.com/thinkahead/microshift.git
cd ~/microshift/jetson-nano/tests/object-detection
Build the image
docker build -t docker.io/karve/jetson-inference:r32.6.1 .
docker push docker.io/karve/jetson-inference:r32.6.1
You can update the WebSocketURL, ImageUploadURL and VideoSource in inference.yaml to point to your video source and to your Node Red URLs on IBM Cloud, or to the Node Red you installed in MicroShift on your laptop. For the latter, you will need to add hostAliases with the IP address of your laptop. Then create the deployment with the oc apply command and watch the Chat application and the Picture flow in Node Red. It will take a couple of minutes to initially load the model.
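For example, to point the flows at your Node Red instance on IBM Cloud, the same substitution used in the next sample applies here (check inference.yaml for the exact URLs it ships with):
sed -i "s|mynodered.mybluemix.net|yournodered.mybluemix.net|" inference.yaml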
crictl pull docker.io/karve/jetson-inference:r32.6.1 # Optional
oc apply -f inference.yaml
To stop this object-detection sample, we can delete the deployment
oc delete -f inference.yaml
7. Object Detection demo with TensorFlow Lite (no GPU) to send pictures and web socket messages to Node Red
cd ~
git clone https://github.com/thinkahead/microshift.git
cd ~/microshift/jetson-nano/tests/object-detection-no-gpu
This example uses TensorFlow Lite with Python to perform real-time object detection on images streamed from the USB camera. It draws a bounding box around each detected object when the object score is above a given threshold.
a. Use a container
Build the object-detection-jetsonnano image and check that we can access the camera and run TensorFlow Lite from a container in Docker.
cp ~/microshift/raspberry-pi/object-detection/efficientdet_lite0.tflite .
docker build -t docker.io/karve/object-detection-jetsonnano .
docker push docker.io/karve/object-detection-jetsonnano:latest
docker run --rm -d --privileged -e ImageUploadURL=http://yournodered.mybluemix.net/upload -e WebSocketURL=wss://yournodered.mybluemix.net/ws/chat docker.io/karve/object-detection-jetsonnano:latest
You should see the camera feed appear on the Node Red image viewer when a person is in the frame. Put some objects in front of the camera, like a coffee mug or keyboard, and you'll see boxes drawn around those the model recognizes, with a label and score for each. It also prints the number of frames per second (FPS) at the top-left corner of the image.
b. Use MicroShift
sed -i "s|mynodered.mybluemix.net|yournodered.mybluemix.net|" *.yaml
oc apply -f object-detection.yaml
We will see pictures being sent to the Node Red image viewer when a person is detected. When we are done testing, we can delete the deployment:
oc delete -f object-detection.yaml
Smarter-Device-Manager
Applications running inside a container do not have access to device drivers unless explicitly given access. Smarter-device-manager enables containers deployed using Kubernetes to access devices (device drivers) available on the node. In the object detection sample above, we used a deployment with a privileged securityContext; we want to avoid running privileged. With Docker, we can use --device /dev/video0:/dev/video0, but Kubernetes has no --device equivalent. Instead of using the securityContext with privileged: true, we can use the smarter-device-manager in the Object Detection demo from above. The inference-sdm.yaml shows the modified deployment. The daemonset and configmap for the smarter-device-manager need to be created in some namespace (we use sdm).
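The essential change in inference-sdm.yaml is that the container requests the camera as a named resource instead of running privileged; the relevant fragment looks roughly like this (a sketch; the container name and image are assumed from the earlier demo, the actual file may differ):
      containers:
      - name: inference
        image: docker.io/karve/jetson-inference:r32.6.1
        resources:
          limits:
            smarter-devices/video0: 1
          requests:
            smarter-devices/video0: 1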
We first install the smarter device manager and label the node to enable it.
cd ~/microshift/jetson-nano/tests/object-detection
oc apply -f smarter-device-manager-ds.yaml -f video0-configmap.yaml
oc label node nano.example.com smarter-device-manager=enabled
oc get ds,pods -n sdm
Output:
root@nano:~/microshift/jetson-nano/tests/object-detection# oc apply -f smarter-device-manager-ds.yaml -f video0-configmap.yaml
namespace/sdm created
daemonset.apps/smarter-device-manager created
configmap/smarter-device-manager created
root@nano:~/microshift/jetson-nano/tests/object-detection# oc label node nano.example.com smarter-device-manager=enabled --overwrite
node/nano.example.com labeled
root@nano:~/microshift/jetson-nano/tests/object-detection# oc logs -n sdm ds/smarter-device-manager
root@nano:~/microshift/jetson-nano/tests/object-detection# oc get ds,pods -n sdm
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/smarter-device-manager 1 1 1 1 1 smarter-device-manager=enabled 25s
NAME READY STATUS RESTARTS AGE
pod/smarter-device-manager-jm9dh 1/1 Running 0 25s
We can see the Capacity (20), Allocatable (20) and Allocated (0) for smarter-devices/video0 (along with the other devices).
root@nano:~/microshift/jetson-nano/tests/object-detection# oc describe nodes
Name: nano.example.com
…
Capacity:
cpu: 4
ephemeral-storage: 59964524Ki
hugepages-2Mi: 0
memory: 4051048Ki
…
smarter-devices/video0: 20
Allocatable:
cpu: 4
ephemeral-storage: 55263305227
hugepages-2Mi: 0
memory: 3948648Ki
…
smarter-devices/video0: 20
Allocated resources:
…
smarter-devices/video0 0 0
Now we can create the new deployment
root@nano:~/microshift/jetson-nano/tests/object-detection# oc apply -f inference-sdm.yaml
deployment.apps/inference-deployment created
If we describe the node again, we will see that video0 has been allocated under “Allocated resources”, and we will see the pictures and web socket messages being sent to Node Red:
root@nano:~/microshift/jetson-nano/tests/object-detection# oc describe nodes
Name: microshift.example.com
…
Allocated resources:
…
smarter-devices/video0 1 1
Let’s delete the deployment. After it is deleted, the “Allocated resources” Requests and Limits go back to 0.
root@nano:~/microshift/jetson-nano/tests/object-detection# oc delete -f inference-sdm.yaml
deployment.apps "inference-deployment" deleted
root@nano:~/microshift/jetson-nano/tests/object-detection# oc describe nodes
Name: microshift.example.com
…
Allocated resources:
…
smarter-devices/video0 0 0
If we disable the smarter-device-manager on the node and try the deployment again, the pod will remain in STATUS=Pending
root@nano:~/microshift/jetson-nano/tests/object-detection# oc label node nano.example.com smarter-device-manager=disabled --overwrite
node/microshift.example.com labeled
root@nano:~/microshift/jetson-nano/tests/object-detection# oc apply -f inference-sdm.yaml
deployment.apps/inference-deployment created
root@nano:~/microshift/jetson-nano/tests/object-detection# oc get pods,deploy
NAME READY STATUS RESTARTS AGE
pod/inference-deployment-757d7c848c-nb5bt 0/1 Pending 0 69s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/inference-deployment 0/1 1 0 69s
We need to enable the label again for the deployment to get to Ready state.
root@nano:~/microshift/jetson-nano/tests/object-detection# oc label node microshift.example.com smarter-device-manager=enabled --overwrite
node/microshift.example.com labeled
root@nano:~/microshift/jetson-nano/tests/object-detection# oc get pods,deploy
NAME READY STATUS RESTARTS AGE
pod/inference-deployment-757d7c848c-nb5bt 1/1 Running 0 3m4s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/inference-deployment 1/1 1 1 3m4s
Finally, we can delete the sample and the daemonset smarter-device-manager.
root@nano:~/microshift/jetson-nano/tests/object-detection# oc delete -f inference-sdm.yaml
deployment.apps "inference-deployment" deleted
root@nano:~/microshift/jetson-nano/tests/object-detection# oc delete -f smarter-device-manager-ds.yaml -f video0-configmap.yaml
namespace "sdm" deleted
daemonset.apps "smarter-device-manager" deleted
configmap "smarter-device-manager" deleted
Using the NVIDIA/k8s-device-plugin
You may download the preconfigured nvidia-device-plugin.yml that points to a pre-created image and skip to “Apply” below, or build the plugin yourself. To build, we can use the instructions from NVIDIA K8s Device Plugin for Wind River Linux to create a custom device plugin that allows the cluster to expose the number of GPUs on NVIDIA Jetson devices. The patch checks for the file /sys/module/tegra_fuse/parameters/tegra_chip_id and does not perform health checks on Jetson.
Build
git clone -b 1.0.0-beta6 https://github.com/NVIDIA/k8s-device-plugin.git
cd k8s-device-plugin/
wget https://labs.windriver.com/downloads/0001-arm64-add-support-for-arm64-architectures.patch
wget https://labs.windriver.com/downloads/0002-nvidia-Add-support-for-tegra-boards.patch
wget https://labs.windriver.com/downloads/0003-main-Add-support-for-tegra-boards.patch
git am 000*.patch
sed "s/ubuntu:16.04/ubuntu:18.04/" docker/arm64/Dockerfile.ubuntu16.04 > docker/arm64/Dockerfile.ubuntu18.04
docker build -t karve/k8s-device-plugin:1.0.0-beta6 -f docker/arm64/Dockerfile.ubuntu18.04 .
docker push karve/k8s-device-plugin:1.0.0-beta6
sed -i "s|image: .*|image: karve/k8s-device-plugin:1.0.0-beta6|" nvidia-device-plugin.yml # Change the image to karve/k8s-device-plugin:1.0.0-beta6
Apply
oc apply -f nvidia-device-plugin.yml
oc get ds -n kube-system nvidia-device-plugin-daemonset
Output
root@nano:~/k8s-device-plugin# oc get ds -n kube-system nvidia-device-plugin-daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
nvidia-device-plugin-daemonset 1 1 1 1 1 <none> 7h20m
With the daemonset deployed, NVIDIA GPUs can now be requested by a container using the nvidia.com/gpu resource type. “oc describe nodes” now shows nvidia.com/gpu under Capacity, Allocatable, and Allocated resources. If we deploy the vector-add job with the resource limit (sketched below), we will see in the events that only one pod gets scheduled at a time even though parallelism was set to 5. When one pod finishes, the next one runs.
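A sketch of what vectoradd-gpu-limit.yaml adds to the earlier vectoradd job (the actual file may differ in details):
apiVersion: batch/v1
kind: Job
metadata:
  name: vectoradd-job
spec:
  parallelism: 5
  completions: 5
  template:
    spec:
      containers:
      - name: vectoradd
        image: docker.io/karve/vector-add-sample:arm64-jetsonnano
        resources:
          limits:
            nvidia.com/gpu: 1 # each pod claims the single GPU, so pods run one at a time
      restartPolicy: OnFailure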
cd ~/microshift/jetson-nano/jobs
oc apply -f vectoradd-gpu-limit.yaml
Output
root@nano:~/microshift/jetson-nano/jobs# oc apply -f vectoradd-gpu-limit.yaml
job.batch/vectoradd-job created
root@nano:~/microshift/jetson-nano/jobs# oc get events -n default
LAST SEEN TYPE REASON OBJECT MESSAGE
33s Warning FailedScheduling pod/vectoradd-job-7n2xz 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
19s Warning FailedScheduling pod/vectoradd-job-7n2xz 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
34s Normal Scheduled pod/vectoradd-job-l9cjw Successfully assigned default/vectoradd-job-l9cjw to microshift.example.com
24s Normal Pulled pod/vectoradd-job-l9cjw Container image "docker.io/karve/vector-add-sample:arm64-jetsonnano" already present on machine
21s Normal Created pod/vectoradd-job-l9cjw Created container vectoradd
21s Normal Started pod/vectoradd-job-l9cjw Started container vectoradd
33s Warning FailedScheduling pod/vectoradd-job-tnmvs 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
19s Warning FailedScheduling pod/vectoradd-job-tnmvs 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
33s Warning FailedScheduling pod/vectoradd-job-wtgnn 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
19s Warning FailedScheduling pod/vectoradd-job-wtgnn 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
7s Normal Scheduled pod/vectoradd-job-wtgnn Successfully assigned default/vectoradd-job-wtgnn to microshift.example.com
34s Warning FailedScheduling pod/vectoradd-job-zwjfs 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
32s Warning FailedScheduling pod/vectoradd-job-zwjfs 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
19s Normal Scheduled pod/vectoradd-job-zwjfs Successfully assigned default/vectoradd-job-zwjfs to microshift.example.com
9s Normal Pulled pod/vectoradd-job-zwjfs Container image "docker.io/karve/vector-add-sample:arm64-jetsonnano" already present on machine
9s Normal Created pod/vectoradd-job-zwjfs Created container vectoradd
8s Normal Started pod/vectoradd-job-zwjfs Started container vectoradd
34s Normal SuccessfulCreate job/vectoradd-job Created pod: vectoradd-job-l9cjw
34s Normal SuccessfulCreate job/vectoradd-job Created pod: vectoradd-job-wtgnn
34s Normal SuccessfulCreate job/vectoradd-job Created pod: vectoradd-job-zwjfs
34s Normal SuccessfulCreate job/vectoradd-job Created pod: vectoradd-job-7n2xz
34s Normal SuccessfulCreate job/vectoradd-job Created pod: vectoradd-job-tnmvs
Cleanup MicroShift
We can use the cleanup script available on GitHub to clean up the pods and images. If you already cloned the microshift repo from GitHub, the script is in the ~/microshift/hack directory.
wget https://raw.githubusercontent.com/thinkahead/microshift/main/hack/cleanup.sh
bash ./cleanup.sh
If MicroShift is not stopped cleanly, we are left with mounted volumes and subPaths from pods in the /var/lib/kubelet/pods directory, as follows:
tmpfs on /var/lib/kubelet/pods/c6b82b4a-0047-493b-9cb7-58fc0e1aa57b/volumes/kubernetes.io~projected/kube-api-access-7mwm6 type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/9d0482f5-574a-4f35-9078-37b0bb093606/volumes/kubernetes.io~projected/kube-api-access-z2t58 type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/ead85661-1036-49f9-adb0-8d8a13192d33/volumes/kubernetes.io~projected/kube-api-access-49vp6 type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/cc30d05e-34e8-4a5b-9727-48a044bbe7e9/volumes/kubernetes.io~projected/kube-api-access-q4n7k type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/0886f052-6bfb-47d5-83fc-c3d2bf9178d7/volumes/kubernetes.io~projected/kube-api-access-gftv9 type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/16e47221-6360-4788-a40e-efe289ef19c1/volumes/kubernetes.io~secret/signing-key type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/16e47221-6360-4788-a40e-efe289ef19c1/volumes/kubernetes.io~projected/kube-api-access-d8s86 type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/ead85661-1036-49f9-adb0-8d8a13192d33/volumes/kubernetes.io~secret/metrics-tls type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/0886f052-6bfb-47d5-83fc-c3d2bf9178d7/volumes/kubernetes.io~secret/default-certificate type tmpfs (rw,relatime)
tmpfs on /var/lib/kubelet/pods/3ecb8db3-0a07-4c4c-a2c3-bd50ee9a403e/volumes/kubernetes.io~projected/kube-api-access-dg6kr type tmpfs (rw,relatime)
/dev/mmcblk0p1 on /var/lib/kubelet/pods/3ecb8db3-0a07-4c4c-a2c3-bd50ee9a403e/volume-subpaths/influxdb-config/influxdb/1 type ext4 (rw,relatime,data=ordered)
tmpfs on /var/lib/kubelet/pods/e5fd0e09-154a-4f2f-aee3-c78ab113f116/volumes/kubernetes.io~projected/kube-api-access-2ghr4 type tmpfs (rw,relatime)
/dev/mmcblk0p1 on /var/lib/kubelet/pods/e5fd0e09-154a-4f2f-aee3-c78ab113f116/volume-subpaths/telegraf-config/telegraf/0 type ext4 (rw,relatime,data=ordered)
tmpfs on /var/lib/kubelet/pods/94bea181-015e-4e28-86ed-9748e311a8d9/volumes/kubernetes.io~projected/kube-api-access-hzzfg type tmpfs (rw,relatime)
/dev/mmcblk0p1 on /var/lib/kubelet/pods/94bea181-015e-4e28-86ed-9748e311a8d9/volume-subpaths/grafana-volume/grafana/0 type ext4 (rw,relatime,data=ordered)
/dev/mmcblk0p1 on /var/lib/kubelet/pods/94bea181-015e-4e28-86ed-9748e311a8d9/volume-subpaths/grafana-volume/grafana/1 type ext4 (rw,relatime,data=ordered)
/dev/mmcblk0p1 on /var/lib/kubelet/pods/94bea181-015e-4e28-86ed-9748e311a8d9/volume-subpaths/grafana-volume/grafana/2 type ext4 (rw,relatime,data=ordered)
The cleanup.sh also unmounts and deletes these volumes to prevent errors with orphaned pods when you restart MicroShift:
echo "Unmounting /var/lib/kubelet/pods/..."
mount | grep "^tmpfs.* on /var/lib/kubelet/pods/" | awk "{print \$3}" | xargs -n1 -r umount
mount | grep "^/dev/.* on /var/lib/kubelet/pods/" | awk "{print \$3}" | xargs -n1 -r umount
rm -rf /var/lib/kubelet/pods/*
It also deletes the directories in /var/hpvolumes used for persistent volumes by kubevirt-hostpath-provisioner.
rm -rf /var/hpvolumes/*
Containerized MicroShift
We can run MicroShift within containers in two ways:
- MicroShift Containerized – The MicroShift binary runs in a Docker container, CRI-O Systemd service runs directly on the host and data is stored at /var/lib/microshift and /var/lib/kubelet on the host VM.
- MicroShift Containerized All-In-One – The MicroShift binary and CRI-O service run within a Docker container and data is stored in a docker volume, microshift-data. This should be used for “Testing and Development” only. The image available in the registry is not setup to use the GPU within the container with cri-o.
Since we currently cannot use the GPU in the latter, we do not use the All-In-One image. For the first approach, CRI-O runs on the host, and we already set up CRI-O on the host to use the NVIDIA container runtime; we therefore use the first approach, which allows the GPU. We will build the image using docker with the Dockerfile.jetsonnano.containerized (from registry.access.redhat.com/ubi8/ubi-init:8.4) and Dockerfile.jetsonnano.containerized2 (from registry.access.redhat.com/ubi8/ubi-minimal:8.4). Note that we use iptables-1.6.2, which is compatible with the iptables on the Jetson Nano, instead of iptables v1.8.7, which causes the error “iptables v1.8.7 (nf_tables) Could not fetch rule set generation id: Invalid argument”. Copy the microshift binary that we built earlier to the local directory and run the docker build command as shown below:
# Quote 'EOF' so that ${IMAGE_NAME} is left for Docker's ARG/FROM, not expanded by the shell
cat << 'EOF' > Dockerfile.jetsonnano.containerized
ARG IMAGE_NAME=registry.access.redhat.com/ubi8/ubi-init:8.4
ARG ARCH
FROM ${IMAGE_NAME}
COPY microshift /usr/bin/microshift
RUN chmod +x /usr/bin/microshift
RUN dnf install -y libnetfilter_conntrack libnfnetlink && \
    rpm -v -i --force https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/28/Everything/aarch64/os/Packages/i/iptables-libs-1.6.2-2.fc28.aarch64.rpm \
    https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/28/Everything/aarch64/os/Packages/i/iptables-1.6.2-2.fc28.aarch64.rpm
ENTRYPOINT ["/usr/bin/microshift"]
CMD ["run"]
EOF
cp `which microshift` .
docker build -t docker.io/karve/microshift:jetson-nano-containerized -f Dockerfile.jetsonnano.containerized .
docker build -t docker.io/karve/microshift:jetson-nano-containerized2 -f Dockerfile.jetsonnano.containerized2 .
Check the sizes of the images produced for both:
root@nano:~/microshift/hack/all-in-one# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
karve/microshift jetson-nano-containerized2 a57e64b63a9a 4 seconds ago 562MB
karve/microshift jetson-nano-containerized 1e152d79ff65 3 minutes ago 616MB
Run the microshift container
IMAGE=docker.io/karve/microshift:jetson-nano-containerized
docker run --rm --ipc=host --network=host --privileged -d --name microshift -v /var/run:/var/run -v /sys:/sys:ro -v /var/lib:/var/lib:rw,rshared -v /lib/modules:/lib/modules -v /etc:/etc -v /var/hpvolumes:/var/hpvolumes -v /run/containers:/run/containers -v /var/log:/var/log -e KUBECONFIG=/var/lib/microshift/resources/kubeadmin/kubeconfig $IMAGE
export KUBECONFIG=/var/lib/microshift/resources/kubeadmin/kubeconfig
We can see the microshift container running within docker:
root@nano:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8c924bf44174 karve/microshift:jetson-nano-containerized "/usr/bin/microshift…" 3 minutes ago Up 3 minutes microshift
The microshift process is running within the container:
root@nano:~# docker top microshift -o pid,cmd
PID CMD
19997 /usr/bin/microshift run
The rest of the containers run within cri-o on the host:
root@nano:~# crictl pods
POD ID CREATED STATE NAME NAMESPACE ATTEMPT RUNTIME
b678938b7a6a2 3 minutes ago Ready dns-default-j7lgj openshift-dns 0 (default)
01cc8ddd857f8 3 minutes ago Ready router-default-85bcfdd948-5x6vf openshift-ingress 0 (default)
09a5cce9af718 4 minutes ago Ready kube-flannel-ds-8qn5h kube-system 0 (default)
94809dd53ee44 4 minutes ago Ready node-resolver-57xzk openshift-dns 0 (default)
4616c0c2b7151 4 minutes ago Ready service-ca-76674bfb58-bqcf8 openshift-service-ca 0 (default)
8cdd245d69c96 4 minutes ago Ready kubevirt-hostpath-provisioner-jg5pc kubevirt-hostpath-provisioner 0 (default)
Now, we can run the samples shown earlier.
After we are done, we can delete the microshift container. The --rm we used in the docker run will delete the container when we stop it.
docker stop microshift
After it is stopped, we can run cleanup.sh as in the previous section.
Conclusion
In this Part 7, we saw how to build and run MicroShift directly on the Jetson Nano with Ubuntu 20.04. We ran samples that used a persistent volume for the InfluxDB time series database, the GPU for inferencing, and the USB camera. We worked with samples (with and without GPU) that sent pictures and web socket messages to Node Red on IBM Cloud when a person was detected. In Part 8, we will work with MicroShift on the Raspberry Pi 4 with balenaOS.
I hope you have enjoyed the article. Share your thoughts in the comments or engage in the conversation with me on Twitter @aakarve. I look forward to hearing about your use of MicroShift on ARM devices and what you would like to see covered in more detail.
References