lightnet_trt package from lightnet-trt repo

No released version of this package exists for any ROS distro (galactic, humble, iron, jazzy, kilted, melodic, noetic, rolling); the information below is taken from the GitHub repository.

Package Summary

Tags: No category tags.
Version: 0.0.0
License: Apache License 2.0
Build type: AMENT_CMAKE
Use: RECOMMENDED

Repository Summary

Description: LightNet-TRT is a high-efficiency and real-time implementation of convolutional neural networks (CNNs) using Edge AI.
Checkout URI: https://github.com/daniel89710/lightnet-trt.git
VCS Type: git
VCS Version: main
Last Updated: 2023-10-03
Dev Status: UNKNOWN
Released: UNRELEASED
Tags: No category tags.

Package Description

The ROS 2 package for lightNet-TRT

Additional Links

No additional links.

Maintainers

  • Koji Minoda

Authors

  • Koji Minoda

LightNet-TRT: High-Efficiency and Real-Time CNN Implementation on Edge AI

LightNet-TRT is a CNN implementation optimized for edge AI devices that combines the advantages of LightNet [1] and TensorRT [2]. LightNet is a lightweight and high-performance neural network framework designed for edge devices, while TensorRT is a high-performance deep learning inference engine developed by NVIDIA for optimizing and running deep learning models on GPUs. LightNet-TRT uses the Network Definition API provided by TensorRT to integrate LightNet into TensorRT, allowing it to run efficiently and in real time on edge devices.

[Figure: lightNet detection example (1024x256)]

Key Improvements

2:4 Structured Sparsity

LightNet-TRT utilizes 2:4 structured sparsity [3] to further optimize the network. 2:4 structured sparsity means that two values must be zero in each contiguous block of four values, resulting in a 50% reduction in the number of weights. This technique allows the network to use fewer weights and computations while maintaining accuracy.

[Figure: 2:4 structured sparsity pattern]
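
As a minimal illustration of the constraint (not code from LightNet-TRT), the sketch below zeroes the two smallest-magnitude weights in every contiguous block of four, which is exactly the pattern that 2:4 structured sparsity enforces:

// Minimal sketch: enforce the 2:4 structured-sparsity pattern on a weight array
// by zeroing the two smallest-magnitude values in each contiguous block of four.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

void pruneTwoFour(std::vector<float>& w) {
    for (size_t i = 0; i + 4 <= w.size(); i += 4) {
        // Block indices sorted by ascending |weight|.
        size_t idx[4] = {i, i + 1, i + 2, i + 3};
        std::sort(idx, idx + 4, [&](size_t a, size_t b) {
            return std::fabs(w[a]) < std::fabs(w[b]);
        });
        // Zero the two smallest; the two largest survive, so 50% of the weights remain.
        w[idx[0]] = 0.0f;
        w[idx[1]] = 0.0f;
    }
}

int main() {
    std::vector<float> w = {0.9f, -0.1f, 0.4f, 0.05f, -0.7f, 0.2f, 0.6f, -0.3f};
    pruneTwoFour(w);
    for (float v : w) std::printf("%.2f ", v);  // 0.90 0.00 0.40 0.00 -0.70 0.00 0.60 0.00
    std::printf("\n");
    return 0;
}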

NVDLA Execution

LightNet-TRT also supports execution of the neural network on the NVIDIA Deep Learning Accelerator (NVDLA) [4], a free and open architecture that provides high performance and low power consumption for deep learning inference on edge devices. By using NVDLA, LightNet-TRT can further improve the efficiency and performance of the network on edge devices.

[Figure: NVDLA execution]
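
For reference, DLA offload is normally requested through the TensorRT builder configuration. The snippet below is a generic sketch of that API, not necessarily how LightNet-TRT wires it up internally:

// Generic sketch of enabling DLA execution via the TensorRT builder API.
#include <NvInfer.h>

void enableDLA(nvinfer1::IBuilderConfig* config, int dlaCore) {
    // Run supported layers on the chosen DLA core and fall back to the GPU for the rest.
    config->setDefaultDeviceType(nvinfer1::DeviceType::kDLA);
    config->setDLACore(dlaCore);  // e.g. 0 or 1 on Jetson Orin NX
    config->setFlag(nvinfer1::BuilderFlag::kGPU_FALLBACK);
    // DLA requires reduced precision (FP16 or INT8).
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
}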

Multi-Precision Quantization

In addition to post-training quantization [5], LightNet-TRT also supports multi-precision quantization, which allows the network to use different precisions for weights and activations. By using mixed precision, LightNet-TRT can further reduce the memory usage and computational requirements of the network while still maintaining accuracy. The precision of each layer of the CNN can be specified in the CFG file.

[Figure: multi-precision quantization]
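
In TensorRT terms, per-layer precision amounts to attaching precision constraints to individual layers; the CFG-driven selection is LightNet-TRT's own mechanism layered on top of this. The sketch below shows the generic builder API with a purely illustrative precision policy:

// Generic sketch of per-layer precision constraints in TensorRT (the policy is illustrative).
#include <NvInfer.h>

void markLayerPrecisions(nvinfer1::INetworkDefinition* network,
                         nvinfer1::IBuilderConfig* config) {
    // Allow both INT8 and FP16 kernels, then constrain individual layers.
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
    // TensorRT >= 8.2; earlier releases used BuilderFlag::kSTRICT_TYPES.
    config->setFlag(nvinfer1::BuilderFlag::kOBEY_PRECISION_CONSTRAINTS);

    for (int i = 0; i < network->getNbLayers(); ++i) {
        nvinfer1::ILayer* layer = network->getLayer(i);
        // Example policy: keep the first and last layers in FP16, run the rest in INT8.
        bool boundary = (i == 0) || (i == network->getNbLayers() - 1);
        layer->setPrecision(boundary ? nvinfer1::DataType::kHALF
                                     : nvinfer1::DataType::kINT8);
    }
}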

Multitask Execution (Detection/Segmentation)

LightNet-TRT also supports multitask execution, allowing the network to perform both object detection and segmentation tasks simultaneously. This enables the network to perform multiple tasks efficiently on edge devices, saving computational resources and power.

Installation

Requirements

  • CUDA 11.0 or higher
  • TensorRT 8.0 or higher
  • OpenCV 3.0 or higher

Steps

  1. Clone the repository.
$ git clone https://github.com/daniel89710/lightNet-TRT.git
$ cd lightNet-TRT

  2. Install libraries.
$ sudo apt update
$ sudo apt install libgflags-dev
$ sudo apt install libboost-all-dev

  3. Compile the TensorRT implementation.
$ mkdir build
$ cd build
$ cmake ../
$ make -j

Model

| Model | Resolutions | GFLOPS | Params | Precision | Sparsity | DNN time on RTX3080 | DNN time on Jetson Orin NX 16GB GPU | DNN time on Jetson Orin NX 16GB DLA | DNN time on Jetson Orin Nano 4GB GPU | cfg | weights |
|---|---|---|---|---|---|---|---|---|---|---|---|
| lightNet | 1280x960 | 58.01 | 9.0M | int8 | 49.8% | 1.30ms | 7.6ms | 14.2ms | 14.9ms | github | GoogleDrive |
| LightNet+Semseg | 1280x960 | 76.61 | 9.7M | int8 | 49.8% | 2.06ms | 15.3ms | 23.2ms | 26.2ms | github | GoogleDrive |
| lightNet pruning | 1280x960 | 35.03 | 4.3M | int8 | 49.8% | 1.21ms | 8.89ms | 11.68ms | 13.75ms | github | GoogleDrive |
| LightNet pruning+SemsegLight | 1280x960 | 44.35 | 4.9M | int8 | 49.8% | 1.80ms | 9.89ms | 15.26ms | 23.35ms | github | GoogleDrive |

  • “DNN time” refers to the time measured by IProfiler during the enqueueV2 operation, excluding pre-process and post-process times.
  • Orin NX has three independent AI processors, allowing lightNet to be parallelized across a GPU and two DLAs.
  • Orin Nano 4GB has only an iGPU with 512 CUDA cores.

Usage

Converting a LightNet model to a TensorRT engine

Build FP32 engine

$ ./lightNet-TRT --flagfile ../configs/lightNet-BDD100K-det-semaseg-1280x960.txt --precision kFLOAT

Build FP16(HALF) engine

$ ./lightNet-TRT --flagfile ../configs/lightNet-BDD100K-det-semaseg-1280x960.txt --precision kHALF

Build INT8 engine (you need to prepare a list of calibration images in "configs/calibration_images.txt").

$ ./lightNet-TRT --flagfile ../configs/lightNet-BDD100K-det-semaseg-1280x960.txt --precision kINT8

(README truncated at 100 lines; see the full file in the repository.)

CHANGELOG
No CHANGELOG found.

Launch files

  • launch/lightnet_trt.launch.xml
      • input/image [default: /sensing/camera/camera0/image_rect_color]
      • input/camera_info [default: /sensing/camera/camera0/camera_info]
      • output/objects [default: /perception/object_recognition/detection/rois0]
      • model_cfg [default: $(find-pkg-share lightnet_trt)/configs/yolov8x-BDD100K-640x640-T4-lane4cls.cfg]
      • model_weights [default: $(find-pkg-share lightnet_trt)/configs/yolov8x-BDD100K-640x640-T4-lane4cls-scratch-pseudo-finetune_last.weights]
      • width [default: 640]
      • height [default: 640]
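
A typical invocation overriding some of these arguments might look like the following (the camera1 topic names are illustrative; only the argument names come from the launch file above):

$ ros2 launch lightnet_trt lightnet_trt.launch.xml \
    input/image:=/sensing/camera/camera1/image_rect_color \
    input/camera_info:=/sensing/camera/camera1/camera_info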

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.
