Package Summary
Tags | No category tags. |
Version | 0.47.0 |
License | Apache License 2.0 |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Description | |
Checkout URI | https://github.com/autowarefoundation/autoware_universe.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-08-16 |
Dev Status | UNKNOWN |
Released | UNRELEASED |
Tags | planner ros calibration self-driving-car autonomous-driving autonomous-vehicles ros2 3d-map autoware |
Package Description
Additional Links
Maintainers
- Kenzo Lobos-Tsunekawa
- Amadeusz Szymko
- Kotaro Uetake
- Masato Saeki
Authors
autoware_tensorrt_plugins
Purpose
The autoware_tensorrt_plugins package extends the operations available in TensorRT via plugins.
Algorithms
The following operations are implemented:
Sparse convolutions
We provide a wrapper for spconv (please see the corresponding package for details about the algorithms involved).
This requires spconv_cpp, which Autoware's setup script installs automatically. If needed, the user can also build and install it by following the repository's instructions.
Argsort
We provide an implementation of the Argsort operation as a plugin, since TensorRT's TopK implementation limits the number of elements it can handle.
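For reference, the semantics of an argsort (not the plugin's CUDA implementation) can be sketched in plain Python. The function name and the `descending` flag are illustrative assumptions and do not describe the plugin's actual attributes.

```python
from typing import List, Sequence


def argsort_reference(values: Sequence[float], descending: bool = False) -> List[int]:
    """Return the indices that would sort `values`.

    Hypothetical helper for illustration only. Python's sort is stable,
    so equal values keep their original relative order.
    """
    return sorted(range(len(values)), key=lambda i: values[i], reverse=descending)


if __name__ == "__main__":
    scores = [0.3, 0.1, 0.2]
    print(argsort_reference(scores))                   # indices in ascending order of score
    print(argsort_reference(scores, descending=True))  # indices in descending order of score
```

Unlike TopK, an argsort returns an index for every input element, which is why it is not affected by a cap on K.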
BEV Pool
We provide a wrapper for the bev_pool operation presented in BEVFusion. Please refer to the original paper for specific details.
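Conceptually, BEV pooling accumulates the features of all points that land in the same bird's-eye-view grid cell. The following is a simplified sketch of that idea, assuming sum reduction, a single batch, and precomputed flattened cell indices. All names are hypothetical, and the real operation runs on the GPU with a different data layout.

```python
from typing import List, Sequence


def bev_pool_reference(
    features: Sequence[Sequence[float]],  # (N, C) per-point features
    cell_indices: Sequence[int],          # (N,) flattened BEV cell index of each point
    num_cells: int,
) -> List[List[float]]:
    """Sum the features of all points that fall into the same BEV cell.

    Illustrative only: this assumes sum pooling and drops points whose
    cell index lies outside the grid.
    """
    channels = len(features[0]) if features else 0
    grid = [[0.0] * channels for _ in range(num_cells)]
    for feat, cell in zip(features, cell_indices):
        if 0 <= cell < num_cells:
            for c in range(channels):
                grid[cell][c] += feat[c]
    return grid


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    cells = [0, 2, 0]  # the first and third points share cell 0
    print(bev_pool_reference(feats, cells, num_cells=3))
```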
Scatter operations
We provide a wrapper for the segment_csr operation presented in torch_scatter. Please refer to the original code for specific details.
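As a rough illustration of what segment_csr computes, the sketch below reduces consecutive slices of a 1-D input delimited by a CSR-style pointer array. It covers only the sum reduction on plain Python lists; the function name is hypothetical and this is not the plugin's code.

```python
from typing import List, Sequence


def segment_csr_sum_reference(src: Sequence[float], indptr: Sequence[int]) -> List[float]:
    """Sum src[indptr[i]:indptr[i + 1]] for every segment i.

    `indptr` is a non-decreasing list of offsets (CSR layout), so there are
    len(indptr) - 1 output values. Empty segments yield 0.
    """
    out = []
    for start, end in zip(indptr[:-1], indptr[1:]):
        out.append(sum(src[start:end]))
    return out


if __name__ == "__main__":
    src = [1, 2, 3, 4, 5, 6]
    indptr = [0, 2, 5, 5, 6]  # segments: [1, 2], [3, 4, 5], [], [6]
    print(segment_csr_sum_reference(src, indptr))
```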
Unique
While ONNX supports the Unique operation, TensorRT does not provide an implementation. For this reason we implement Unique as CustomUnique to avoid name clashes.
The implementation mostly follows the torch_scatter implementation. Please refer to the original code for specific details.
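To make the semantics concrete, here is a minimal sketch that mirrors the four outputs of the ONNX Unique operator for a 1-D input with sorted output. Which of these outputs CustomUnique actually exposes is not stated here, so treat the function name and return layout as assumptions.

```python
from typing import List, Sequence, Tuple


def unique_reference(
    values: Sequence[int],
) -> Tuple[List[int], List[int], List[int], List[int]]:
    """Return (unique values, first-occurrence indices, inverse indices, counts).

    Illustrative 1-D version with the unique values sorted in ascending order.
    """
    uniques = sorted(set(values))
    slot = {value: i for i, value in enumerate(uniques)}
    first_index = [-1] * len(uniques)
    counts = [0] * len(uniques)
    inverse = []
    for i, value in enumerate(values):
        s = slot[value]
        if first_index[s] < 0:
            first_index[s] = i  # remember where each unique value first appears
        counts[s] += 1
        inverse.append(s)       # position of this element in `uniques`
    return uniques, first_index, inverse, counts


if __name__ == "__main__":
    print(unique_reference([2, 1, 1, 3, 4, 3]))
```

The inverse indices allow the original input to be rebuilt as `[uniques[i] for i in inverse]`.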
Multi-Scale Deformable Attention
The MultiScaleDeformableAttentionPlugin implements the multi-scale deformable attention mechanism introduced in Deformable DETR. This operation is crucial for vision transformers that need to attend to multiple scales and spatial locations efficiently.
Key features:
- Supports multi-scale feature maps with different resolutions
- Enables learning of sampling offsets and attention weights
- Optimized CUDA implementation for efficient GPU execution
- Supports both FP32 and FP16 precision
Inputs:
- value: Feature maps at different scales (B, L, M, D)
- spatial_shapes: Spatial dimensions of each scale (N, 2)
- level_start_index: Starting indices for each scale (N,)
- sampling_loc: Learned sampling locations (B, Q, M, L, P, 2)
- attn_weight: Learned attention weights (B, Q, M, L, P)
Output:
- Attended features (B, Q, M*D)
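In Deformable DETR, the per-level feature maps are flattened and concatenated along one axis, so level_start_index is the exclusive cumulative sum of height times width over spatial_shapes. The sketch below shows that relationship, assuming the plugin follows the same convention. The helper name and the example shapes are made up for illustration.

```python
from typing import List, Sequence, Tuple


def level_start_index_from_shapes(
    spatial_shapes: Sequence[Tuple[int, int]],
) -> Tuple[List[int], int]:
    """Return (start offset of each level, total number of flattened positions).

    Each entry of `spatial_shapes` is (height, width) for one feature level.
    """
    starts = []
    offset = 0
    for height, width in spatial_shapes:
        starts.append(offset)     # this level begins where the previous one ended
        offset += height * width  # a level contributes height * width positions
    return starts, offset


if __name__ == "__main__":
    shapes = [(4, 4), (2, 2), (1, 1)]
    starts, total = level_start_index_from_shapes(shapes)
    print(starts, total)
```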
Rotate
The RotatePlugin provides efficient image rotation functionality with support for different interpolation methods. This is useful for data augmentation and geometric transformations in perception pipelines.
Key features:
- Supports bilinear and nearest neighbor interpolation
- Arbitrary rotation angles around a specified center point
- Optimized CUDA kernels for both FP32 and FP16 precision
- Handles boundary conditions properly
Inputs:
- input: Input image tensor (C, H, W)
- angle: Rotation angle in degrees (scalar)
- center: Center of rotation (2,)
Output:
- Rotated image with same dimensions as input
Parameters:
- interpolation: Interpolation mode (0 = bilinear, 1 = nearest)
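The nearest-neighbor mode can be illustrated with a small pure-Python sketch that inverse-maps each output pixel to a source pixel. It handles a single channel, fills out-of-bounds pixels with zero, and assumes a particular rotation direction and an (x, y) ordering for the center. These conventions and the function name are assumptions and may differ from the plugin.

```python
import math
from typing import List, Sequence, Tuple


def rotate_nearest_reference(
    image: Sequence[Sequence[float]],  # (H, W) single-channel image
    angle_deg: float,
    center: Tuple[float, float],       # (cx, cy) in pixel coordinates (assumed order)
) -> List[List[float]]:
    """Rotate `image` about `center` using nearest-neighbor sampling."""
    height, width = len(image), len(image[0])
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse mapping: find the source location for this output pixel.
            src_x = cx + cos_t * dx + sin_t * dy
            src_y = cy - sin_t * dx + cos_t * dy
            sx = math.floor(src_x + 0.5)  # round to the nearest pixel
            sy = math.floor(src_y + 0.5)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]  # pixels mapped from outside stay 0
    return out


if __name__ == "__main__":
    img = [[1, 2], [3, 4]]
    print(rotate_nearest_reference(img, 180.0, (0.5, 0.5)))
```

Bilinear mode would replace the rounding step with a weighted average of the four neighboring source pixels.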
Select and Pad
The SelectAndPadPlugin enables conditional selection and padding of tensor elements based on flags. This is particularly useful for dynamic batching scenarios where sequences have variable lengths.
Key features:
- Efficiently selects valid elements based on boolean flags
- Pads output to a fixed size with invalid tokens
- Uses CUB library for optimized GPU selection operations
- Supports both FP32 and FP16 precision
Inputs:
- feat: Input features (B, Q, C)
- flags: Selection flags indicating valid elements (Q,)
- invalid: Padding value for invalid positions (C,)
Output:
- Selected and padded features (B, P, C)
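The select-then-pad behavior described above can be sketched on nested Python lists as follows. The function name and the `padded_len` argument (standing in for P) are hypothetical, and truncating when more than P elements are flagged is an assumption of this sketch, not documented plugin behavior.

```python
from typing import List, Sequence


def select_and_pad_reference(
    feat: Sequence[Sequence[Sequence[float]]],  # (B, Q, C) input features
    flags: Sequence[bool],                      # (Q,) True for valid queries
    invalid: Sequence[float],                   # (C,) padding vector
    padded_len: int,                            # P, the fixed output length
) -> List[List[List[float]]]:
    """Keep the flagged rows of each batch item and pad up to `padded_len`."""
    output = []
    for batch_item in feat:
        selected = [list(row) for row, keep in zip(batch_item, flags) if keep]
        selected = selected[:padded_len]  # assumption: extra rows are truncated
        while len(selected) < padded_len:
            selected.append(list(invalid))  # fill the rest with the invalid token
        output.append(selected)
    return output


if __name__ == "__main__":
    feat = [[[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]]  # B=1, Q=3, C=2
    flags = [True, False, True]
    print(select_and_pad_reference(feat, flags, invalid=[-1.0, -1.0], padded_len=4))
```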
Changelog for package autoware_tensorrt_plugins
0.47.0 (2025-08-11)
- feat(autoware_tensorrt_plugins): add vad trt plugins suppport (#11092)
  - feat(tensorrt_plugins): add multi_scale_deformable_attention, rotate, and select_and_pad plugins
    Add three new TensorRT plugins to support advanced vision model operations:
    - MultiScaleDeformableAttentionPlugin: Implements multi-scale deformable attention mechanism for vision transformers with CUDA kernels for efficient GPU execution
    - RotatePlugin: Provides image rotation functionality with support for both bilinear and nearest neighbor interpolation modes
    - SelectAndPadPlugin: Enables conditional selection and padding of tensor elements based on input flags, useful for dynamic batching scenarios
    Key changes:
    - Migrate plugins from IPluginV2DynamicExt to IPluginV3 interface
    - Add CUDA kernel implementations in separate ops subdirectories
    - Update plugin registration to include new creators (count: 8 -> 11)
    - Fix build issues by using SHARED libraries for CUDA ops
    - Add proper namespace organization (autoware::tensorrt_plugins)
    The plugins are designed to integrate seamlessly with the existing Autoware TensorRT framework and support both FP32 and FP16 precision.
  - refactor(tensorrt_plugins): reorganize ops directories and fix naming conventions
    - Move *_ops directories from include/autoware/tensorrt_plugins to include/autoware
    - Rename rotateKernel.{h,cu} to rotate_kernel.{h,cu} following snake_case convention
    - Update all include paths to reflect new directory structure
    - Add missing copyright header to ms_deform_attn_kernel.hpp
    - Update CMakeLists.txt to reference renamed source files
    - Update header guards to match new directory structure
- build: fix missing tensorrt_cmake_module dependency (#10984)
- Contributors: Bingo, Esteve Fernandez
0.46.0 (2025-06-20)
- Merge remote-tracking branch 'upstream/main' into tmp/TaikiYamada/bump_version_base
- fix(cmake): update spconv availability messages to use STATUS and WAR… (#10690) fix(cmake): update spconv availability messages to use STATUS and WARNING
- Contributors: TaikiYamada4, Yukihiro Saito
0.45.0 (2025-05-22)
- Merge remote-tracking branch 'origin/main' into tmp/notbot/bump_version_base
- chore: perception code owner update (#10645)
  - chore: update maintainers in multiple perception packages
  - Revert "chore: update maintainers in multiple perception packages" This reverts commit f2838c33d6cd82bd032039e2a12b9cb8ba6eb584.
  - chore: update maintainers in multiple perception packages
  - chore: add Kok Seang Tan as maintainer in multiple perception packages
- chore(autoware_tensorrt_plugins): update maintainer (#10627)
  - chore(autoware_tensorrt_plugins): update maintainer
  - chore(autoware_tensorrt_plugins): update maintainer
- Contributors: Amadeusz Szymko, Taekjin LEE, TaikiYamada4
0.44.2 (2025-06-10)
0.44.1 (2025-05-01)
0.44.0 (2025-04-18)
- chore: match all package versions
- Merge remote-tracking branch 'origin/main' into humble
Package Dependencies
Name | Deps |
---|---|
ament_cmake_auto | |
autoware_cmake | |
tensorrt_cmake_module | |
ament_cmake_ros | |
ament_lint_auto | |
autoware_lint_common | |
autoware_cuda_dependency_meta | |
autoware_cuda_utils | |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
autoware_bevfusion | |
autoware_diffusion_planner |
Launch files
Messages
Services
Plugins
Package Summary
Tags | No category tags. |
Version | 0.47.0 |
License | Apache License 2.0 |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Description | |
Checkout URI | https://github.com/autowarefoundation/autoware_universe.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-08-16 |
Dev Status | UNKNOWN |
Released | UNRELEASED |
Tags | planner ros calibration self-driving-car autonomous-driving autonomous-vehicles ros2 3d-map autoware |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Kenzo Lobos-Tsunekawa
- Amadeusz Szymko
- Kotaro Uetake
- Masato Saeki
Authors
autoware_tensorrt_plugins
Purpose
The autoware_tensorrt_plugins
package extends the operations available in TensorRT via plugins.
Algorithms
The following operations are implemented:
Sparse convolutions
We provide a wrapper for spconv (please see the correspondent package for details about the algorithms involved).
This requires the installation of spconv_cpp
which is automatically installed in autoware’s setup script. If needed, the user can also build and install it using the repository’s instructions.
Argsort
We provide an implementation of the Argsort operation as a plugin, since the TensorRT TopK implementation is limited in the number of elements it can handle.
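As a reference for what an argsort computes, here is a minimal pure-Python sketch. It illustrates the operation's semantics only; the function name, the `descending` flag, and the stable tie-breaking are assumptions made for this example and are not taken from the plugin.

```python
def argsort(values, descending=False):
    """Return the indices that would sort `values`.

    Ties keep their original relative order (Python's sort is stable).
    This is an illustrative reference, not the plugin's CUDA code.
    """
    return sorted(range(len(values)), key=lambda i: values[i], reverse=descending)


scores = [0.2, 0.9, 0.5]
ascending_order = argsort(scores)                     # indices of smallest to largest
descending_order = argsort(scores, descending=True)   # indices of largest to smallest
```

Unlike a top-k query, the result always has one index per input element, so no element-count limit is built into the definition.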
BEV Pool
We provide a wrapper for the bev_pool operation presented in BEVFusion. Please refer to the original paper for specific details.
Scatter operations
We provide a wrapper for the segment_csr operation presented in torch_scatter. Please refer to the original code for specific details.
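For readers unfamiliar with segment_csr, the following pure-Python sketch shows its sum reduction on a 1-D input: segment i covers the half-open range indptr[i] to indptr[i + 1]. The function name is hypothetical, and which reductions the plugin supports is not stated here, so treat this as an illustration of the CSR segment layout only.

```python
def segment_csr_sum(src, indptr):
    """Sum src over consecutive segments given by CSR-style offsets.

    Segment i is src[indptr[i]:indptr[i + 1]]; an empty segment sums to 0.
    Illustrative reference only, restricted to 1-D inputs and the sum reduction.
    """
    return [sum(src[indptr[i]:indptr[i + 1]]) for i in range(len(indptr) - 1)]


src = [1, 2, 3, 4, 5, 6]
indptr = [0, 2, 5, 6]          # three segments: [1, 2], [3, 4, 5], [6]
segment_sums = segment_csr_sum(src, indptr)
```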
Unique
While ONNX supports the unique operation, TensorRT does not provide an implementation. For this reason we implement Unique as CustomUnique to avoid name clashes.
The implementation mostly follows the torch_scatter implementation. Please refer to the original code for specific details.
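The sketch below shows, in pure Python, the kind of result a unique operation produces: the sorted distinct values, an inverse index that maps every input element to its distinct value, and per-value counts. Which of these outputs CustomUnique actually exposes is not stated here; the function name and return layout are assumptions for illustration.

```python
def unique_with_inverse(values):
    """Return (uniques, inverse, counts) for a 1-D sequence.

    uniques: sorted distinct values
    inverse: for each input element, its index into uniques
    counts:  number of occurrences of each unique value
    """
    uniques = sorted(set(values))
    position = {value: index for index, value in enumerate(uniques)}
    inverse = [position[value] for value in values]
    counts = [0] * len(uniques)
    for index in inverse:
        counts[index] += 1
    return uniques, inverse, counts


uniques, inverse, counts = unique_with_inverse([3, 1, 3, 2, 1])
```

The input can be rebuilt from the outputs with `[uniques[i] for i in inverse]`.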
Multi-Scale Deformable Attention
The MultiScaleDeformableAttentionPlugin implements the multi-scale deformable attention mechanism introduced in Deformable DETR. This operation is crucial for vision transformers that need to attend to multiple scales and spatial locations efficiently.
Key features:
- Supports multi-scale feature maps with different resolutions
- Enables learning of sampling offsets and attention weights
- Optimized CUDA implementation for efficient GPU execution
- Supports both FP32 and FP16 precision
Inputs:
- value: Feature maps at different scales (B, L, M, D)
- spatial_shapes: Spatial dimensions of each scale (N, 2)
- level_start_index: Starting indices for each scale (N,)
- sampling_loc: Learned sampling locations (B, Q, M, L, P, 2)
- attn_weight: Learned attention weights (B, Q, M, L, P)
Output:
- Attended features (B, Q, M*D)
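To show how spatial_shapes and level_start_index relate, the sketch below assumes the Deformable DETR convention: the feature maps of all scales are flattened and concatenated, and each start index is the offset of that scale within the concatenation. The helper name and the (height, width) ordering are assumptions for this example.

```python
def level_start_index(spatial_shapes):
    """Offsets of each scale in the flattened, concatenated feature sequence.

    spatial_shapes is a list of (height, width) pairs, one per scale.
    Assumes the Deformable DETR layout; illustrative only.
    """
    starts = []
    offset = 0
    for height, width in spatial_shapes:
        starts.append(offset)
        offset += height * width
    return starts


shapes = [(4, 4), (2, 2), (1, 1)]                # three scales
starts = level_start_index(shapes)               # offset of each scale
total_length = sum(h * w for h, w in shapes)     # length of the flattened sequence
```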
Rotate
The RotatePlugin provides efficient image rotation functionality with support for different interpolation methods. This is useful for data augmentation and geometric transformations in perception pipelines.
Key features:
- Supports bilinear and nearest neighbor interpolation
- Arbitrary rotation angles around a specified center point
- Optimized CUDA kernels for both FP32 and FP16 precision
- Handles boundary conditions properly
Inputs:
- input: Input image tensor (C, H, W)
- angle: Rotation angle in degrees (scalar)
- center: Center of rotation (2,)
Output:
- Rotated image with same dimensions as input
Parameters:
- interpolation: Interpolation mode (0 = bilinear, 1 = nearest)
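A rotation with nearest-neighbor interpolation can be sketched in pure Python using inverse mapping: for every output pixel, rotate its coordinate back around the center and copy the nearest source pixel. This is a single-channel illustration only. The (x, y) ordering of the center, the rotation direction, and the zero fill for out-of-range samples are assumptions, not the plugin's documented behavior.

```python
import math


def rotate_nearest(image, angle_deg, center, fill=0.0):
    """Rotate a single-channel H x W image (list of rows) around center=(cx, cy).

    Uses inverse mapping with nearest-neighbor sampling; pixels whose source
    falls outside the image receive `fill`. Illustrative reference only.
    """
    height, width = len(image), len(image[0])
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Map the output coordinate back into the source image.
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
unchanged = rotate_nearest(image, 0.0, (1, 1))
half_turn = rotate_nearest(image, 180.0, (1, 1))
```

A half turn around the image center is independent of the rotation direction, which makes it a convenient sanity check.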
Select and Pad
The SelectAndPadPlugin enables conditional selection and padding of tensor elements based on flags. This is particularly useful for dynamic batching scenarios where sequences have variable lengths.
Key features:
- Efficiently selects valid elements based on boolean flags
- Pads output to a fixed size with invalid tokens
- Uses CUB library for optimized GPU selection operations
- Supports both FP32 and FP16 precision
Inputs:
- feat: Input features (B, Q, C)
- flags: Selection flags indicating valid elements (Q,)
- invalid: Padding value for invalid positions (C,)
Output:
- Selected and padded features (B, P, C)
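The select-then-pad behavior can be illustrated for a single batch item in pure Python: keep the rows of feat whose flag is set, then append copies of the invalid vector until the output has P rows. The function name, the pad_size parameter, and the truncation when more than P rows are selected are assumptions made for this sketch.

```python
def select_and_pad(feat, flags, invalid, pad_size):
    """Select flagged rows of feat (Q x C) and pad to pad_size rows with `invalid`.

    Rows keep their original order. If more than pad_size rows are flagged,
    the extra rows are dropped (an assumption of this sketch).
    """
    selected = [list(row) for row, keep in zip(feat, flags) if keep][:pad_size]
    padding = [list(invalid) for _ in range(pad_size - len(selected))]
    return selected + padding


feat = [[1, 1], [2, 2], [3, 3], [4, 4]]     # Q = 4 queries, C = 2 channels
flags = [True, False, True, False]
padded = select_and_pad(feat, flags, invalid=[0, 0], pad_size=3)
```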
Changelog for package autoware_tensorrt_plugins
0.47.0 (2025-08-11)
- feat(autoware_tensorrt_plugins): add vad trt plugins suppport (#11092)
  - feat(tensorrt_plugins): add multi_scale_deformable_attention, rotate, and select_and_pad plugins
    Add three new TensorRT plugins to support advanced vision model operations:
    - MultiScaleDeformableAttentionPlugin: Implements multi-scale deformable attention mechanism for vision transformers with CUDA kernels for efficient GPU execution
    - RotatePlugin: Provides image rotation functionality with support for both bilinear and nearest neighbor interpolation modes
    - SelectAndPadPlugin: Enables conditional selection and padding of tensor elements based on input flags, useful for dynamic batching scenarios
    Key changes:
    - Migrate plugins from IPluginV2DynamicExt to IPluginV3 interface
    - Add CUDA kernel implementations in separate ops subdirectories
    - Update plugin registration to include new creators (count: 8 -> 11)
    - Fix build issues by using SHARED libraries for CUDA ops
    - Add proper namespace organization (autoware::tensorrt_plugins)
    The plugins are designed to integrate seamlessly with the existing Autoware TensorRT framework and support both FP32 and FP16 precision.
  - refactor(tensorrt_plugins): reorganize ops directories and fix naming conventions
    - Move *_ops directories from include/autoware/tensorrt_plugins to include/autoware
    - Rename rotateKernel.{h,cu} to rotate_kernel.{h,cu} following snake_case convention
    - Update all include paths to reflect new directory structure
    - Add missing copyright header to ms_deform_attn_kernel.hpp
    - Update CMakeLists.txt to reference renamed source files
    - Update header guards to match new directory structure
- build: fix missing tensorrt_cmake_module dependency (#10984)
- Contributors: Bingo, Esteve Fernandez
0.46.0 (2025-06-20)
- Merge remote-tracking branch 'upstream/main' into tmp/TaikiYamada/bump_version_base
- fix(cmake): update spconv availability messages to use STATUS and WAR… (#10690) fix(cmake): update spconv availability messages to use STATUS and WARNING
- Contributors: TaikiYamada4, Yukihiro Saito
0.45.0 (2025-05-22)
- Merge remote-tracking branch 'origin/main' into tmp/notbot/bump_version_base
- chore: perception code owner update (#10645)
  - chore: update maintainers in multiple perception packages
  - Revert "chore: update maintainers in multiple perception packages" This reverts commit f2838c33d6cd82bd032039e2a12b9cb8ba6eb584.
  - chore: update maintainers in multiple perception packages
  - chore: add Kok Seang Tan as maintainer in multiple perception packages
- chore(autoware_tensorrt_plugins): update maintainer (#10627)
  - chore(autoware_tensorrt_plugins): update maintainer
  - chore(autoware_tensorrt_plugins): update maintainer
- Contributors: Amadeusz Szymko, Taekjin LEE, TaikiYamada4
0.44.2 (2025-06-10)
0.44.1 (2025-05-01)
0.44.0 (2025-04-18)
- chore: match all package versions
- Merge remote-tracking branch 'origin/main' into humble
Package Dependencies
- ament_cmake_auto
- autoware_cmake
- tensorrt_cmake_module
- ament_cmake_ros
- ament_lint_auto
- autoware_lint_common
- autoware_cuda_dependency_meta
- autoware_cuda_utils
Dependant Packages
- autoware_bevfusion
- autoware_diffusion_planner
Package Summary
Tags | No category tags. |
Version | 0.47.0 |
License | Apache License 2.0 |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Description | |
Checkout URI | https://github.com/autowarefoundation/autoware_universe.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-08-16 |
Dev Status | UNKNOWN |
Released | UNRELEASED |
Tags | planner ros calibration self-driving-car autonomous-driving autonomous-vehicles ros2 3d-map autoware |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Kenzo Lobos-Tsunekawa
- Amadeusz Szymko
- Kotaro Uetake
- Masato Saeki
Authors
autoware_tensorrt_plugins
Purpose
The autoware_tensorrt_plugins
package extends the operations available in TensorRT via plugins.
Algorithms
The following operations are implemented:
Sparse convolutions
We provide a wrapper for spconv (please see the correspondent package for details about the algorithms involved).
This requires the installation of spconv_cpp
which is automatically installed in autoware’s setup script. If needed, the user can also build and install it using the repository’s instructions.
Argsort
We provide an implementation for the Argsort
operation as a plugin since the TopK
TensorRT implementation has limitations in the size of elements it can handle.
BEV Pool
We provide a wrapper for the bev_pool
operation presented in BEVFusion. Please refer to the original paper for specific details.
Scatter operations
We provide a wrapper for the segment_csr
operation presented in torch_scatter. Please refer to the original code for specific details.
Unique
While ONNX supports the unique operation, TensorRT does not provide an implementation. For this reason we implement Unique
as CustomUnique
to avoid name classes.
The implementation mostly follows torch_scatter implementation. Please refer to the original code for specific details.
Multi-Scale Deformable Attention
The MultiScaleDeformableAttentionPlugin
implements the multi-scale deformable attention mechanism introduced in Deformable DETR. This operation is crucial for vision transformers that need to attend to multiple scales and spatial locations efficiently.
Key features:
- Supports multi-scale feature maps with different resolutions
- Enables learning of sampling offsets and attention weights
- Optimized CUDA implementation for efficient GPU execution
- Supports both FP32 and FP16 precision
Inputs:
-
value
: Feature maps at different scales (B, L, M, D) -
spatial_shapes
: Spatial dimensions of each scale (N, 2) -
level_start_index
: Starting indices for each scale (N,) -
sampling_loc
: Learned sampling locations (B, Q, M, L, P, 2) -
attn_weight
: Learned attention weights (B, Q, M, L, P)
Output:
- Attended features (B, Q, M*D)
Rotate
The RotatePlugin
provides efficient image rotation functionality with support for different interpolation methods. This is useful for data augmentation and geometric transformations in perception pipelines.
Key features:
- Supports bilinear and nearest neighbor interpolation
- Arbitrary rotation angles around a specified center point
- Optimized CUDA kernels for both FP32 and FP16 precision
- Handles boundary conditions properly
Inputs:
-
input
: Input image tensor (C, H, W) -
angle
: Rotation angle in degrees (scalar) -
center
: Center of rotation (2,)
Output:
- Rotated image with same dimensions as input
Parameters:
-
interpolation
: Interpolation mode (0 = bilinear, 1 = nearest)
Select and Pad
The SelectAndPadPlugin
enables conditional selection and padding of tensor elements based on flags. This is particularly useful for dynamic batching scenarios where sequences have variable lengths.
Key features:
- Efficiently selects valid elements based on boolean flags
- Pads output to a fixed size with invalid tokens
- Uses CUB library for optimized GPU selection operations
- Supports both FP32 and FP16 precision
Inputs:
-
feat
: Input features (B, Q, C) -
flags
: Selection flags indicating valid elements (Q,) -
invalid
: Padding value for invalid positions (C,)
Output:
- Selected and padded features (B, P, C)
File truncated at 100 lines see the full file
Changelog for package autoware_tensorrt_plugins
0.47.0 (2025-08-11)
-
feat(autoware_tensorrt_plugins): add vad trt plugins suppport (#11092)
* feat(tensorrt_plugins): add multi_scale_deformable_attention, rotate, and select_and_pad plugins Add three new TensorRT plugins to support advanced vision model operations:
- MultiScaleDeformableAttentionPlugin: Implements multi-scale deformable attention mechanism for vision transformers with CUDA kernels for efficient GPU execution
- RotatePlugin: Provides image rotation functionality with support for both bilinear and nearest neighbor interpolation modes
- SelectAndPadPlugin: Enables conditional selection and padding of tensor elements based on input flags, useful for dynamic batching scenarios Key changes:
- Migrate plugins from IPluginV2DynamicExt to IPluginV3 interface
- Add CUDA kernel implementations in separate ops subdirectories
- Update plugin registration to include new creators (count: 8 -> 11)
- Fix build issues by using SHARED libraries for CUDA ops
- Add proper namespace organization (autoware::tensorrt_plugins) The plugins are designed to integrate seamlessly with the existing Autoware TensorRT framework and support both FP32 and FP16 precision.
- refactor(tensorrt_plugins): reorganize ops directories and fix naming conventions
- Move *_ops directories from include/autoware/tensorrt_plugins to include/autoware
- Rename rotateKernel.{h,cu} to rotate_kernel.{h,cu} following snake_case convention
- Update all include paths to reflect new directory structure
- Add missing copyright header to ms_deform_attn_kernel.hpp
- Update CMakeLists.txt to reference renamed source files
- Update header guards to match new directory structure ---------
-
build: fix missing tensorrt_cmake_module dependency (#10984)
-
Contributors: Bingo, Esteve Fernandez
0.46.0 (2025-06-20)
- Merge remote-tracking branch 'upstream/main' into tmp/TaikiYamada/bump_version_base
- fix(cmake): update spconv availability messages to use STATUS and WAR… (#10690) fix(cmake): update spconv availability messages to use STATUS and WARNING
- Contributors: TaikiYamada4, Yukihiro Saito
0.45.0 (2025-05-22)
-
Merge remote-tracking branch 'origin/main' into tmp/notbot/bump_version_base
-
chore: perception code owner update (#10645)
- chore: update maintainers in multiple perception packages
* Revert "chore: update maintainers in multiple perception packages" This reverts commit f2838c33d6cd82bd032039e2a12b9cb8ba6eb584.
- chore: update maintainers in multiple perception packages
* chore: add Kok Seang Tan as maintainer in multiple perception packages ---------
-
chore(autoware_tensorrt_plugins): update maintainer (#10627)
- chore(autoware_tensorrt_plugins): update maintainer
* chore(autoware_tensorrt_plugins): update maintainer ---------
-
Contributors: Amadeusz Szymko, Taekjin LEE, TaikiYamada4
0.44.2 (2025-06-10)
0.44.1 (2025-05-01)
0.44.0 (2025-04-18)
-
chore: match all package versions
-
Merge remote-tracking branch 'origin/main' into humble
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake_auto | |
autoware_cmake | |
tensorrt_cmake_module | |
ament_cmake_ros | |
ament_lint_auto | |
autoware_lint_common | |
autoware_cuda_dependency_meta | |
autoware_cuda_utils |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
autoware_bevfusion | |
autoware_diffusion_planner |
Launch files
Messages
Services
Plugins
Recent questions tagged autoware_tensorrt_plugins at Robotics Stack Exchange
Package Summary
Tags | No category tags. |
Version | 0.47.0 |
License | Apache License 2.0 |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Description | |
Checkout URI | https://github.com/autowarefoundation/autoware_universe.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-08-16 |
Dev Status | UNKNOWN |
Released | UNRELEASED |
Tags | planner ros calibration self-driving-car autonomous-driving autonomous-vehicles ros2 3d-map autoware |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Kenzo Lobos-Tsunekawa
- Amadeusz Szymko
- Kotaro Uetake
- Masato Saeki
Authors
autoware_tensorrt_plugins
Purpose
The autoware_tensorrt_plugins
package extends the operations available in TensorRT via plugins.
Algorithms
The following operations are implemented:
Sparse convolutions
We provide a wrapper for spconv (please see the correspondent package for details about the algorithms involved).
This requires the installation of spconv_cpp
which is automatically installed in autoware’s setup script. If needed, the user can also build and install it using the repository’s instructions.
Argsort
We provide an implementation for the Argsort
operation as a plugin since the TopK
TensorRT implementation has limitations in the size of elements it can handle.
BEV Pool
We provide a wrapper for the bev_pool
operation presented in BEVFusion. Please refer to the original paper for specific details.
Scatter operations
We provide a wrapper for the segment_csr
operation presented in torch_scatter. Please refer to the original code for specific details.
Unique
While ONNX supports the unique operation, TensorRT does not provide an implementation. For this reason we implement Unique
as CustomUnique
to avoid name classes.
The implementation mostly follows torch_scatter implementation. Please refer to the original code for specific details.
Multi-Scale Deformable Attention
The MultiScaleDeformableAttentionPlugin
implements the multi-scale deformable attention mechanism introduced in Deformable DETR. This operation is crucial for vision transformers that need to attend to multiple scales and spatial locations efficiently.
Key features:
- Supports multi-scale feature maps with different resolutions
- Enables learning of sampling offsets and attention weights
- Optimized CUDA implementation for efficient GPU execution
- Supports both FP32 and FP16 precision
Inputs:
-
value
: Feature maps at different scales (B, L, M, D) -
spatial_shapes
: Spatial dimensions of each scale (N, 2) -
level_start_index
: Starting indices for each scale (N,) -
sampling_loc
: Learned sampling locations (B, Q, M, L, P, 2) -
attn_weight
: Learned attention weights (B, Q, M, L, P)
Output:
- Attended features (B, Q, M*D)
Rotate
The RotatePlugin
provides efficient image rotation functionality with support for different interpolation methods. This is useful for data augmentation and geometric transformations in perception pipelines.
Key features:
- Supports bilinear and nearest neighbor interpolation
- Arbitrary rotation angles around a specified center point
- Optimized CUDA kernels for both FP32 and FP16 precision
- Handles boundary conditions properly
Inputs:
-
input
: Input image tensor (C, H, W) -
angle
: Rotation angle in degrees (scalar) -
center
: Center of rotation (2,)
Output:
- Rotated image with same dimensions as input
Parameters:
-
interpolation
: Interpolation mode (0 = bilinear, 1 = nearest)
Select and Pad
The SelectAndPadPlugin
enables conditional selection and padding of tensor elements based on flags. This is particularly useful for dynamic batching scenarios where sequences have variable lengths.
Key features:
- Efficiently selects valid elements based on boolean flags
- Pads output to a fixed size with invalid tokens
- Uses CUB library for optimized GPU selection operations
- Supports both FP32 and FP16 precision
Inputs:
-
feat
: Input features (B, Q, C) -
flags
: Selection flags indicating valid elements (Q,) -
invalid
: Padding value for invalid positions (C,)
Output:
- Selected and padded features (B, P, C)
File truncated at 100 lines see the full file
Changelog for package autoware_tensorrt_plugins
0.47.0 (2025-08-11)
-
feat(autoware_tensorrt_plugins): add vad trt plugins suppport (#11092)
* feat(tensorrt_plugins): add multi_scale_deformable_attention, rotate, and select_and_pad plugins Add three new TensorRT plugins to support advanced vision model operations:
- MultiScaleDeformableAttentionPlugin: Implements multi-scale deformable attention mechanism for vision transformers with CUDA kernels for efficient GPU execution
- RotatePlugin: Provides image rotation functionality with support for both bilinear and nearest neighbor interpolation modes
- SelectAndPadPlugin: Enables conditional selection and padding of tensor elements based on input flags, useful for dynamic batching scenarios Key changes:
- Migrate plugins from IPluginV2DynamicExt to IPluginV3 interface
- Add CUDA kernel implementations in separate ops subdirectories
- Update plugin registration to include new creators (count: 8 -> 11)
- Fix build issues by using SHARED libraries for CUDA ops
- Add proper namespace organization (autoware::tensorrt_plugins) The plugins are designed to integrate seamlessly with the existing Autoware TensorRT framework and support both FP32 and FP16 precision.
- refactor(tensorrt_plugins): reorganize ops directories and fix naming conventions
- Move *_ops directories from include/autoware/tensorrt_plugins to include/autoware
- Rename rotateKernel.{h,cu} to rotate_kernel.{h,cu} following snake_case convention
- Update all include paths to reflect new directory structure
- Add missing copyright header to ms_deform_attn_kernel.hpp
- Update CMakeLists.txt to reference renamed source files
- Update header guards to match new directory structure ---------
-
build: fix missing tensorrt_cmake_module dependency (#10984)
-
Contributors: Bingo, Esteve Fernandez
0.46.0 (2025-06-20)
- Merge remote-tracking branch 'upstream/main' into tmp/TaikiYamada/bump_version_base
- fix(cmake): update spconv availability messages to use STATUS and WAR… (#10690) fix(cmake): update spconv availability messages to use STATUS and WARNING
- Contributors: TaikiYamada4, Yukihiro Saito
0.45.0 (2025-05-22)
-
Merge remote-tracking branch 'origin/main' into tmp/notbot/bump_version_base
-
chore: perception code owner update (#10645)
- chore: update maintainers in multiple perception packages
* Revert "chore: update maintainers in multiple perception packages" This reverts commit f2838c33d6cd82bd032039e2a12b9cb8ba6eb584.
- chore: update maintainers in multiple perception packages
* chore: add Kok Seang Tan as maintainer in multiple perception packages ---------
-
chore(autoware_tensorrt_plugins): update maintainer (#10627)
- chore(autoware_tensorrt_plugins): update maintainer
* chore(autoware_tensorrt_plugins): update maintainer ---------
-
Contributors: Amadeusz Szymko, Taekjin LEE, TaikiYamada4
0.44.2 (2025-06-10)
0.44.1 (2025-05-01)
0.44.0 (2025-04-18)
-
chore: match all package versions
-
Merge remote-tracking branch 'origin/main' into humble
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake_auto | |
autoware_cmake | |
tensorrt_cmake_module | |
ament_cmake_ros | |
ament_lint_auto | |
autoware_lint_common | |
autoware_cuda_dependency_meta | |
autoware_cuda_utils |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
autoware_bevfusion | |
autoware_diffusion_planner |
Launch files
Messages
Services
Plugins
Recent questions tagged autoware_tensorrt_plugins at Robotics Stack Exchange
Package Summary
Tags | No category tags. |
Version | 0.47.0 |
License | Apache License 2.0 |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Description | |
Checkout URI | https://github.com/autowarefoundation/autoware_universe.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-08-16 |
Dev Status | UNKNOWN |
Released | UNRELEASED |
Tags | planner ros calibration self-driving-car autonomous-driving autonomous-vehicles ros2 3d-map autoware |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Kenzo Lobos-Tsunekawa
- Amadeusz Szymko
- Kotaro Uetake
- Masato Saeki
Authors
autoware_tensorrt_plugins
Purpose
The autoware_tensorrt_plugins
package extends the operations available in TensorRT via plugins.
Algorithms
The following operations are implemented:
Sparse convolutions
We provide a wrapper for spconv (please see the correspondent package for details about the algorithms involved).
This requires the installation of spconv_cpp
which is automatically installed in autoware’s setup script. If needed, the user can also build and install it using the repository’s instructions.
Argsort
We provide an implementation for the Argsort
operation as a plugin since the TopK
TensorRT implementation has limitations in the size of elements it can handle.
BEV Pool
We provide a wrapper for the bev_pool
operation presented in BEVFusion. Please refer to the original paper for specific details.
Scatter operations
We provide a wrapper for the segment_csr
operation presented in torch_scatter. Please refer to the original code for specific details.
Unique
While ONNX supports the unique operation, TensorRT does not provide an implementation. For this reason we implement Unique
as CustomUnique
to avoid name classes.
The implementation mostly follows torch_scatter implementation. Please refer to the original code for specific details.
Multi-Scale Deformable Attention
The MultiScaleDeformableAttentionPlugin
implements the multi-scale deformable attention mechanism introduced in Deformable DETR. This operation is crucial for vision transformers that need to attend to multiple scales and spatial locations efficiently.
Key features:
- Supports multi-scale feature maps with different resolutions
- Enables learning of sampling offsets and attention weights
- Optimized CUDA implementation for efficient GPU execution
- Supports both FP32 and FP16 precision
Inputs:
-
value
: Feature maps at different scales (B, L, M, D) -
spatial_shapes
: Spatial dimensions of each scale (N, 2) -
level_start_index
: Starting indices for each scale (N,) -
sampling_loc
: Learned sampling locations (B, Q, M, L, P, 2) -
attn_weight
: Learned attention weights (B, Q, M, L, P)
Output:
- Attended features (B, Q, M*D)
Rotate
The RotatePlugin
provides efficient image rotation functionality with support for different interpolation methods. This is useful for data augmentation and geometric transformations in perception pipelines.
Key features:
- Supports bilinear and nearest neighbor interpolation
- Arbitrary rotation angles around a specified center point
- Optimized CUDA kernels for both FP32 and FP16 precision
- Handles boundary conditions properly
Inputs:
-
input
: Input image tensor (C, H, W) -
angle
: Rotation angle in degrees (scalar) -
center
: Center of rotation (2,)
Output:
- Rotated image with same dimensions as input
Parameters:
-
interpolation
: Interpolation mode (0 = bilinear, 1 = nearest)
Select and Pad
The SelectAndPadPlugin
enables conditional selection and padding of tensor elements based on flags. This is particularly useful for dynamic batching scenarios where sequences have variable lengths.
Key features:
- Efficiently selects valid elements based on boolean flags
- Pads output to a fixed size with invalid tokens
- Uses CUB library for optimized GPU selection operations
- Supports both FP32 and FP16 precision
Inputs:
-
feat
: Input features (B, Q, C) -
flags
: Selection flags indicating valid elements (Q,) -
invalid
: Padding value for invalid positions (C,)
Output:
- Selected and padded features (B, P, C)
File truncated at 100 lines see the full file
Changelog for package autoware_tensorrt_plugins
0.47.0 (2025-08-11)
-
feat(autoware_tensorrt_plugins): add vad trt plugins suppport (#11092)
* feat(tensorrt_plugins): add multi_scale_deformable_attention, rotate, and select_and_pad plugins Add three new TensorRT plugins to support advanced vision model operations:
- MultiScaleDeformableAttentionPlugin: Implements multi-scale deformable attention mechanism for vision transformers with CUDA kernels for efficient GPU execution
- RotatePlugin: Provides image rotation functionality with support for both bilinear and nearest neighbor interpolation modes
- SelectAndPadPlugin: Enables conditional selection and padding of tensor elements based on input flags, useful for dynamic batching scenarios Key changes:
- Migrate plugins from IPluginV2DynamicExt to IPluginV3 interface
- Add CUDA kernel implementations in separate ops subdirectories
- Update plugin registration to include new creators (count: 8 -> 11)
- Fix build issues by using SHARED libraries for CUDA ops
- Add proper namespace organization (autoware::tensorrt_plugins) The plugins are designed to integrate seamlessly with the existing Autoware TensorRT framework and support both FP32 and FP16 precision.
- refactor(tensorrt_plugins): reorganize ops directories and fix naming conventions
- Move *_ops directories from include/autoware/tensorrt_plugins to include/autoware
- Rename rotateKernel.{h,cu} to rotate_kernel.{h,cu} following snake_case convention
- Update all include paths to reflect new directory structure
- Add missing copyright header to ms_deform_attn_kernel.hpp
- Update CMakeLists.txt to reference renamed source files
- Update header guards to match new directory structure ---------
-
build: fix missing tensorrt_cmake_module dependency (#10984)
-
Contributors: Bingo, Esteve Fernandez
0.46.0 (2025-06-20)
- Merge remote-tracking branch 'upstream/main' into tmp/TaikiYamada/bump_version_base
- fix(cmake): update spconv availability messages to use STATUS and WAR… (#10690) fix(cmake): update spconv availability messages to use STATUS and WARNING
- Contributors: TaikiYamada4, Yukihiro Saito
0.45.0 (2025-05-22)
-
Merge remote-tracking branch 'origin/main' into tmp/notbot/bump_version_base
-
chore: perception code owner update (#10645)
- chore: update maintainers in multiple perception packages
* Revert "chore: update maintainers in multiple perception packages" This reverts commit f2838c33d6cd82bd032039e2a12b9cb8ba6eb584.
- chore: update maintainers in multiple perception packages
* chore: add Kok Seang Tan as maintainer in multiple perception packages ---------
-
chore(autoware_tensorrt_plugins): update maintainer (#10627)
- chore(autoware_tensorrt_plugins): update maintainer
* chore(autoware_tensorrt_plugins): update maintainer ---------
-
Contributors: Amadeusz Szymko, Taekjin LEE, TaikiYamada4
0.44.2 (2025-06-10)
0.44.1 (2025-05-01)
0.44.0 (2025-04-18)
-
chore: match all package versions
-
Merge remote-tracking branch 'origin/main' into humble
File truncated at 100 lines see the full file
Package Dependencies
Deps | Name |
---|---|
ament_cmake_auto | |
autoware_cmake | |
tensorrt_cmake_module | |
ament_cmake_ros | |
ament_lint_auto | |
autoware_lint_common | |
autoware_cuda_dependency_meta | |
autoware_cuda_utils |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
autoware_bevfusion | |
autoware_diffusion_planner |
Launch files
Messages
Services
Plugins
Recent questions tagged autoware_tensorrt_plugins at Robotics Stack Exchange
Package Summary
Tags | No category tags. |
Version | 0.47.0 |
License | Apache License 2.0 |
Build type | AMENT_CMAKE |
Use | RECOMMENDED |
Repository Summary
Description | |
Checkout URI | https://github.com/autowarefoundation/autoware_universe.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-08-16 |
Dev Status | UNKNOWN |
Released | UNRELEASED |
Tags | planner ros calibration self-driving-car autonomous-driving autonomous-vehicles ros2 3d-map autoware |
Contributing |
Help Wanted (-)
Good First Issues (-) Pull Requests to Review (-) |
Package Description
Additional Links
Maintainers
- Kenzo Lobos-Tsunekawa
- Amadeusz Szymko
- Kotaro Uetake
- Masato Saeki
Authors
autoware_tensorrt_plugins
Purpose
The autoware_tensorrt_plugins
package extends the operations available in TensorRT via plugins.
Algorithms
The following operations are implemented:
Sparse convolutions
We provide a wrapper for spconv (please see the correspondent package for details about the algorithms involved).
This requires the installation of spconv_cpp
which is automatically installed in autoware’s setup script. If needed, the user can also build and install it using the repository’s instructions.
Argsort
We provide an implementation for the Argsort
operation as a plugin since the TopK
TensorRT implementation has limitations in the size of elements it can handle.
BEV Pool
We provide a wrapper for the bev_pool
operation presented in BEVFusion. Please refer to the original paper for specific details.
Scatter operations
We provide a wrapper for the segment_csr
operation presented in torch_scatter. Please refer to the original code for specific details.
Unique
While ONNX supports the unique operation, TensorRT does not provide an implementation. For this reason we implement Unique
as CustomUnique
to avoid name classes.
The implementation mostly follows torch_scatter implementation. Please refer to the original code for specific details.
Multi-Scale Deformable Attention
The MultiScaleDeformableAttentionPlugin
implements the multi-scale deformable attention mechanism introduced in Deformable DETR. This operation is crucial for vision transformers that need to attend to multiple scales and spatial locations efficiently.
Key features:
- Supports multi-scale feature maps with different resolutions
- Enables learning of sampling offsets and attention weights
- Optimized CUDA implementation for efficient GPU execution
- Supports both FP32 and FP16 precision
Inputs:
- value: Feature maps at different scales (B, L, M, D)
- spatial_shapes: Spatial dimensions of each scale (N, 2)
- level_start_index: Starting indices for each scale (N,)
- sampling_loc: Learned sampling locations (B, Q, M, L, P, 2)
- attn_weight: Learned attention weights (B, Q, M, L, P)
Output:
- Attended features (B, Q, M*D)
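The relationship between spatial_shapes and level_start_index can be sketched as follows, assuming the feature maps of all scales are flattened and concatenated along one axis, as in the Deformable DETR reference implementation. The helper name is hypothetical, and the plugin itself expects these values as input tensors rather than computing them.

```python
def level_start_indices(spatial_shapes):
    """Return (start offset of each level, total flattened length).

    spatial_shapes: list of (height, width) pairs, one per scale.
    """
    starts = []
    offset = 0
    for height, width in spatial_shapes:
        starts.append(offset)
        offset += height * width  # each level contributes H * W positions
    return starts, offset


starts, total = level_start_indices([(4, 4), (2, 2), (1, 1)])
```

Here the three levels occupy positions 0 to 15, 16 to 19, and 20 of the flattened feature axis.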
Rotate
The RotatePlugin provides efficient image rotation functionality with support for different interpolation methods. This is useful for data augmentation and geometric transformations in perception pipelines.
Key features:
- Supports bilinear and nearest neighbor interpolation
- Arbitrary rotation angles around a specified center point
- Optimized CUDA kernels for both FP32 and FP16 precision
- Handles boundary conditions properly
Inputs:
- input: Input image tensor (C, H, W)
- angle: Rotation angle in degrees (scalar)
- center: Center of rotation (2,)
Output:
- Rotated image with same dimensions as input
Parameters:
- interpolation: Interpolation mode (0 = bilinear, 1 = nearest)
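To make the nearest-neighbor mode concrete, here is a small single-channel sketch that uses inverse mapping: each output pixel looks up the source pixel it came from. The sign convention (positive angles rotate counter-clockwise as displayed), the (x, y) order of the center, and the zero fill for out-of-bounds pixels are all assumptions of this sketch, since the source does not specify them.

```python
import math


def rotate_nearest(image, angle_deg, center, fill=0.0):
    """Rotate a 2-D image (list of rows) about `center` = (x, y)."""
    height, width = len(image), len(image[0])
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse mapping: find where this output pixel comes from.
            src_x = round(cos_t * dx - sin_t * dy + cx)
            src_y = round(sin_t * dx + cos_t * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
half_turn = rotate_nearest(image, 180.0, center=(1, 1))
quarter_turn = rotate_nearest(image, 90.0, center=(1, 1))
```

Bilinear mode would replace the rounding step with a weighted average of the four pixels surrounding the fractional source coordinate.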
Select and Pad
The SelectAndPadPlugin enables conditional selection and padding of tensor elements based on flags. This is particularly useful for dynamic batching scenarios where sequences have variable lengths.
Key features:
- Efficiently selects valid elements based on boolean flags
- Pads output to a fixed size with invalid tokens
- Uses CUB library for optimized GPU selection operations
- Supports both FP32 and FP16 precision
Inputs:
- feat: Input features (B, Q, C)
- flags: Selection flags indicating valid elements (Q,)
- invalid: Padding value for invalid positions (C,)
Output:
- Selected and padded features (B, P, C)
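The select-then-pad behavior described above can be sketched in plain Python using the shapes listed (feat: B x Q x C, flags: Q, invalid: C, output: B x P x C). The `pad_to` argument stands in for P and is hypothetical, and truncating when more than P elements are flagged is an assumption of this sketch rather than documented plugin behavior.

```python
def select_and_pad(feat, flags, invalid, pad_to):
    """Keep rows whose flag is set, then pad each batch to `pad_to` rows."""
    out = []
    for batch in feat:
        selected = [list(row) for row, keep in zip(batch, flags) if keep]
        selected = selected[:pad_to]  # assumption: extra rows are dropped
        padding = [list(invalid) for _ in range(pad_to - len(selected))]
        out.append(selected + padding)
    return out


result = select_and_pad(
    feat=[[[1, 1], [2, 2], [3, 3], [4, 4]]],
    flags=[1, 0, 1, 0],
    invalid=[-1, -1],
    pad_to=3,
)
```

The output has a fixed second dimension regardless of how many flags are set, which is what allows downstream layers to run with static shapes.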
Changelog for package autoware_tensorrt_plugins
0.47.0 (2025-08-11)
- feat(autoware_tensorrt_plugins): add vad trt plugins suppport (#11092)
  - feat(tensorrt_plugins): add multi_scale_deformable_attention, rotate, and select_and_pad plugins
    Add three new TensorRT plugins to support advanced vision model operations:
    - MultiScaleDeformableAttentionPlugin: Implements multi-scale deformable attention mechanism for vision transformers with CUDA kernels for efficient GPU execution
    - RotatePlugin: Provides image rotation functionality with support for both bilinear and nearest neighbor interpolation modes
    - SelectAndPadPlugin: Enables conditional selection and padding of tensor elements based on input flags, useful for dynamic batching scenarios
    Key changes:
    - Migrate plugins from IPluginV2DynamicExt to IPluginV3 interface
    - Add CUDA kernel implementations in separate ops subdirectories
    - Update plugin registration to include new creators (count: 8 -> 11)
    - Fix build issues by using SHARED libraries for CUDA ops
    - Add proper namespace organization (autoware::tensorrt_plugins)
    The plugins are designed to integrate seamlessly with the existing Autoware TensorRT framework and support both FP32 and FP16 precision.
  - refactor(tensorrt_plugins): reorganize ops directories and fix naming conventions
    - Move *_ops directories from include/autoware/tensorrt_plugins to include/autoware
    - Rename rotateKernel.{h,cu} to rotate_kernel.{h,cu} following snake_case convention
    - Update all include paths to reflect new directory structure
    - Add missing copyright header to ms_deform_attn_kernel.hpp
    - Update CMakeLists.txt to reference renamed source files
    - Update header guards to match new directory structure
- build: fix missing tensorrt_cmake_module dependency (#10984)
- Contributors: Bingo, Esteve Fernandez
0.46.0 (2025-06-20)
- Merge remote-tracking branch 'upstream/main' into tmp/TaikiYamada/bump_version_base
- fix(cmake): update spconv availability messages to use STATUS and WAR… (#10690)
  - fix(cmake): update spconv availability messages to use STATUS and WARNING
- Contributors: TaikiYamada4, Yukihiro Saito
0.45.0 (2025-05-22)
- Merge remote-tracking branch 'origin/main' into tmp/notbot/bump_version_base
- chore: perception code owner update (#10645)
  - chore: update maintainers in multiple perception packages
  - Revert "chore: update maintainers in multiple perception packages" This reverts commit f2838c33d6cd82bd032039e2a12b9cb8ba6eb584.
  - chore: update maintainers in multiple perception packages
  - chore: add Kok Seang Tan as maintainer in multiple perception packages
- chore(autoware_tensorrt_plugins): update maintainer (#10627)
  - chore(autoware_tensorrt_plugins): update maintainer
  - chore(autoware_tensorrt_plugins): update maintainer
- Contributors: Amadeusz Szymko, Taekjin LEE, TaikiYamada4
0.44.2 (2025-06-10)
0.44.1 (2025-05-01)
0.44.0 (2025-04-18)
- chore: match all package versions
- Merge remote-tracking branch 'origin/main' into humble
Package Dependencies
Name | Deps |
---|---|
ament_cmake_auto | |
autoware_cmake | |
tensorrt_cmake_module | |
ament_cmake_ros | |
ament_lint_auto | |
autoware_lint_common | |
autoware_cuda_dependency_meta | |
autoware_cuda_utils | |
System Dependencies
Dependant Packages
Name | Deps |
---|---|
autoware_bevfusion | |
autoware_diffusion_planner |