Repository Summary
| Field | Value |
|---|---|
| Description | llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2 |
| Checkout URI | https://github.com/mgonzs13/llama_ros.git |
| VCS Type | git |
| VCS Version | main |
| Last Updated | 2025-07-20 |
| Dev Status | UNKNOWN |
| Released | UNRELEASED |
| Tags | audio, cpp, embeddings, llama, gpt, ros2, vlm, reranking, multimodal, llm, langchain, llava, llamacpp, ggml, gguf, rerank, llavacpp |
| Contributing | Help Wanted (-), Good First Issues (-), Pull Requests to Review (-) |
Packages
| Name | Version |
|---|---|
| llama_bringup | 5.3.1 |
| llama_bt | 1.0.0 |
| llama_cli | 5.3.1 |
| llama_cpp_vendor | 5.3.1 |
| llama_demos | 5.3.1 |
| llama_hfhub_vendor | 5.3.1 |
| llama_msgs | 5.3.1 |
| llama_ros | 5.3.1 |
README
llama_ros
This repository provides a set of ROS 2 packages that integrate llama.cpp into ROS 2. Using the llama_ros packages, you can incorporate the powerful optimization capabilities of llama.cpp into your ROS 2 projects by running GGUF-based LLMs and VLMs. You can also use features from llama.cpp such as GBNF grammars and modify LoRAs in real time.
Related Projects
- chatbot_ros → This chatbot, integrated into ROS 2, uses whisper_ros to listen to people's speech and llama_ros to generate responses. The chatbot is controlled by a state machine created with YASMIN.
- explainable_ros → A ROS 2 tool to explain the behavior of a robot. Using the LangChain integration, logs are stored in a vector database; RAG is then applied to retrieve the logs relevant to a user's question, which is answered with llama_ros.
Installation
To run llama_ros with CUDA, first install the CUDA Toolkit. Then, compile llama_ros with `--cmake-args -DGGML_CUDA=ON` to enable CUDA support.
```shell
cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/llama_ros.git
pip3 install -r llama_ros/requirements.txt
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build --cmake-args -DGGML_CUDA=ON # add this for CUDA
```
Docker
Build the llama_ros Docker image or download one from DockerHub. You can choose to build llama_ros with CUDA (`USE_CUDA`) and choose the CUDA version (`CUDA_VERSION`). Remember that you have to use `DOCKER_BUILDKIT=0` to compile llama_ros with CUDA when building the image.

```shell
DOCKER_BUILDKIT=0 docker build -t llama_ros --build-arg USE_CUDA=1 --build-arg CUDA_VERSION=12-6 .
```

Run the Docker container. If you want to use CUDA, you have to install the NVIDIA Container Toolkit and add `--gpus all`.

```shell
docker run -it --rm --gpus all llama_ros
```
Usage
llama_cli
llama_ros includes commands to speed up testing GGUF-based LLMs within the ROS 2 ecosystem. The following commands are integrated into the ROS 2 CLI:
launch
This command launches an LLM from a YAML file. The YAML configuration launches the LLM in the same way as a regular launch file. Here is an example of how to use it:
```shell
ros2 llama launch ~/ros2_ws/src/llama_ros/llama_bringup/models/StableLM-Zephyr.yaml
```
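The referenced YAML describes the model to download from HuggingFace and its runtime parameters. As a rough illustration (the keys and values below are assumptions; consult the files under llama_bringup/models for the actual schema and shipped examples), such a file might look like:

```yaml
# Illustrative launch config (keys and values are assumptions; see
# llama_bringup/models for the real schema and shipped examples).
n_ctx: 2048                                       # context window size
n_batch: 8                                        # batch size for prompt processing
n_gpu_layers: 0                                   # layers to offload to the GPU (CUDA build)
n_predict: 2048                                   # maximum tokens to generate

model_repo: "TheBloke/stablelm-zephyr-3b-GGUF"    # hypothetical HuggingFace repo
model_filename: "stablelm-zephyr-3b.Q5_K_M.gguf"  # hypothetical GGUF file
```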
prompt
This command sends a prompt to a launched LLM. It takes a string, which is the prompt, and supports the following arguments:

- `-r`, `--reset`: whether to reset the LLM before prompting
- `-t`, `--temp`: the temperature value
- `--image-url`: image URL to send to a VLM
Here is an example of how to use it:
```shell
ros2 llama prompt "Do you know ROS 2?" -t 0.0
```
Launch Files
First of all, you need to create a launch file to use llama_ros or llava_ros. This launch file contains the main parameters to download the model from HuggingFace and configure it. Take a look at the following example and the predefined launch files.
llama_ros (Python Launch)
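As a starting point, here is a minimal sketch of such a Python launch file, assuming the `create_llama_launch` helper from `llama_bringup.utils`; the parameter values and the model repository/filename are illustrative, so check the predefined launch files in llama_bringup for the real options:

```python
# Minimal illustrative launch file (parameter values and model names are
# assumptions; see llama_bringup's predefined launch files for real examples).
from launch import LaunchDescription
from llama_bringup.utils import create_llama_launch


def generate_launch_description():
    return LaunchDescription([
        create_llama_launch(
            n_ctx=2048,      # context window size
            n_batch=8,       # batch size for prompt processing
            n_gpu_layers=0,  # layers to offload to the GPU (CUDA build)
            n_threads=4,     # CPU threads used for inference
            n_predict=2048,  # maximum tokens to generate
            model_repo="TheBloke/stablelm-zephyr-3b-GGUF",    # hypothetical repo
            model_filename="stablelm-zephyr-3b.Q5_K_M.gguf",  # hypothetical file
        )
    ])
```

You can then run it with `ros2 launch` like any other launch file.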