llama_ros repository

Repository Summary

Description: llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2
Checkout URI: https://github.com/mgonzs13/llama_ros.git
VCS Type: git
VCS Version: main
Last Updated: 2025-07-20
Dev Status: UNKNOWN
Released: UNRELEASED
Tags: audio cpp embeddings llama gpt ros2 vlm reranking multimodal llm langchain llava llamacpp ggml gguf rerank llavacpp
Contributing: Help Wanted (-), Good First Issues (-), Pull Requests to Review (-)

Packages

| Name | Version |
| ---- | ------- |
| llama_bringup | 5.3.1 |
| llama_bt | 1.0.0 |
| llama_cli | 5.3.1 |
| llama_cpp_vendor | 5.3.1 |
| llama_demos | 5.3.1 |
| llama_hfhub_vendor | 5.3.1 |
| llama_msgs | 5.3.1 |
| llama_ros | 5.3.1 |

README

llama_ros

This repository provides a set of ROS 2 packages to integrate llama.cpp into ROS 2. Using the llama_ros packages, you can easily incorporate the powerful optimization capabilities of llama.cpp into your ROS 2 projects by running GGUF-based LLMs and VLMs. You can also use llama.cpp features such as GBNF grammars and modify LoRAs in real time.

[![License: MIT](https://img.shields.io/badge/GitHub-MIT-informational)](https://opensource.org/license/mit) [![GitHub release](https://img.shields.io/github/release/mgonzs13/llama_ros.svg)](https://github.com/mgonzs13/llama_ros/releases) [![Code Size](https://img.shields.io/github/languages/code-size/mgonzs13/llama_ros.svg?branch=main)](https://github.com/mgonzs13/llama_ros?branch=main) [![Last Commit](https://img.shields.io/github/last-commit/mgonzs13/llama_ros.svg)](https://github.com/mgonzs13/llama_ros/commits/main) [![GitHub issues](https://img.shields.io/github/issues/mgonzs13/llama_ros)](https://github.com/mgonzs13/llama_ros/issues) [![GitHub pull requests](https://img.shields.io/github/issues-pr/mgonzs13/llama_ros)](https://github.com/mgonzs13/llama_ros/pulls) [![Contributors](https://img.shields.io/github/contributors/mgonzs13/llama_ros.svg)](https://github.com/mgonzs13/llama_ros/graphs/contributors) [![Python Formatter Check](https://github.com/mgonzs13/llama_ros/actions/workflows/python-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/python-formatter.yml?branch=main) [![C++ Formatter Check](https://github.com/mgonzs13/llama_ros/actions/workflows/cpp-formatter.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/cpp-formatter.yml?branch=main)

| ROS 2 Distro | Branch | Build status | Docker Image | Documentation |
| :----------: | :----: | :----------: | :----------: | :-----------: |
| **Humble** | [`main`](https://github.com/mgonzs13/llama_ros/tree/main) | [![Humble Build](https://github.com/mgonzs13/llama_ros/actions/workflows/humble-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/humble-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-humble-blue)](https://hub.docker.com/r/mgons/llama_ros/tags?name=humble) | [![Doxygen Deployment](https://github.com/mgonzs13/llama_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/llama_ros/latest) |
| **Iron** | [`main`](https://github.com/mgonzs13/llama_ros/tree/main) | [![Iron Build](https://github.com/mgonzs13/llama_ros/actions/workflows/iron-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/iron-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-iron-blue)](https://hub.docker.com/r/mgons/llama_ros/tags?name=iron) | [![Doxygen Deployment](https://github.com/mgonzs13/llama_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/llama_ros/latest) |
| **Jazzy** | [`main`](https://github.com/mgonzs13/llama_ros/tree/main) | [![Jazzy Build](https://github.com/mgonzs13/llama_ros/actions/workflows/jazzy-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/jazzy-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-jazzy-blue)](https://hub.docker.com/r/mgons/llama_ros/tags?name=jazzy) | [![Doxygen Deployment](https://github.com/mgonzs13/llama_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/llama_ros/latest) |
| **Kilted** | [`main`](https://github.com/mgonzs13/llama_ros/tree/main) | [![Kilted Build](https://github.com/mgonzs13/llama_ros/actions/workflows/kilted-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/kilted-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-kilted-blue)](https://hub.docker.com/r/mgons/llama_ros/tags?name=kilted) | [![Doxygen Deployment](https://github.com/mgonzs13/llama_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/llama_ros/latest) |
| **Rolling** | [`main`](https://github.com/mgonzs13/llama_ros/tree/main) | [![Rolling Build](https://github.com/mgonzs13/llama_ros/actions/workflows/rolling-docker-build.yml/badge.svg?branch=main)](https://github.com/mgonzs13/llama_ros/actions/workflows/rolling-docker-build.yml?branch=main) | [![Docker Image](https://img.shields.io/badge/Docker%20Image%20-rolling-blue)](https://hub.docker.com/r/mgons/llama_ros/tags?name=rolling) | [![Doxygen Deployment](https://github.com/mgonzs13/llama_ros/actions/workflows/doxygen-deployment.yml/badge.svg)](https://mgonzs13.github.io/llama_ros/latest) |

Table of Contents

  1. Related Projects
  2. Installation
  3. Docker
  4. Usage
  5. Demos

Related Projects

  • chatbot_ros → A chatbot integrated into ROS 2 that uses whisper_ros to listen to people's speech and llama_ros to generate responses. The chatbot is controlled by a state machine created with YASMIN.
  • explainable_ros → A ROS 2 tool to explain the behavior of a robot. Using the LangChain integration, logs are stored in a vector database; RAG is then applied to retrieve the logs relevant to user questions, which are answered with llama_ros.

Installation

To run llama_ros with CUDA, first install the CUDA Toolkit. Then compile llama_ros with --cmake-args -DGGML_CUDA=ON to enable CUDA support.

cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/llama_ros.git
pip3 install -r llama_ros/requirements.txt
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build --cmake-args -DGGML_CUDA=ON # add this for CUDA

Docker

Build the llama_ros Docker image or download one from DockerHub. You can choose whether to build llama_ros with CUDA (USE_CUDA) and select the CUDA version (CUDA_VERSION). Remember that you have to use DOCKER_BUILDKIT=0 to compile llama_ros with CUDA when building the image.

DOCKER_BUILDKIT=0 docker build -t llama_ros --build-arg USE_CUDA=1 --build-arg CUDA_VERSION=12-6 .

Run the Docker container. If you want to use CUDA, you have to install the NVIDIA Container Toolkit and add --gpus all.

docker run -it --rm --gpus all llama_ros

Usage

llama_cli

llama_ros includes commands to speed up the testing of GGUF-based LLMs within the ROS 2 ecosystem. The following commands are integrated into the ROS 2 CLI:

launch

This command launches an LLM from a YAML file, whose configuration is used to launch the LLM in the same way as a regular launch file. Here is an example of how to use it:

ros2 llama launch ~/ros2_ws/src/llama_ros/llama_bringup/models/StableLM-Zephyr.yaml
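
As a rough sketch of what such a YAML file can contain (the key names below are illustrative, patterned after the configs in llama_bringup/models; check those files for the exact supported keys):

n_ctx: 2048        # context window size in tokens
n_batch: 8         # batch size for prompt processing
n_gpu_layers: 0    # layers to offload to the GPU (0 = CPU only)
model_repo: "TheBloke/stablelm-zephyr-3b-GGUF"    # HuggingFace repo to download from (illustrative)
model_filename: "stablelm-zephyr-3b.Q4_K_M.gguf"  # GGUF file inside that repo (illustrative)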

prompt

This command sends a prompt to a launched LLM. It takes a string (the prompt) and accepts the following arguments:

  • (-r, --reset): Whether to reset the LLM before prompting
  • (-t, --temp): The temperature value
  • (--image-url): Image URL to send to a VLM

Here is an example of how to use it:

ros2 llama prompt "Do you know ROS 2?" -t 0.0
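
If a VLM has been launched, the image flag can be combined with the prompt; the URL here is only a placeholder:

ros2 llama prompt "What do you see in this image?" --image-url https://example.com/photo.jpg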

Launch Files

First of all, you need to create a launch file to use llama_ros or llava_ros. This launch file will contain the main parameters to download the model from HuggingFace and configure it. Take a look at the following examples and the predefined launch files.

llama_ros (Python Launch)
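
Here is an illustrative sketch of such a Python launch file. The helper create_llama_launch and its parameter names are assumptions based on the llama_bringup package; refer to the repository's predefined launch files for the real API.

from launch import LaunchDescription
from llama_bringup.utils import create_llama_launch  # assumed helper; see llama_bringup for the real API

def generate_launch_description():
    return LaunchDescription([
        create_llama_launch(
            n_ctx=2048,       # context window size in tokens
            n_batch=8,        # batch size for prompt processing
            n_gpu_layers=0,   # layers to offload to the GPU (0 = CPU only)
            model_repo="TheBloke/stablelm-zephyr-3b-GGUF",    # HuggingFace repo (illustrative)
            model_filename="stablelm-zephyr-3b.Q4_K_M.gguf",  # GGUF file in the repo (illustrative)
        )
    ])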

File truncated at 100 lines [see the full file](https://github.com/mgonzs13/llama_ros/tree/main/README.md)