No version for distro humble showing melodic. Known supported distros are highlighted in the buttons above.

Package Summary

Tags No category tags.
Version 2.1.31
License BSD
Build type CATKIN
Use RECOMMENDED

Repository Summary

Description
Checkout URI https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type git
VCS Version master
Last Updated 2025-10-22
Dev Status DEVELOPED
Released RELEASED
Tags No category tags.

Package Description

ROS wrapper for Python SpeechRecognition library

Maintainers

  • Yuki Furuta

Authors

  • Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses the Python package SpeechRecognition as its backend.

Tutorials

Normal tutorial

  1. Install this package and SpeechRecognition
  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  
  2. Launch the speech recognition node
  roslaunch ros_speech_recognition speech_recognition.launch
  
  3. Echo /speech_to_text
  rostopic echo /speech_to_text
  # the recognition results are printed here
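
Beyond rostopic echo, the results can also be consumed from a node. The following is a minimal sketch, assuming the transcript and confidence array fields of speech_recognition_msgs/SpeechRecognitionCandidates. The helper names format_candidates and main, and the node name speech_to_text_listener, are hypothetical and not part of this package.

```python
def format_candidates(transcripts, confidences):
    """Pair each transcript with its confidence as printable lines."""
    lines = []
    for i, text in enumerate(transcripts):
        # The confidence array may be shorter than the transcript array.
        conf = confidences[i] if i < len(confidences) else None
        suffix = "" if conf is None else " (confidence %.2f)" % conf
        lines.append(text + suffix)
    return lines


def main():
    # ROS imports stay inside main() so the helper above can be used
    # and tested on a machine without a ROS installation.
    import rospy
    from speech_recognition_msgs.msg import SpeechRecognitionCandidates

    def callback(msg):
        for line in format_candidates(msg.transcript, msg.confidence):
            rospy.loginfo(line)

    rospy.init_node("speech_to_text_listener")  # hypothetical node name
    rospy.Subscriber("/speech_to_text", SpeechRecognitionCandidates, callback)
    rospy.spin()
```

Calling main() in a sourced ROS 1 environment, while speech_recognition.launch is running, should log one line per candidate.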
  

Parrotry tutorial

Parrotry means repeating back what was heard, like a parrot (オウム返し in Japanese).

# English
roslaunch ros_speech_recognition parrotry.launch
# Japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
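
The core idea of parroting can be sketched as a filter on the recognition candidates: repeat the most confident transcript only if it clears a threshold. This is an illustration, not the package's actual parrotry implementation. The function name parrot_reply is hypothetical, and the default of 0.8 mirrors the confidence_threshold argument listed for parrotry.launch.

```python
def parrot_reply(transcripts, confidences, threshold=0.8):
    """Return the transcript to repeat back, or None if none is confident enough."""
    candidates = [
        (conf, text)
        for text, conf in zip(transcripts, confidences)
        if text and conf >= threshold  # drop empty or low-confidence results
    ]
    if not candidates:
        return None
    # max() keeps the first candidate on ties, so earlier results win.
    return max(candidates, key=lambda pair: pair[0])[1]
```

For example, parrot_reply(["hello", "yellow"], [0.92, 0.85]) selects "hello", while a single candidate with confidence 0.5 yields None and nothing would be spoken.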

speech_recognition_node.py Interface

Publishing Topics

  • ~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

    Topic on which speech recognition candidates are published.

    The topic name is set by the ~voice_topic parameter; the default is speech_to_text.

  • sound_play (sound_play/SoundRequestAction)

    Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

  • ~audio_topic (audio_common_msgs/AudioData)

    Audio stream data to be recognized.

    The topic name is set by the ~audio_topic parameter; the default is audio.

Advertising Services

  • speech_recognition (speech_recognition_msgs/SpeechRecognition)

    Service for speech recognition

  • speech_recognition/start (std_srvs/Empty)

    Start service for speech recognition

    This service is available when parameter ~continuous is True.

  • speech_recognition/stop (std_srvs/Empty)

    Stop service for speech recognition

    This service is available when parameter ~continuous is True.

Parameters

  • ~voice_topic (String, default: speech_to_text)

    Publishing voice topic name

  • ~audio_topic (String, default: audio)

    Subscribing audio topic name

  • ~enable_sound_effect (Bool, default: True)

    Flag to enable or disable the sound effects played on recognition events.

  • ~language (String, default: en-US)

    Language to be recognized

  • ~engine (Enum[String], default: Google)

    Speech-to-text engine (use dynamic_reconfigure to see the full list of options)
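
The leading ~ marks these as private names, which ROS 1 resolves under the node's own name. The sketch below shows that resolution for the parameters documented above. The node name speech_recognition is an assumption (the changelog mentions /speech_recognition/depth), and the helper names are hypothetical.

```python
# Documented defaults from the parameter list above.
DEFAULTS = {
    "~voice_topic": "speech_to_text",
    "~audio_topic": "audio",
    "~enable_sound_effect": True,
    "~language": "en-US",
    "~engine": "Google",
}


def resolve_private_name(node_name, name):
    """Resolve a ROS 1 private name such as "~language" against a node name."""
    if not name.startswith("~"):
        raise ValueError("not a private name: %r" % name)
    return "/" + node_name.strip("/") + "/" + name[1:]


def effective_parameters(node_name, overrides=None):
    """Map each fully qualified name to the value the node would use."""
    params = dict(DEFAULTS)
    params.update(overrides or {})  # overrides use the same "~name" keys
    return {resolve_private_name(node_name, key): value
            for key, value in params.items()}
```

With node name speech_recognition, ~language resolves to /speech_recognition/language, which is the name to use with tools such as rosparam.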

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

  • [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
  • Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

  • fix package.xml/CMakeLists.txt to suppress catkin_lint errors (#479)
  • Contributors: Kei Okada

2.1.26 (2023-06-14)

  • add LICENSE files (#476)
  • Contributors: Kei Okada

2.1.25 (2023-06-08)

  • [ros_speech_recognition] Add vosk engine (#474)
  • Pr/use sound themes freedesktop (#472)
  • add test to check if ros node is loadable (#463)
  • add self.conf_thresh in __init__ function (#457)
  • [ros_speech_recognition] add ubuntu-sounds dependency (#453)
  • [ros_speech_recognition] Return if result is empty (#443)
  • [ros_speech_recognition] Set confidence value of google (#434)
  • [ros_speech_recognition] add parrotry.launch (#414)
  • [ros_speech_recognition] update default arg for speech_recognition.launch (#412)
  • [ros_speech_recognition, respeaker_ros] add confidence field (#411)
  • [ros_speech_recognition] add self cancellation for speech recognition (#413)
  • [#405 and #410] Fix CI (#415)
  • add ROS interface for https://cloud.google.com/natural-language (#304)
  • GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
    • pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
  • Explicit python interpreter in catkin_virtualenv (#367)
  • .github/workflow: integrate all yaml to one (#338)
  • [ros_speech_recognition] Fixed the behavior of launch file (#336)
  • [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
  • [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
  • Enable sound play flag (#315)
  • Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

  • enable to change topic name from speech_recognition.launch (#254)
  • support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
    • [ros_speech_recognition] Add doc to the args of speech_recognition.launch. We need to use rosparam for device, not param, because 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
    • more exception message for self.recognize
  • Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
  • Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Launch files

  • launch/parrotry.launch
      • use_google [default: true]
      • language [default: en-US]
      • confidence_threshold [default: 0.8]
  • launch/speech_recognition.launch
      • launch_sound_play [default: true] — Launch sound_play node to speak
      • launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
      • audio_topic [default: /audio] — Name of audio topic captured from microphone
      • voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
      • n_channel [default: 1] — Number of channels of audio topic and microphone. Run '$ pactl list short sinks' to check your hardware
      • depth [default: 16] — Bit depth of audio topic and microphone. Run '$ pactl list short sinks' to check your hardware
      • sample_rate [default: 16000] — Sample rate (frames per second) of audio topic and microphone. Run '$ pactl list short sinks' to check your hardware
      • device [default: ] — Card and device number of microphone (e.g. hw:0,0). Check the card and device numbers with '$ arecord -l', then use hw:[card number],[device number]
      • engine [default: Google] — Speech-to-text engine: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
      • language [default: en-US] — Speech to text language. For Japanese, set ja-JP.
      • continuous [default: true] — If false, the /speech_recognition service is advertised. If true, recognition results are published on the /speech_to_text topic.
      • auto_start [default: true] — Whether speech_recognition starts automatically or not. This parameter works when continuous is true.
      • self_cancellation [default: true] — Whether to skip recognition of audio captured while the robot is speaking.
      • tts_tolerance [default: 1.0] — Tolerance in seconds used to decide whether the robot is speaking
      • tts_action_names [default: ['sound_play']] — TTS action names. Output from these action servers is ignored by speech recognition.
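
The launch arguments above are passed on the command line as name:=value pairs, as in the parrotry tutorial. The following sketch assembles such a command from Python. The helper names roslaunch_command and alsa_device are hypothetical, and the sketch only builds the argument list without running it.

```python
def alsa_device(card, device):
    """Format the hw:[card number],[device number] string for the device arg."""
    return "hw:%d,%d" % (card, device)


def roslaunch_command(package, launch_file, **args):
    """Build an argv list for roslaunch with name:=value arguments."""
    argv = ["roslaunch", package, launch_file]
    for name in sorted(args):  # sorted for a stable, readable order
        value = args[name]
        if isinstance(value, bool):
            # Launch files spell booleans in lowercase.
            value = "true" if value else "false"
        argv.append("%s:=%s" % (name, value))
    return argv


command = roslaunch_command(
    "ros_speech_recognition",
    "speech_recognition.launch",
    language="ja-JP",
    continuous=False,
    device=alsa_device(0, 0),
)
```

In a sourced ROS 1 environment the resulting list could be handed to subprocess.call(command); the card and device numbers here are placeholders to be replaced with the values reported by arecord -l.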

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.

No version for distro github showing melodic. Known supported distros are highlighted in the buttons above.

Package Summary

Tags No category tags.
Version 2.1.31
License BSD
Build type CATKIN
Use RECOMMENDED

Repository Summary

Description
Checkout URI https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type git
VCS Version master
Last Updated 2025-10-22
Dev Status DEVELOPED
Released RELEASED
Tags No category tags.
Contributing Help Wanted (-)
Good First Issues (-)
Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Maintainers

  • Yuki Furuta

Authors

  • Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

  1. Install this package and SpeechRecognition
  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  
  2. Launch the speech recognition node
  roslaunch ros_speech_recognition speech_recognition.launch
  
  3. Echo /speech_to_text
  rostopic echo /speech_to_text
  # the recognition results are printed here
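
The same results can be read from a Python node instead of rostopic echo. The following is a minimal sketch, assuming the message exposes parallel transcript and confidence lists; the node name and the rank_candidates helper are hypothetical.

```python
def rank_candidates(transcripts, confidences):
    """Pair each transcript with its confidence, best first.

    Missing confidences are treated as 0.0, so the lists may differ in length.
    """
    padded = list(confidences) + [0.0] * (len(transcripts) - len(confidences))
    pairs = zip(transcripts, padded)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def main():
    # ROS imports live here so the helper above can be used without ROS.
    import rospy
    from speech_recognition_msgs.msg import SpeechRecognitionCandidates

    def on_speech(msg):
        for text, confidence in rank_candidates(msg.transcript, msg.confidence):
            rospy.loginfo("%.2f  %s", confidence, text)

    rospy.init_node("speech_to_text_listener")  # hypothetical node name
    rospy.Subscriber("/speech_to_text", SpeechRecognitionCandidates, on_speech)
    rospy.spin()
```

Call main() from a script started while the node from step 2 is running.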
  

Parrotry tutorial

Parrotry means repeating back what was heard, like a parrot (オウム返し in Japanese).

# English
roslaunch ros_speech_recognition parrotry.launch
# Japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP
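
To give a feel for what a parroting node might do, here is a hedged sketch that repeats the top recognition candidate only when it is confident enough. It is not the code behind parrotry.launch: the parrot_reply helper and the node name are hypothetical, and the 0.8 default simply mirrors the confidence_threshold argument listed for that launch file.

```python
def parrot_reply(transcripts, confidences, threshold=0.8):
    """Return the top transcript if it is confident enough, else None."""
    if not transcripts:
        return None
    # Assumption: the first candidate is the most likely one.
    confidence = confidences[0] if confidences else 0.0
    if confidence < threshold:
        return None
    return transcripts[0]


def main():
    # ROS imports live here so the helper above can be used without ROS.
    import rospy
    from sound_play.libsoundplay import SoundClient
    from speech_recognition_msgs.msg import SpeechRecognitionCandidates

    rospy.init_node("parrot_sketch")  # hypothetical node name
    speaker = SoundClient()

    def on_speech(msg):
        reply = parrot_reply(msg.transcript, msg.confidence)
        if reply is not None:
            speaker.say(reply)

    rospy.Subscriber("/speech_to_text", SpeechRecognitionCandidates, on_speech)
    rospy.spin()
```

Rejecting low-confidence candidates keeps the robot from echoing background noise that the engine misheard as speech.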

speech_recognition_node.py Interface

Publishing Topics

  • ~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

    Speech recognition candidates topic name.

    The topic name is set by the parameter ~voice_topic; the default is speech_to_text.

  • sound_play (sound_play/SoundRequestAction)

    Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

  • ~audio_topic (audio_common_msgs/AudioData)

    Audio stream data to be recognized.

    The topic name is set by the parameter ~audio_topic; the default is audio.

Advertising Services

  • speech_recognition (speech_recognition_msgs/SpeechRecognition)

    Service for speech recognition

  • speech_recognition/start (std_srvs/Empty)

    Start service for speech recognition

    This service is available when the parameter ~continuous is True.

  • speech_recognition/stop (std_srvs/Empty)

    Stop service for speech recognition

    This service is available when the parameter ~continuous is True.
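
The start and stop services can be used to pause recognition around a noisy action. The sketch below wraps them in a context manager. The resolved service names /speech_recognition/start and /speech_recognition/stop are assumptions that depend on the node's namespace, and recognition_paused and the node name are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def recognition_paused(stop, start):
    """Call stop() on entry and start() on exit, even if the body raises."""
    stop()
    try:
        yield
    finally:
        start()


def main():
    # ROS imports live here so the helper above can be used without ROS.
    import rospy
    from std_srvs.srv import Empty

    rospy.init_node("pause_recognition_sketch")  # hypothetical node name
    # Assumed resolved service names; adjust them to your node's namespace.
    rospy.wait_for_service("/speech_recognition/start")
    rospy.wait_for_service("/speech_recognition/stop")
    start = rospy.ServiceProxy("/speech_recognition/start", Empty)
    stop = rospy.ServiceProxy("/speech_recognition/stop", Empty)

    with recognition_paused(stop, start):
        rospy.sleep(5.0)  # stand-in for an action that should not be transcribed
```

Restarting in a finally clause means recognition resumes even when the wrapped action fails.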

Parameters

  • ~voice_topic (String, default: speech_to_text)

    Publishing voice topic name

  • ~audio_topic (String, default: audio)

    Subscribing audio topic name

  • ~enable_sound_effect (Bool, default: True)

    Flag to enable or disable playing a sound on recognition.

  • ~language (String, default: en-US)

    Language to be recognized

  • ~engine (Enum[String], default: Google)

    Speech-to-text engine (use dynamic_reconfigure to see all options)

File truncated at 100 lines see the full file

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

  • [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
  • Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

  • fix package.xml/CMakeLists.txt to suppress catkin_lint errors (#479)
  • Contributors: Kei Okada

2.1.26 (2023-06-14)

  • add LICENSE files (#476)
  • Contributors: Kei Okada

2.1.25 (2023-06-08)

  • [ros_speech_recognition] Add vosk engine (#474)
  • Pr/use sound themes freedesktop (#472)
  • add test to check if ros node is loadable (#463)
  • add self.conf_thresh in __init__ function (#457)
  • [ros_speech_recognition] add ubuntu-sounds dependency (#453)
  • [ros_speech_recognition] Return if result is empty (#443)
  • [ros_speech_recognition] Set confidence value of google (#434)
  • [ros_speech_recognition] add parrotry.launch (#414)
  • [ros_speech_recognition] update default arg for speech_recognition.launch (#412)
  • [ros_speech_recognition, respeaker_ros] add confidence field (#411)
  • [ros_speech_recognition] add self cancellation for speech recognition (#413)
  • [#405 and #410] Fix CI (#415)
  • add ROS interface for https://cloud.google.com/natural-language (#304)
  • GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
    • pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
  • Explicit python interpreter in catkin_virtualenv (#367)
  • .github/workflow: integrate all yaml to one (#338)
  • [ros_speech_recognition] Fixed the behavior of launch file (#336)
  • [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
  • [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
  • Enable sound play flag (#315)
  • Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

  • enable to change topic name from speech_recognition.launch (#254)
  • support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
    • [ros_speech_recognition] Add doc to the args of speech_recognition.launch. The device argument must be set with rosparam rather than param, because an empty 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
    • more exception message for self.recognize
  • Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
  • Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

File truncated at 100 lines see the full file

Launch files

  • launch/parrotry.launch
      • use_google [default: true]
      • language [default: en-US]
      • confidence_threshold [default: 0.8]
  • launch/speech_recognition.launch
      • launch_sound_play [default: true] — Launch sound_play node to speak
      • launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
      • audio_topic [default: /audio] — Name of audio topic captured from microphone
      • voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
      • n_channel [default: 1] — Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
      • depth [default: 16] — Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
      • sample_rate [default: 16000] — Sample rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
      • device [default: ] — Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with '$ arecord -l', then use hw:[card number],[device number].
      • engine [default: Google] — Speech-to-text engine. One of Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
      • language [default: en-US] — Speech-to-text language. For Japanese, set ja-JP.
      • continuous [default: true] — If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
      • auto_start [default: true] — Whether speech recognition starts automatically. Takes effect only when continuous is true.
      • self_cancellation [default: true] — Whether to ignore audio while the robot is speaking.
      • tts_tolerance [default: 1.0] — Tolerance, in seconds, used when deciding whether the robot is speaking.
      • tts_action_names [default: ['sound_play']] — TTS action names. Output from these action servers is ignored by speech recognition.
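
Any of the arguments above can be overridden on the command line with the name:=value syntax. As a small sketch, the hypothetical helper below assembles such a command from keyword arguments; the values shown are examples only.

```python
import shlex


def roslaunch_command(package, launch_file, **args):
    """Build a roslaunch argv list with name:=value arguments."""
    cmd = ["roslaunch", package, launch_file]
    for name, value in sorted(args.items()):
        if isinstance(value, bool):
            value = "true" if value else "false"  # roslaunch-style booleans
        cmd.append("{}:={}".format(name, value))
    return cmd


cmd = roslaunch_command(
    "ros_speech_recognition",
    "speech_recognition.launch",
    language="ja-JP",
    device="hw:0,0",
    launch_sound_play=False,
)
# Print the command in a form that can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can also be passed directly to subprocess.call to start the launch file from a script.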

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.


Package Summary

Tags No category tags.
Version 2.1.31
License BSD
Build type CATKIN
Use RECOMMENDED

Repository Summary

Description
Checkout URI https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type git
VCS Version master
Last Updated 2025-10-22
Dev Status DEVELOPED
Released RELEASED
Tags No category tags.
Contributing Help Wanted (-)
Good First Issues (-)
Pull Requests to Review (-)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Maintainers

  • Yuki Furuta

Authors

  • Yuki Furuta

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

  1. Install this package and SpeechReconition
  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  
  1. Launch speech recognition node
  roslaunch ros_speech_recognition speech_recognition.launch
  
  1. Echo /speech_to_text
  rostopic echo /speech_to_text
  # you can get the recognition result
  

Parrotry tutorial

Parrotry mean オウム返し in Japanese

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

speech_recognition_node.py Interface

Publishing Topics

  • ~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

    Speech recognition candidates topic name.

    Topic name is set by parameter ~voice_topic, and default value is speech_to_text.

  • sound_play (sound_play/SoundRequestAction)

    Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

  • ~audio_topic (audio_common_msgs/AudioData)

    Audio stream data to be recognized.

    Topis name is set by parameter ~audio_topic and default value is audio.

Advertising Services

  • speech_recognition (speech_recognition_msgs/SpeechRecognition)

    Service for speech recognition

  • speech_recognition/start (std_srvs/Empty)

    Start service for speech recognition

    This service is available when parameter ~continuous is True.

  • speech_recognition/stop (std_srvs/Empty)

    Stop service for speech recognition

    This service is available when parameter ~continuous is True.
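
The start and stop services can be paired to pause recognition temporarily. The following is a sketch only: it assumes the node runs in continuous mode and that the services are reachable as `/speech_recognition/start` and `/speech_recognition/stop` (the actual names depend on the node's namespace). `recognition_paused` and the node name are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def recognition_paused(stop, start):
    """Call stop() on entry and start() on exit, even if the body raises."""
    stop()
    try:
        yield
    finally:
        start()


def main():
    # ROS imports live inside main() so the context manager above can be
    # tested without a ROS installation.
    import rospy
    from std_srvs.srv import Empty

    rospy.init_node("recognition_pause_demo")  # hypothetical node name
    # Assumed service names; check `rosservice list` on your system.
    start = rospy.ServiceProxy("/speech_recognition/start", Empty)
    stop = rospy.ServiceProxy("/speech_recognition/stop", Empty)
    start.wait_for_service(timeout=5.0)
    stop.wait_for_service(timeout=5.0)

    with recognition_paused(stop, start):
        rospy.sleep(3.0)  # recognition is stopped during this block
```

Passing the two service proxies as plain callables keeps the pause logic independent of ROS, so it can be exercised with stub functions.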

Parameters

  • ~voice_topic (String, default: speech_to_text)

    Publishing voice topic name

  • ~audio_topic (String, default: audio)

    Subscribing audio topic name

  • ~enable_sound_effect (Bool, default: True)

    Flag to enable or disable the sound effects played on recognition events.

  • ~language (String, default: en-US)

    Language to be recognized

  • ~engine (Enum[String], default: Google)

    Speech-to-text engine (To see full options use dynamic_reconfigure)

CHANGELOG

Changelog for package ros_speech_recognition

2.1.31 (2025-05-13)

2.1.30 (2025-05-10)

2.1.29 (2025-01-05)

  • [doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
  • Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

  • fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
  • Contributors: Kei Okada

2.1.26 (2023-06-14)

  • add LICENSE files (#476)
  • Contributors: Kei Okada

2.1.25 (2023-06-08)

  • [ros_speech_recognition] Add vosk engine (#474)
  • Pr/use sound themes freedesktop (#472)
  • add test to check if ros node is loadable (#463)
  • add self.conf_thresh in __init_ function (#457)
  • [ros_speech_recognition] add ubuntu-sounds dependency (#453)
  • [ros_speech_recognition] Return if result is empty (#443)
  • [ros_speece_recognition] Set confidence value of google (#434)
  • [ros_speech_recognition] add parrotry.launch (#414)
  • [ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
  • [ros_speech_recogniton, respeaker_ros] add confidence field (#411)
  • [ros_speech_recognition] add self cancellation for speech recogntion (#413)
  • [#405 and #410] Fix CI (#415)
  • add ROS interface for https://cloud.google.com/natural-language (#304)
  • GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
    • pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
  • Explicit python interpreter in catkin_virtualenv (#367)
  • .github/workflow: integrate all yaml to one (#338)
  • [ros_speech_recognition] Fixed the behavior of launch file (#336)
  • [ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
  • [ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
  • Enable sound play flag (#315)
  • Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

  • enable to change topic name from speech_recognition.launch (#254)
  • support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
    • [ros_speech_recognition] Add doc to speech_recognition.launch: document the args, and use rosparam instead of param for device, because an empty 'device: ' causes the error "load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled"
    • more exception message for self.recognize
  • Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
  • Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

Launch files

  • launch/parrotry.launch
      • use_google [default: true]
      • language [default: en-US]
      • confidence_threshold [default: 0.8]
  • launch/speech_recognition.launch
      • launch_sound_play [default: true] — Launch sound_play node to speak
      • launch_audio_capture [default: true] — Launch audio_capture node to publish audio topic from microphone
      • audio_topic [default: /audio] — Name of audio topic captured from microphone
      • voice_topic [default: /speech_to_text] — Name of text topic of recognized speech
      • n_channel [default: 1] — Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
      • depth [default: 16] — Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
      • sample_rate [default: 16000] — Sample rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware.
      • device [default: ] — Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with '$ arecord -l', then use hw:[card number],[device number].
      • engine [default: Google] — Speech-to-text engine. Options: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM.
      • language [default: en-US] — Speech-to-text language. For Japanese, set ja-JP.
      • continuous [default: true] — If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
      • auto_start [default: true] — Whether speech recognition starts automatically. This parameter takes effect only when continuous is true.
      • self_cancellation [default: true] — Whether to skip recognizing audio while the robot is speaking.
      • tts_tolerance [default: 1.0] — Tolerance in seconds when deciding whether the robot is speaking.
      • tts_action_names [default: ['sound_play']] — TTS action names. Output from these action servers is ignored by speech recognition.
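
When the launch arguments above need to be set from a script, a small helper can assemble the `roslaunch` command line. This is a sketch under the assumption that arguments are passed in the usual `name:=value` form; `build_roslaunch_command` is a hypothetical helper, not something this package provides.

```python
def build_roslaunch_command(package, launch_file, args=None):
    """Return the argv list for `roslaunch package launch_file name:=value ...`."""
    command = ["roslaunch", package, launch_file]
    # Sort the arguments so the generated command is deterministic.
    for key, value in sorted((args or {}).items()):
        if isinstance(value, bool):
            # Launch files conventionally use lowercase true/false.
            value = "true" if value else "false"
        command.append("{}:={}".format(key, value))
    return command
```

The resulting list can be handed to `subprocess.call`, for example `build_roslaunch_command("ros_speech_recognition", "speech_recognition.launch", {"language": "ja-JP", "continuous": False})`.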

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.