Abstract
Integrating an Artificial Intelligence (AI) vision module into a robot grasping system can significantly improve its generalizability, thereby enhancing the efficiency of Human-Robot Interaction (HRI). However, the inherent lack of interpretability in AI also opens the gate to external threats. In this work, we reveal a novel safety risk in vision-guided robot grasping systems by proposing the Shortcut-enhanced Multimodal Backdoor Attack (SEMBA), which uses a backdoor trigger to manipulate the grasp quality score, leading to a misguided grasping sequence. SEMBA may thus cause hazardous grasping and pose a threat to human safety in HRI. Specifically, we first present the Multimodal Shortcut Searching Algorithm (MSSA), which finds the pixel value that deviates the most from the mean and standard deviation of the multimodal dataset, along with the pivotal pixel position for each individual image. This ensures that the proposed attack remains effective in complex, multi-class object scenarios. Next, building on MSSA, we devise the Multimodal Trigger Generator (MTG), which creates diverse multimodal backdoor triggers and integrates them into the dataset, giving the attack its multimodal character. We conduct extensive experiments on benchmark datasets and on a cobot, showing the effectiveness of the proposed method in both the digital and physical worlds.
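The shortcut search described above can be illustrated with a rough sketch. This is not the paper's implementation: the function names, the candidate values, the square patch, and especially the rule used for the "pivotal" position (the pixel farthest from the image's own mean) are assumptions made only for illustration. It is shown for one modality at a time, with images as NumPy arrays of shape (H, W, C) scaled to [0, 1].

```python
import numpy as np

def most_deviant_value(dataset, candidates=(0.0, 1.0)):
    """Per channel, pick the candidate value with the largest z-score
    against the dataset mean and standard deviation (assumed criterion)."""
    mean = dataset.mean(axis=(0, 1, 2))                  # shape (C,)
    std = dataset.std(axis=(0, 1, 2)) + 1e-8             # avoid division by zero
    cand = np.asarray(candidates, dtype=float)[:, None]  # shape (K, 1)
    z = np.abs(cand - mean) / std                        # shape (K, C)
    return cand[:, 0][z.argmax(axis=0)]                  # shape (C,)

def pivotal_position(image):
    """Hypothetical stand-in for the pivotal pixel position:
    the pixel farthest from the image's own per-channel mean."""
    dev = np.abs(image - image.mean(axis=(0, 1))).sum(axis=-1)  # shape (H, W)
    return np.unravel_index(dev.argmax(), dev.shape)

def stamp_trigger(image, value, pos, size=3):
    """Return a copy of the image with a size x size patch of `value`
    centred on `pos` and clipped to the image bounds."""
    out = image.copy()
    h, w = image.shape[:2]
    r0 = min(max(pos[0] - size // 2, 0), h - size)
    c0 = min(max(pos[1] - size // 2, 0), w - size)
    out[r0:r0 + size, c0:c0 + size] = value
    return out
```

In this sketch the trigger value is computed once from dataset-level statistics, while the position is chosen per image, mirroring the dataset-level and image-level split in the description above. A multimodal variant would repeat the same steps separately for the RGB and depth inputs.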
Highlights
This work presents a novel backdoor attack that exposes previously unexplored safety risks in vision-guided robot grasping. To the best of our knowledge, this is the first study to introduce backdoor attacks in robot grasping systems, demonstrating how maliciously manipulated visual grasping models can alter grasping sequences and induce hazardous behaviors.
Results - Attacking Grasp Detection
Results - Attacking Robot Grasping
BibTeX
@article{li2025semba,
title={Shortcut-Enhanced Multimodal Backdoor Attack in Vision-guided Robot Grasping},
author={Chenghao Li and Ziyan Gao and Nak Young Chong},
journal={IEEE Transactions on Automation Science and Engineering},
year={2025}
}