Team
This project was developed collaboratively as part of the ROS2 Case Study module
at Deggendorf Institute of Technology.
Team Members:
- Ruchit Bhanushali — ROS2 Integration, Motion Planning, MoveIt2, Simulation
- Kaung Sett Thu — RealSense Calibration, Perception, Dataset & YOLOv8 Model
- Siddharth Ahuja — Calibration & Localization
- Sahil Gore
Overview
A ROS2-based perception and manipulation pipeline built at Deggendorf Institute of Technology as part of the ROS2 Case Study module. The system integrates a UR3e collaborative arm, Intel RealSense D435i RGB-D camera, YOLOv8 object detection, MoveIt2 motion planning, and a Robotiq 2F-140 gripper to autonomously sort carrots, tomatoes, and potatoes.
The pipeline runs fully end-to-end: RGB-D capture, YOLOv8 detection, depth-based 3D localization, grasp pose computation, collision-aware MoveIt2 trajectory planning, and UR3e execution - all connected as a live ROS2 Humble system.
The YOLOv8n model was trained for 20 epochs on a 683-image Roboflow dataset with image size 640 and batch size 16, achieving 94.6% precision and 96.1% recall. The MoveIt2 pipeline was validated in simulation with stable pick-and-place trajectories, a correct TF chain (world, base_link, tool0, gripper), and collision-aware Cartesian motion planning between the pre-grasp, grasp, lift, pre-place, and place waypoints.
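The training settings above can be captured in a short script. This is a minimal sketch assuming the Ultralytics Python package is used for training; the dataset path `vegetables/data.yaml` and the function name `train_detector` are hypothetical placeholders, not taken from the project.

```python
# Hyperparameters as stated in the text; the dataset path is a hypothetical placeholder.
TRAIN_ARGS = {
    "data": "vegetables/data.yaml",  # hypothetical path to the Roboflow export
    "epochs": 20,
    "imgsz": 640,
    "batch": 16,
}


def train_detector(weights="yolov8n.pt", train_args=None):
    """Fine-tune a YOLOv8n checkpoint with the settings above."""
    # Imported lazily so the configuration can be inspected
    # on a machine without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**(train_args or TRAIN_ARGS))
```

Calling `train_detector()` would start a training run with these settings. The precision and recall figures reported above come from the project's own run and are not reproduced by this sketch.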
Hardware Setup
UR3e Collaborative Arm
- 6-DOF, force-controlled, reliable for pick-and-place.
Robotiq 2F-140 Gripper
- Adaptive parallel motion, 140 mm opening for variable vegetable sizes.
Intel RealSense D435i
- RGB-D sensing + IMU for real-time depth & pose estimation.
ROS2 Workstation
- Runs YOLOv8 inference, MoveIt2 planning, TF frames, and control nodes.
Software Stack
- ROS2 Humble — Core middleware
- YOLOv8 — Real-time vegetable detection
- MoveIt2 — Grasp generation + trajectory planning
- RViz2 — Visualization & simulation
Engineering Challenges
Building this system surfaced several non-trivial integration problems. MoveIt2's default planners produced valid but visually unnatural trajectories that were unsuitable for real robot execution, which required a full switch to waypoint-based Cartesian pose planning. Collision object management had to be redesigned because adding the target object as a collision object caused planning failures during the grasp phase. The Robotiq gripper TCP offset was not correctly defined in the available URDF references and had to be manually calculated and validated. Controller conflicts between moveit_controllers and ros2_controllers from different packages caused repeated launch failures until the configuration was rebuilt from scratch. Launch ordering across ros2_control, MoveIt, and Gazebo required explicit dependency management to achieve consistent startup behavior.
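To illustrate the waypoint-based approach, the sketch below builds the five Cartesian waypoints named in the Overview (pre-grasp, grasp, lift, pre-place, place) from a grasp position and a place position. It is an illustration under assumptions, not the project's planner code: the `Waypoint` class, the function name, and the 0.10 m `approach_height` are all hypothetical, and orientation is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    name: str
    x: float
    y: float
    z: float


def pick_place_waypoints(grasp_xyz, place_xyz, approach_height=0.10):
    """Return the five Cartesian waypoints in execution order.

    approach_height (metres) is an assumed clearance above the grasp and
    place positions; it is not a value from the project.
    """
    gx, gy, gz = grasp_xyz
    px, py, pz = place_xyz
    return [
        Waypoint("pre_grasp", gx, gy, gz + approach_height),
        Waypoint("grasp", gx, gy, gz),
        Waypoint("lift", gx, gy, gz + approach_height),
        Waypoint("pre_place", px, py, pz + approach_height),
        Waypoint("place", px, py, pz),
    ]
```

In a MoveIt2 setup, each waypoint would typically be converted to a pose in the planning frame and passed to a Cartesian path request. Keeping the sequence as plain data makes it easy to inspect before any motion is attempted.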
Data Flow Pipeline
- RGB-D Capture (RealSense D435i)
- YOLOv8 Detection
- Depth Cloud Cropping
- Grasp Pose Computation
- Pick-and-Place Planning
- UR3e Execution
This creates a fully automated loop from perception to actuation.
Dataset
- Public Roboflow dataset
- 683 images
- Augmented for model robustness
- Classes: carrot, tomato, potato
What Was Built and Delivered
The perception pipeline was fully completed and validated. The YOLOv8n model achieved 94.6% precision and 96.1% recall across all three vegetable classes. The camera ROS2 node publishes four topics in real time: detected object class, 3D position, object dimensions, and annotated YOLO image. Depth processing uses ROI extraction from the RealSense D435i depth stream, filters invalid zero-depth values, and converts bounding box pixels to 3D metric coordinates using camera intrinsics.
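The depth-processing step can be sketched as follows: take the depth values inside the bounding box, discard zero readings, and back-project the box centre with the pinhole camera model. This is a simplified illustration, not the project's node. It assumes depth values in millimetres, uses the median as the ROI statistic (the text does not say which statistic is used), and ignores lens distortion. All function names are hypothetical.

```python
from statistics import median


def roi_depth_m(depth_mm, box):
    """Median of the valid (non-zero) depth readings inside a bounding box.

    depth_mm is a row-major 2D list of depth values in millimetres;
    box is (x_min, y_min, x_max, y_max) in pixels, max-exclusive.
    Returns the depth in metres, or None if the ROI has no valid reading.
    """
    x_min, y_min, x_max, y_max = box
    valid = [
        d
        for row in depth_mm[y_min:y_max]
        for d in row[x_min:x_max]
        if d > 0  # a value of 0 marks a pixel with no depth measurement
    ]
    return median(valid) / 1000.0 if valid else None


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth_m into the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def localize(depth_mm, box, fx, fy, cx, cy):
    """3D position of the bounding-box centre, or None without valid depth."""
    z = roi_depth_m(depth_mm, box)
    if z is None:
        return None
    u = (box[0] + box[2]) / 2.0
    v = (box[1] + box[3]) / 2.0
    return deproject(u, v, z, fx, fy, cx, cy)
```

The intrinsics `fx`, `fy`, `cx`, and `cy` would come from the camera's calibration. The resulting point is expressed in the camera frame and still has to be transformed into the robot's frame before planning.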
The manipulation pipeline was fully set up and simulation-validated. The UR3e arm with Robotiq 2F-140 gripper was integrated into a unified URDF/Xacro model. MoveIt2 was configured with a stable TF tree, correct TCP offset for the Robotiq gripper, and ros2_control for arm and gripper operation. Pose-based Cartesian planning was implemented to avoid joint-angle dependency and ensure real-robot compatibility. Gripper partial closure values were manually calibrated to prevent over-closing during grasp.
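The role of the TCP offset in the TF chain can be shown with plain homogeneous transforms. The sketch below composes world, base_link, tool0, and gripper TCP transforms. Every numeric value is a placeholder chosen for illustration, not the project's calibrated offset, and the helper names are hypothetical.

```python
import math


def transform(tx=0.0, ty=0.0, tz=0.0, yaw=0.0):
    """4x4 homogeneous transform: rotation about z by yaw, plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def compose(*transforms):
    """Multiply transforms left to right, parent frame first."""
    result = transform()  # identity
    for t in transforms:
        result = [
            [sum(result[i][k] * t[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)
        ]
    return result


def position(t):
    """Translation part of a homogeneous transform."""
    return (t[0][3], t[1][3], t[2][3])


# All numeric values below are placeholders, not the project's calibration.
T_world_base = transform(tz=0.80)           # e.g. robot mounted on a table
T_base_tool0 = transform(tx=0.30, tz=0.40)  # would come from forward kinematics
T_tool0_tcp = transform(tz=0.20)            # assumed TCP offset along tool0 z

T_world_tcp = compose(T_world_base, T_base_tool0, T_tool0_tcp)
```

With these placeholder values, `position(T_world_tcp)` is approximately (0.30, 0.0, 1.40). An error in the last transform shifts every grasp pose by the same amount, which is why the offset had to be validated.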
The full end-to-end pipeline - from YOLOv8 detection through depth localization to MoveIt2 pick-and-place execution - was integrated and validated in simulation with consistent, repeatable trajectories across all three vegetable classes.
Real-world deployment on the physical UR3e arm is the next step, with the simulation-to-real transfer designed to require minimal changes to the existing pipeline.
References
- Spanu, A. et al., 2023. Vision-Based Robotic Sorting System for Agricultural Products. Politecnico di Torino. https://webthesis.biblio.polito.it/33164/
- Iftikhar, M. et al., 2024. Computer Vision as a Tool to Support Quality Control and Robotic Handling of Fruit: A Case Study. https://www.researchgate.net/publication/385205971
- Wu, Q. et al., 2023. Vegetable Disease Detection Using an Improved YOLOv8 Algorithm in Greenhouse Plant Environments. https://www.researchgate.net/publication/378371592




