techno_utopian | Robotics Portfolio

Overview

This ongoing humanoid robotics research framework investigates how curiosity, purpose-biased intrinsic motivation, and human feedback (“bonding” signals) can enable a humanoid robot to adapt, refine, and align its behavior within dynamic industrial environments.

Unlike scripted control pipelines, this humanoid explores and learns through novelty, human interaction cues, and an internal behavioral trait vector that shapes how it perceives, explores, and collaborates.

This framework forms the foundation of my Master’s thesis:

“Curiosity and Bonding Driven Reinforcement Learning for Adaptive Humanoid Collaboration in Industrial Environments.”

Core Idea

Industrial humanoids must handle:

Constantly shifting workspaces
Unpredictable human movements
Ambiguous human cues
Exploration without risking safety
Trust building through behavior

This research integrates a four-part adaptive RL system:

1. Intrinsic Curiosity (ICM/RND)

Internal rewards for surprising or informative transitions, enabling exploration beyond hard coded behaviors.

2. Purpose-Biased Motivation Prior

A task domain bias that aligns curiosity with industrial affordances tools, equipment, workspace geometry, rather than random actions.

3. Bonding Engine (Human Feedback)

Human gestures, voice cues, confirmations, or corrections are transformed into social reward shaping, promoting intuitive collaboration.

4. Inherited Trait Vector (ITV)

A compact behavioral profile that influences risk tolerance, approach distance, proactivity, and exploration style.

Together, these allow the humanoid to:

Explore more intelligently
Adapt RL policies through human cues
Maintain safe behavior
Refine industrial task skills over time
Synchronize with human collaborators

System Architecture

Hardware & Simulation Bodies

This framework spans three humanoid platforms:

Poppy Humanoid (open-source, modular research body)
https://www.poppy-project.org/en/robots/poppy-humanoid
Unitree H1 / H1-Lite (industrial humanoid reference)
https://www.unitree.com/mobile/h1/
Custom lightweight humanoid prototype

This enables scalable testing across locomotion, manipulation, perception, and social feedback conditions.

Software Stack

ROS 2 Humble – distributed control and modular nodes
PyTorch – RL agents, curiosity models, and ITV learning
Gazebo / MuJoCo / Isaac sim – full humanoid simulation
OpenCV – saliency, stereo depth, gesture cues
Audio models – speech & cue detection for bonding signals
C++/Python ROS2 nodes – perception, reward shaping, control graphs

Subsystems Developed

Curiosity Engine

ICM/RND with industrial-purpose weighting to avoid meaningless exploration.

Bonding Engine

Gesture/voice processing → reward pulses for human-aligned decision shaping.

Inherited Trait Vector (ITV)

A trait layer driving risk posture, exploration, proactivity, and cooperation tendencies.

Perception Layer

Stereo depth
Attention/saliency
Workspace monitoring
Gesture recognition
Voice/intent detection

Control & Behavior Layer

Action primitives
Safety-first control graphs
Policy-conditioned motion generation
Interpretable behavior models

Current State of Research

Completed:

Curiosity–bonding–ITV RL architecture
Saliency + stereo depth perception stack
Gesture/voice feedback reward injection
Early RL testing through Poppy & Unitree models
Modular CAD for custom humanoid prototype
ROS2 ↔ PyTorch policy interface

Next phase: closed-loop RL experiments integrating curiosity, bonding, and ITV for task acquisition.

Why This Research Matters

Industrial humanoids need:

Autonomy
Adaptive behavior
Perception-driven intelligence
Safe exploration
Trust-building

This framework demonstrates how RL, curiosity, perception, and human feedback can unify into a single architecture for future industrial humanoid coworkers.

It integrates cognitive robotics, social signal processing, and reinforcement learning into a practical, scalable framework.

References

Du, Y. et al., 2025. Learning Human-Humanoid Coordination for Collaborative Object Carrying.
https://arxiv.org/abs/2510.14293
Kerzel, M. et al., 2023. NICOL: A Neuro-inspired Collaborative Semi-humanoid Robot.
https://arxiv.org/abs/2305.08528
Puig, X. et al., 2023. Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots.
https://arxiv.org/abs/2310.13724
Pathak, D. et al., 2017. Curiosity-driven Exploration by Self-supervised Prediction.
https://arxiv.org/abs/1705.05363