Humans are known to accumulate knowledge over time, which in turn allows them to continuously improve their abilities and skills. This capability, known as lifelong learning, has so far proved difficult to replicate in artificial intelligence (AI) and robotics systems.
A research team at Technical University of Munich and Nanjing University, led by Prof. Alois Knoll and Dr. Zhenshan Bing, has developed LEGION, a new reinforcement learning framework that could equip robotic systems with lifelong learning capabilities.
Their proposed framework, presented in a paper in Nature Machine Intelligence, could help to enhance the adaptability of robots, while also improving their performance in real-world settings.
“Our research originated from a project on robotic meta-reinforcement learning in 2021, where we initially explored Gaussian mixture models (GMM) as priors for task inference and knowledge clustering,” Yuan Meng, first author of the paper, told Tech Xplore.
“While this approach yielded promising results, we encountered a limitation: GMMs require a predefined number of clusters, making them unsuitable for lifelong learning scenarios where the number of tasks is inherently unknown and evolves asynchronously.
“To address this, we turned to Bayesian non-parametric models, specifically Dirichlet Process Mixture Models (DPMMs), which can dynamically adjust the number of clusters based on incoming task data.”
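To give a rough sense of what it means for the number of clusters to grow with the data, the sketch below uses DP-means, a simplified hard-assignment relative of DPMMs, rather than the inference procedure used in LEGION. The two-dimensional "task embeddings", the function name and the distance threshold are all hypothetical and chosen only for illustration.

```python
import math


def cluster_task_stream(points, new_cluster_threshold):
    """One-pass DP-means-style clustering (illustrative, not LEGION's method).

    A point joins its nearest cluster unless that cluster is farther away
    than `new_cluster_threshold`, in which case it starts a new cluster.
    The number of clusters is therefore never fixed in advance.
    """
    centroids = []  # running mean of each cluster
    counts = []     # number of points assigned to each cluster
    labels = []     # cluster index assigned to each incoming point

    for p in points:
        nearest, nearest_dist = None, math.inf
        for k, c in enumerate(centroids):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, c)))
            if d < nearest_dist:
                nearest, nearest_dist = k, d

        if nearest is None or nearest_dist > new_cluster_threshold:
            # Nothing close enough: open a new cluster for this point.
            centroids.append(tuple(p))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # Close enough: assign and update that cluster's running mean.
            counts[nearest] += 1
            n = counts[nearest]
            centroids[nearest] = tuple(
                c + (x - c) / n for c, x in zip(centroids[nearest], p)
            )
            labels.append(nearest)

    return labels, centroids


# Hypothetical stream of 2-D task embeddings arriving one at a time.
stream = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.0), (10.0, 0.0)]
labels, centroids = cluster_task_stream(stream, new_cluster_threshold=2.0)
print(labels)          # cluster index per point
print(len(centroids))  # number of clusters discovered
```

A GMM would need the cluster count supplied up front. Here, a third cluster appears only when the last point arrives far from the first two groups. A full DPMM replaces the hard distance threshold with a probabilistic prior over assignments to existing and new clusters.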
Leveraging a class of models known as DPMMs, the LEGION framework allows algorithms trained via reinforcement learning to continuously acquire, preserve and re-apply knowledge across a changing stream of tasks. The researchers hope that this new framework will help to enhance the learning abilities of AI agents, bringing them one step closer to the lifelong learning observed in humans.
“The LEGION framework is designed to mimic human lifelong learning by allowing a robot to continuously learn new tasks while preserving and reusing previously acquired knowledge,” explained Meng.
“Its key contribution is a non-parametric knowledge space based on a DPMM, which dynamically determines how knowledge is structured without requiring a predefined number of task clusters. This prevents catastrophic forgetting and allows flexible adaptation to new, unseen tasks.”
The new framework introduced by Meng, Prof. Knoll, Dr. Bing and their colleagues integrates language embeddings that are encoded from a pre-trained large language model (LLM). This integration ultimately allows robots to process and understand a user’s instructions, interpreting these instructions independently from task demonstrations.
“Furthermore, our framework facilitates knowledge recombination, meaning a robot can solve long-horizon tasks, such as cleaning a table, by intelligently sequencing previously learned skills like pushing objects, opening drawers, or pressing buttons,” said Meng.
“Unlike conventional imitation learning, which relies on predefined execution sequences, LEGION allows for flexible skill combination in any required order, leading to greater generalization and flexibility in real-world robotic applications.”
The researchers evaluated their approach in a series of initial tests, applying it to a real robotic system. The results were promising: the LEGION framework allowed the robot to consistently accumulate knowledge from a continuous stream of tasks.
“We demonstrated that non-parametric Bayesian models, specifically DPMM, can serve as effective prior knowledge for robotic lifelong learning,” said Meng. “Unlike traditional multi-task learning, where all tasks are learned simultaneously, our framework can dynamically adapt to a task stream with an unknown number of tasks, preserving and recombining knowledge to improve performance over time.”
The recent work by Meng, Prof. Knoll, Dr. Bing and their colleagues could inform future efforts aimed at developing robots that can continuously acquire knowledge and refine their skills over time. The LEGION framework could be improved further and applied to a wide range of robots, including service robots and industrial robots.
“For example, a robot deployed in a home environment could learn household chores over time, refining its skills based on user feedback and adapting to new tasks as they arise,” said Meng. “Similarly, in industrial settings, robots could incrementally learn and adapt to changing production lines without requiring extensive reprogramming.”
In their next studies, the researchers plan to improve the balance between stability and plasticity in lifelong learning, so that robots can reliably retain knowledge over time while still adapting to new environments or tasks. To do this, they will integrate various computational techniques, including generative replay and continual backpropagation.
“Another key direction for future research will be cross-platform knowledge transfer, where a robot can transfer and adapt learned knowledge across different embodiments, such as humanoid robots, robotic arms, and mobile platforms,” added Meng.
“We also seek to expand LEGION’s capabilities beyond structured environments, allowing robots to handle unstructured, dynamic real-world settings with diverse object arrangements. Finally, we envision leveraging LLMs for real-time reward adaptation, enabling robots to refine their task objectives dynamically based on verbal or contextual feedback.”
© 2025 Science X Network