TeNet: Text-to-Network for Compact Policy Synthesis

📅 2026-01-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of enabling robots to follow natural language instructions efficiently, where existing approaches either rely on handcrafted interfaces or employ large end-to-end models that are impractical for real-time deployment. The authors propose TeNet, a text-conditioned hypernetwork framework that uses text embeddings from a pretrained large language model to generate a lightweight, task-specific controller in a single step at policy instantiation. The generated controller operates only on low-dimensional state inputs. Language is used only at policy generation time, and optional behavioral alignment during training improves generalization without requiring demonstrations at inference. Evaluated in multi-task and meta-learning settings on MuJoCo and Meta-World benchmarks, the resulting policies are orders of magnitude smaller than sequence-based baselines, which enables high-frequency real-time control while retaining the general knowledge and paraphrasing robustness of the pretrained language model.

📝 Abstract

Robots that follow natural-language instructions often either plan at a high level using hand-designed interfaces or rely on large end-to-end models that are difficult to deploy for real-time control. We propose TeNet (Text-to-Network), a framework for instantiating compact, task-specific robot policies directly from natural language descriptions. TeNet conditions a hypernetwork on text embeddings produced by a pretrained large language model (LLM) to generate a fully executable policy, which then operates solely on low-dimensional state inputs at high control frequencies. By using language only once, at policy instantiation time, TeNet inherits the general knowledge and paraphrasing robustness of pretrained LLMs while remaining lightweight and efficient at execution time. To improve generalization, we optionally ground language in behavior during training by aligning text embeddings with demonstrated actions, while requiring no demonstrations at inference time. Experiments on MuJoCo and Meta-World benchmarks show that TeNet produces policies that are orders of magnitude smaller than sequence-based baselines, while achieving strong performance in both multi-task and meta-learning settings and supporting high-frequency control. These results show that text-conditioned hypernetworks offer a practical way to build compact, language-driven controllers for resource-constrained robot control tasks with real-time requirements.
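
The text-to-network idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the linear hypernetwork, the two-layer policy, and the `embed_text` placeholder (which stands in for a pretrained LLM encoder) are all assumptions chosen for illustration. The sketch only shows the data flow: an instruction is embedded once, a hypernetwork maps the embedding to a flat parameter vector, and the resulting small policy then acts on state inputs without touching language again.

```python
import zlib
import numpy as np

# Hypothetical sizes, chosen only for illustration.
EMBED_DIM, STATE_DIM, HIDDEN_DIM, ACTION_DIM = 32, 8, 16, 2

# Parameter shapes of the generated policy: a two-layer MLP (an assumption).
POLICY_SHAPES = [
    (STATE_DIM, HIDDEN_DIM), (HIDDEN_DIM,),
    (HIDDEN_DIM, ACTION_DIM), (ACTION_DIM,),
]
N_POLICY_PARAMS = sum(int(np.prod(shape)) for shape in POLICY_SHAPES)


def embed_text(instruction: str) -> np.ndarray:
    """Placeholder for a pretrained LLM encoder (hypothetical).

    Returns a deterministic pseudo-random vector seeded by the text.
    """
    seed = zlib.crc32(instruction.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal(EMBED_DIM)


class HyperNetwork:
    """Linear map from a text embedding to all policy parameters."""

    def __init__(self, rng: np.random.Generator):
        # Untrained random weights; a real system would learn these.
        self.weight = 0.1 * rng.standard_normal((EMBED_DIM, N_POLICY_PARAMS))
        self.bias = np.zeros(N_POLICY_PARAMS)

    def generate(self, text_embedding: np.ndarray) -> list:
        flat = text_embedding @ self.weight + self.bias
        params, offset = [], 0
        for shape in POLICY_SHAPES:
            size = int(np.prod(shape))
            params.append(flat[offset:offset + size].reshape(shape))
            offset += size
        return params


def make_policy(params):
    """Wrap generated parameters into a state -> action function."""
    w1, b1, w2, b2 = params

    def policy(state: np.ndarray) -> np.ndarray:
        hidden = np.tanh(state @ w1 + b1)
        return np.tanh(hidden @ w2 + b2)  # actions bounded in [-1, 1]

    return policy


# Language is used once, at instantiation time.
hyper = HyperNetwork(np.random.default_rng(0))
policy = make_policy(hyper.generate(embed_text("run forward")))

# At control time the policy sees only the low-dimensional state.
action = policy(np.zeros(STATE_DIM))
```

In this sketch the per-step cost depends only on the small generated MLP, not on the text encoder or the hypernetwork, which is the property the abstract credits for high-frequency control.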

Problem

Research questions and friction points this paper is trying to address.

robot policy

natural language instruction

compact policy

real-time control

language-conditioned control

Innovation

Methods, ideas, or system contributions that make the work stand out.

hypernetwork

language-conditioned policy

compact robot control