Neuro-Symbolic Control with Large Language Models for Language-Guided Spatial Tasks

📅 2025-12-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Problem: Large language models (LLMs) exhibit poor stability, slow convergence, and hallucinated actions in language-guided spatial manipulation tasks. Method: We propose a neuro-symbolic collaborative control framework: a lightweight LLM (e.g., Mistral, Phi, or LLaMA-3.2) performs high-level semantic parsing and symbolic decision-making, generating strictly verifiable discrete commands, while a compact neural delta controller executes bounded, continuous incremental motion. Contribution/Results: This is the first work to explicitly decouple semantic reasoning from motor control without reinforcement learning or trial-and-error training, relying solely on synthetic geometric data and symbolic spatial-relation modeling. Experiments show substantial improvements over end-to-end LLM control: average task success rate increases significantly, planning steps decrease by over 70%, inference speed improves by up to 8.83×, and the framework remains robust even with low-quality LLMs.

📝 Abstract
Although large language models (LLMs) have recently become effective tools for language-conditioned control in embodied systems, instability, slow convergence, and hallucinated actions continue to limit their direct application to continuous control. This work proposes a modular neuro-symbolic control framework that clearly separates high-level semantic reasoning from low-level motion execution: a locally deployed LLM interprets tasks symbolically, while a lightweight neural delta controller performs bounded, incremental actions in continuous space. We evaluate the proposed method in a planar manipulation setting in which spatial relations between objects are specified by language. Extensive experiments across numerous tasks and local language models, including Mistral, Phi, and LLaMA-3.2, compare LLM-only control, neural-only control, and the proposed LLM+DL framework. Compared to LLM-only baselines, the results show that neuro-symbolic integration consistently improves both success rate and efficiency, achieving average step reductions exceeding 70% and speedups of up to 8.83x while remaining robust to language model quality. By constraining the LLM to symbolic outputs and delegating continuous execution to a neural controller trained on synthetic geometric data, the framework enhances interpretability, stability, and generalization without reinforcement learning or costly rollouts. These results demonstrate empirically that neuro-symbolic decomposition offers a scalable and principled way to integrate language understanding with continuous control, supporting the development of dependable and efficient language-guided embodied systems.
Problem

Research questions and friction points this paper is trying to address.

Integrates language models with control for spatial tasks.
Addresses instability and inefficiency in continuous control.
Enhances interpretability and generalization without reinforcement learning.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular neuro-symbolic framework separates reasoning from execution
LLM interprets symbolic tasks while neural controller handles continuous actions
Integration boosts success rates and efficiency without reinforcement learning
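The decomposition described above can be illustrated with a minimal sketch. Note that this is a hypothetical toy interface, not the authors' code: the LLM is stubbed by a rule-based policy constrained to a small discrete command vocabulary, and the neural delta controller is stubbed by a fixed bounded step function, mirroring the paper's split between verifiable symbolic commands and bounded continuous increments.

```python
# Discrete, verifiable command vocabulary (stand-in for LLM outputs).
VOCAB = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1), "stop": (0, 0)}
MAX_DELTA = 0.05  # bound on each continuous incremental motion

def symbolic_policy(obj_xy, target_xy, tol=MAX_DELTA):
    """Stand-in for the LLM: emit one discrete command from VOCAB."""
    dx, dy = target_xy[0] - obj_xy[0], target_xy[1] - obj_xy[1]
    if abs(dx) <= tol and abs(dy) <= tol:
        return "stop"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"

def delta_controller(command):
    """Stand-in for the neural controller: bounded continuous increment."""
    ux, uy = VOCAB[command]
    return (ux * MAX_DELTA, uy * MAX_DELTA)

def run(obj_xy, target_xy, max_steps=200):
    """Alternate symbolic decision and bounded execution until 'stop'."""
    steps = 0
    while steps < max_steps:
        cmd = symbolic_policy(obj_xy, target_xy)
        if cmd == "stop":
            break
        dx, dy = delta_controller(cmd)
        obj_xy = (obj_xy[0] + dx, obj_xy[1] + dy)
        steps += 1
    return obj_xy, steps

pos, steps = run((0.0, 0.0), (0.3, 0.1))
```

Because every high-level decision is a discrete token from a fixed vocabulary, it can be checked before execution, while the controller's per-step bound keeps the continuous motion stable; this is the property the paper attributes to the decomposition.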
Momina Liaqat Ali
Department of Computer Science, Middle Tennessee State University, Murfreesboro, TN 37130, USA
Muhammad Abid
Pakistan Institute of Engineering and Applied Sciences
Fault Diagnosis and Tolerance · Control Theory · Industrial Automation