🤖 AI Summary
Current signal processing (SP) pipelines suffer from heavy reliance on expert knowledge, poor generalizability, and low data efficiency. To address these limitations, this paper introduces the first large language model (LLM)-based agent framework tailored for general-purpose signal processing tasks. The framework features a modular architecture and hierarchical planning mechanism, integrating in-context learning, domain-specific retrieval, cross-modal reasoning, code generation, and retrieval-augmented generation (RAG) to automatically decompose high-level objectives into executable subtasks and adaptively execute them. Crucially, it deeply embeds LLMs into the end-to-end SP workflow—enabling robust performance in few-shot and zero-shot settings, where conventional methods falter. Extensive evaluation across five diverse tasks—including radar target detection and human activity recognition—demonstrates substantial improvements over state-of-the-art SP approaches and LLM baselines, validating the framework’s strong generalization capability and data efficiency.
📝 Abstract
Modern signal processing (SP) pipelines, whether model-based or data-driven, are often constrained by complex and fragmented workflows, rely heavily on expert knowledge and manual engineering, and struggle with adaptability and generalization under limited data. In contrast, Large Language Models (LLMs) offer strong reasoning capabilities, broad general-purpose knowledge, in-context learning, and cross-modal transfer abilities, positioning them as powerful tools for automating and generalizing SP workflows. Motivated by this potential, we introduce SignalLLM, the first general-purpose LLM-based agent framework for general SP tasks. Unlike prior LLM-based SP approaches that are limited to narrow applications or rely on brittle, ad hoc prompting, SignalLLM introduces a principled, modular architecture. It decomposes high-level SP goals into structured subtasks via in-context learning and domain-specific retrieval, followed by hierarchical planning through adaptive retrieval-augmented generation (RAG) and refinement; these subtasks are then executed through prompt-based reasoning, cross-modal reasoning, code synthesis, model invocation, or data-driven LLM-assisted modeling. Its generalizable design enables the flexible selection of problem-solving strategies across different signal modalities, task types, and data conditions. We demonstrate the versatility and effectiveness of SignalLLM through five representative tasks in communication and sensing, such as radar target detection, human activity recognition, and text compression. Experimental results show superior performance over traditional and existing LLM-based methods, particularly in few-shot and zero-shot settings.