SMART SLM: Structured Memory and Reasoning Transformer, A Small Language Model for Accurate Document Assistance

📅 2025-12-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Engineering Manuals (EMs) are lengthy and structurally complex; existing lightweight models treat them as flat text, leading to factual errors, severe hallucinations, and memory inefficiency. To address this, we propose a hierarchical structured framework for EM understanding: (1) a syntax-aware Tree-LSTM fact extractor that precisely parses document structure; (2) a compact Memory-Augmented Neural Network (MANN) coupled with traceable vector indexing for efficient fact retrieval; and (3) a dual-path, six-layer fused Transformer inference architecture, comprising a static-index fast path and a RAG-enhanced dynamic path, that jointly suppresses hallucinations. Our model contains only 45.51M parameters (64% fewer than GPT-2), achieves a 21.3% accuracy improvement, significantly reduces hallucination rates, enables sub-second response times, and supports zero-shot adaptation to unseen documents.

📝 Abstract
Users of Engineering Manuals (EMs) find them difficult to read: they are long and densely formatted, mixing narrative text, step-by-step procedures, and standard parameter lists for engineering equipment. Off-the-shelf transformers, especially compact ones, treat this material as a flat stream of tokens. This approach leads to confident but incorrect numeric answers and forces the models to memorize isolated facts inefficiently. SMART (Structured Memory and Reasoning Transformer) offers a different and practical solution to this problem. SMART processes documents hierarchically and is built from three main components: (1) a syntax-aware fact extractor (the "Grammarian"), a Tree-LSTM that extracts subject-relation-object facts from EM sentences; (2) a compact indexed memory, a Memory-Augmented Neural Network (MANN) that stores these subject-relation-object triples as 384-dimensional vectors linked to their source passages; and (3) a 6-layer Transformer that learns to fuse the retrieved facts into its generated response. The entire SMART model uses 45.51M parameters, 64% fewer than GPT-2 (124M) and 69% fewer than BERT (133M), and achieves 21.3% higher accuracy than GPT-2, indicating that SMART fits the task better with far lower compute requirements. SMART employs dual modes of inference: an indexed fast path for known documents (sub-second answer times) and a RAG-assisted dynamic path for new uploads (FAISS Top-20 retrieval with memory capped at 64 slots). In real-world deployment, this framework yields better-supported answers with fewer hallucinations than comparable small transformer models.
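The retrieval mechanism the abstract describes (subject-relation-object triples stored as 384-dimensional vectors, Top-20 nearest-neighbour lookup, memory capped at 64 slots) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the class and method names are invented, NumPy cosine similarity stands in for the FAISS index, and the oldest-first eviction policy is an assumption, since the paper's actual policy is not stated here.

```python
import numpy as np

class FactMemory:
    """Sketch of an indexed fact memory: (subject, relation, object)
    triples stored alongside embedding vectors, capped at a fixed slot budget."""

    def __init__(self, dim=384, max_slots=64):
        self.dim = dim
        self.max_slots = max_slots
        self.triples = []                      # [(subject, relation, object), ...]
        self.vectors = np.empty((0, dim))      # one unit-normalized row per triple

    def add(self, triple, vector):
        # Oldest-first eviction once the slot budget is exhausted
        # (assumed policy; the paper does not specify one).
        if len(self.triples) >= self.max_slots:
            self.triples.pop(0)
            self.vectors = self.vectors[1:]
        self.triples.append(triple)
        v = np.asarray(vector, dtype=float)
        self.vectors = np.vstack([self.vectors, v / np.linalg.norm(v)])

    def retrieve(self, query_vector, top_k=20):
        """Return the top_k most similar stored triples by cosine similarity,
        mimicking a FAISS Top-20 lookup over the fact index."""
        q = np.asarray(query_vector, dtype=float)
        q = q / np.linalg.norm(q)
        scores = self.vectors @ q
        order = np.argsort(-scores)[:top_k]
        return [(self.triples[i], float(scores[i])) for i in order]
```

In a real deployment one would swap the brute-force dot product for a FAISS index, which gives the same Top-k semantics at much lower latency for large fact stores.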
Problem

Research questions and friction points this paper is trying to address.

Addresses difficulty in reading long, dense engineering manuals
Mitigates confident but incorrect numeric answers caused by flat token processing
Reduces hallucinations in document assistance with structured memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical processing with syntax-aware fact extraction
Compact indexed memory for efficient fact storage
Dual inference modes for fast and dynamic responses
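The dual inference modes could be dispatched roughly as below. Every name here is a placeholder for interfaces the paper does not expose: `static_index`, `rag_retriever`, and `transformer` are hypothetical objects standing in for the indexed memory, the RAG retriever, and the 6-layer fusion Transformer.

```python
def answer(query, doc_id, static_index, rag_retriever, transformer):
    """Hypothetical dispatcher over the two inference paths:
    a static-index fast path for already-indexed documents, and a
    RAG-assisted dynamic path for newly uploaded ones."""
    if doc_id in static_index:
        # Fast path: facts were indexed ahead of time, so retrieval
        # is a direct Top-20 lookup (sub-second answers).
        facts = static_index[doc_id].retrieve(query, top_k=20)
    else:
        # Dynamic path: the new document is embedded on the fly and
        # relevant facts are fetched via RAG-style retrieval.
        facts = rag_retriever.retrieve(query, top_k=20)
    # The Transformer fuses the retrieved facts into the generated answer.
    return transformer.generate(query, facts)
```

The design choice this illustrates is that both paths converge on the same fact-fusion step, so the generator never has to distinguish known from unseen documents, which is what enables zero-shot adaptation to new uploads.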
Divij Dudeja
Department of Computer Science Engineering, Indian Institute of Information Technology, Nagpur, Nagpur 441108, IN
Mayukha Pal
Global R&D Leader - Cloud & Advanced Analytics, ABB Ability Innovation Center
Data Science · Physics-Aware Analytics · Power System Analytics · Biomedical Signal Processing