🤖 AI Summary
Engineering Manuals (EMs) are lengthy and structurally complex; existing lightweight models treat them as flat text, leading to factual errors, severe hallucinations, and memory inefficiency. To address this, we propose a hierarchical structured framework for EM understanding: (1) a syntax-aware Tree-LSTM fact extractor that precisely parses document structure; (2) a compact Memory-Augmented Neural Network (MANN) coupled with traceable vector indexing for efficient fact retrieval; and (3) a dual-path, six-layer fused Transformer inference architecture, comprising a static-index fast path and a RAG-enhanced dynamic path, to jointly suppress hallucinations. Our model contains only 45.51M parameters (64% fewer than GPT-2), achieves a 21.3% accuracy improvement, significantly reduces hallucination rates, enables sub-second response times, and supports zero-shot adaptation to unseen documents.
📝 Abstract
Engineering Manuals (EMs) are difficult to read: they are long and densely formatted, mixing narrative text, step-by-step procedures, and standard parameter lists for engineering equipment. Off-the-shelf transformers, especially compact ones, treat this material as a flat stream of tokens, which leads to confident but incorrect numeric answers and forces the models to memorize isolated facts inefficiently. SMART (Structured Memory and Reasoning Transformer) offers a practical alternative. SMART processes EMs hierarchically and is built from three main components: (1) a syntax-aware Tree-LSTM fact extractor (the "Grammarian") that extracts subject-relation-object triples from EM sentences; (2) a compact indexed Memory-Augmented Neural Network (MANN) that stores these triples as 384-dimensional vectors linked to their source passages; and (3) a six-layer Transformer that learns to fuse the retrieved facts into its generated response. The full SMART model uses 45.51M parameters, 64% fewer than GPT-2 (124M) and 69% fewer than BERT (133M), and achieves 21.3% higher accuracy than GPT-2, indicating that SMART fits the task with far lower compute requirements. SMART supports two inference modes: an indexed fast path for known documents (sub-second answer times) and a RAG-assisted dynamic path for newly uploaded documents (FAISS Top-20 retrieval with memory capped at 64 slots). In real-world deployment, this framework yields better-grounded answers with fewer hallucinations than comparable small transformer models.
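To make the memory design concrete, here is a minimal sketch of the traceable fact store described above: subject-relation-object triples embedded as 384-dimensional vectors, each tagged with its source passage, with Top-20 retrieval and a 64-slot cap. This is an illustrative assumption, not SMART's actual implementation: NumPy brute-force cosine similarity stands in for the FAISS index, the FIFO eviction policy is invented (the paper does not state one), and all names (`FactMemory`, the sample triples) are hypothetical.

```python
import numpy as np

EMBED_DIM = 384    # fact-vector size from the paper
MEMORY_SLOTS = 64  # memory cap on the dynamic path
TOP_K = 20         # Top-20 retrieval, as in the FAISS setup

class FactMemory:
    """Hypothetical sketch of an indexed fact store:
    (subject, relation, object) triples mapped to 384-d vectors,
    each carrying a source tag so answers remain traceable."""

    def __init__(self):
        self.vectors = np.empty((0, EMBED_DIM), dtype=np.float32)
        self.facts = []  # list of (subject, relation, object, source)

    def add(self, triple, source, vector):
        # Enforce the 64-slot cap; FIFO eviction is an assumption here.
        if len(self.facts) >= MEMORY_SLOTS:
            self.facts.pop(0)
            self.vectors = self.vectors[1:]
        self.facts.append((*triple, source))
        self.vectors = np.vstack([self.vectors, vector.astype(np.float32)])

    def retrieve(self, query_vec, k=TOP_K):
        # Cosine similarity stands in for a FAISS inner-product search.
        q = query_vec / np.linalg.norm(query_vec)
        m = self.vectors / np.linalg.norm(self.vectors, axis=1, keepdims=True)
        scores = m @ q
        top = np.argsort(-scores)[:k]
        return [(self.facts[i], float(scores[i])) for i in top]

rng = np.random.default_rng(0)
mem = FactMemory()
v = rng.standard_normal(EMBED_DIM)
mem.add(("pump P-101", "max_pressure", "16 bar"), "EM sec. 4.2", v)
mem.add(("valve V-3", "tightening_torque", "45 Nm"), "EM sec. 7.1",
        rng.standard_normal(EMBED_DIM))
hits = mem.retrieve(v, k=1)
print(hits[0][0])  # best-matching triple, with its source tag
```

Querying with a vector close to a stored fact returns that triple together with its source tag, which is what lets the fused Transformer cite where a retrieved number came from.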