🤖 AI Summary
Traditional materials research relies on unstructured textual narratives, hindering systematic database construction and impeding machine learning applications. Method: This work introduces a “language-native lightweight-structured database” paradigm, specifically for boron nitride nanosheet (BNNS)-polymer thermal-conductive composites. It systematically integrates heterogeneous textual fragments—from synthesis protocols and characterization reports to theoretical calculations and mechanistic reasoning—while preserving scientific narrative integrity. The database supports composite queries combining semantic search, keyword matching, and numerical filtering. Contribution/Results: Serving as a high-fidelity knowledge foundation, it significantly enhances retrieval-augmented generation (RAG) and tool-augmented agents’ joint reasoning and retrieval capabilities in materials discovery. It enables traceable, verifiable standard operating procedure (SOP) generation, advancing large language models in materials R&D from broad “skim-reading” to precise, context-aware “deep interpretation.”
📝 Abstract
Chemical and materials research has traditionally relied heavily on knowledge narrative, with progress often driven by language-based descriptions of principles, mechanisms, and experimental experiences, rather than tables, limiting what conventional databases and ML can exploit. We present a language-native database for boron nitride nanosheet (BNNS) polymer thermally conductive composites that captures lightly structured information from papers across preparation, characterization, theory-computation, and mechanistic reasoning, with evidence-linked snippets. Records are organized in a heterogeneous database and queried via composite retrieval with semantics, key words and value filters. The system can synthesizes literature into accurate, verifiable, and expert style guidance. This substrate enables high fidelity efficient Retrieval Augmented Generation (RAG) and tool augmented agents to interleave retrieval with reasoning and deliver actionable SOP. The framework supplies the language rich foundation required for LLM-driven materials discovery.