🤖 AI Summary
Problem: In the low-resource biomedical domain of gut microbiome research, relation extraction (RE) suffers from high annotation noise and redundant contextual information in raw literature. Method: This paper proposes a generative RE pipeline in two stages: first, domain-adaptive abstractive summarization of the original texts using large language models (LLMs); second, instruction-tuned generative modeling that directly outputs structured relation triples. Contribution/Results: To the authors' knowledge, this is the first work to integrate summarization as a preprocessing step in a generative RE pipeline, reducing textual noise and focusing the model on salient contextual cues. Preliminary experiments on a newly constructed microbiome corpus show that summarization improves performance over baseline generative models. Although performance remains below that of BERT-based discriminative models, the approach supports the feasibility and promise of generative RE in specialized, data-scarce biomedical settings.
📝 Abstract
We explore a generative relation extraction (RE) pipeline tailored to the study of interactions in the intestinal microbiome, a complex and low-resource biomedical domain. Our method uses summarization with large language models (LLMs) to refine the context before extracting relations via instruction-tuned generation. Preliminary results on a dedicated corpus show that summarization improves generative RE performance by reducing noise and guiding the model. However, BERT-based RE approaches still outperform generative models. This ongoing work demonstrates the potential of generative methods to support the study of specialized domains in low-resource settings.
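The two-stage flow described in the abstract (summarize first, then generate relations) could be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the `LLM` callable type, the prompt wording, the JSON triple output format, and the function names `summarize`, `extract_relations`, and `pipeline` are all assumptions introduced here.

```python
import json
from typing import Callable, List, Tuple

# Hypothetical types: an "LLM" is any function mapping a prompt to generated text.
Triple = Tuple[str, str, str]
LLM = Callable[[str], str]


def summarize(text: str, llm: LLM) -> str:
    """Stage 1: condense the passage to reduce noise (prompt wording is assumed)."""
    prompt = (
        "Summarize the passage, keeping only statements about "
        "microbiome interactions:\n" + text
    )
    return llm(prompt).strip()


def extract_relations(summary: str, llm: LLM) -> List[Triple]:
    """Stage 2: ask a generative model for triples (JSON format is assumed)."""
    prompt = (
        "Extract relations as a JSON list of [head, relation, tail] triples.\n"
        "Text: " + summary
    )
    raw = llm(prompt)
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []  # malformed generation: return no triples rather than guess
    if not isinstance(items, list):
        return []
    # Keep only well-formed three-element entries.
    return [
        tuple(map(str, item))
        for item in items
        if isinstance(item, list) and len(item) == 3
    ]


def pipeline(text: str, summarizer: LLM, extractor: LLM) -> List[Triple]:
    """Summarize, then extract relations from the summary instead of the raw text."""
    return extract_relations(summarize(text, summarizer), extractor)
```

Passing the models in as plain callables keeps the sketch independent of any particular LLM library; in practice the summarizer and the instruction-tuned extractor could be the same or different models.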