Causal inference of post-transcriptional regulation timelines from long-read sequencing in Arabidopsis thaliana

📅 2025-10-14
🤖 AI Summary
This study reconstructs the temporal order of post-transcriptional maturation events for the *ndhB* and *ndhD* genes in *Arabidopsis thaliana* chloroplasts. Method: The authors propose a three-stage framework (causal discovery, causal inference, chronology construction) for long-read RNA sequencing data, grounded in Pearl’s causal inference theory. Four causal discovery algorithms (HC, PC, LiNGAM, and NOTEARS) each produce a candidate timeline; the NOTEARS $\ell_1$-regularization parameter is selected via stability selection, and missing data are handled by an EM algorithm that jointly imputes missing values and estimates the Bayesian network. Contribution/Results: The framework yields four alternative maturation timelines per gene, which consistently outperform the reference chronologies in both reliability and model fit. Combined with domain expertise, the causal models support experimentally testable intervention hypotheses. The work is presented as a systematic application of causal inference to modeling the timing of post-transcriptional regulation, offering an interpretable and predictive approach to studying RNA maturation mechanisms in plants.

📝 Abstract
We propose a novel framework for reconstructing the chronology of genetic regulation using causal inference based on Pearl's theory. The approach proceeds in three main stages: causal discovery, causal inference, and chronology construction. We apply it to the ndhB and ndhD genes of the chloroplast in Arabidopsis thaliana, generating four alternative maturation timeline models per gene, each derived from a different causal discovery algorithm (HC, PC, LiNGAM, or NOTEARS). Two methodological challenges are addressed: the presence of missing data, handled via an EM algorithm that jointly imputes missing values and estimates the Bayesian network, and the selection of the $\ell_1$-regularization parameter in NOTEARS, for which we introduce a stability selection strategy. The resulting causal models consistently outperform reference chronologies in terms of both reliability and model fit. Moreover, by combining causal reasoning with domain expertise, the framework enables the formulation of testable hypotheses and the design of targeted experimental interventions grounded in theoretical predictions.
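
The abstract does not spell out how a chronology is derived from a learned causal graph. As a minimal sketch, assuming the third stage can be approximated by layering a DAG so that every cause precedes its effects, the following Python uses the standard-library `graphlib` module. The event names and edges are hypothetical placeholders, not results from the paper, and the function `chronology_layers` is an illustrative name.

```python
from graphlib import TopologicalSorter

# Hypothetical causal edges (cause, effect) between maturation events.
# These are illustrative placeholders, NOT findings from the paper.
edges = [
    ("intron_splicing", "editing_site_1"),
    ("intron_splicing", "editing_site_2"),
    ("editing_site_1", "editing_site_3"),
    ("editing_site_2", "editing_site_3"),
]


def chronology_layers(edges):
    """Group events into ordered layers so that each cause precedes its effects.

    Assumption: the causal graph is a DAG; prepare() raises
    graphlib.CycleError otherwise.
    """
    sorter = TopologicalSorter()
    for cause, effect in edges:
        sorter.add(effect, cause)  # 'cause' is a predecessor of 'effect'
    sorter.prepare()
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # events with all causes resolved
        layers.append(ready)
        sorter.done(*ready)
    return layers


layers = chronology_layers(edges)
for step, events in enumerate(layers, start=1):
    print(step, events)
```

Events in the same layer are not ordered relative to one another by the graph alone; resolving such ties would need additional information, such as estimated causal effects or domain expertise.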
Problem

Research questions and friction points this paper is trying to address.

Reconstructing genetic regulation timelines using causal inference
Handling missing data and selecting the $\ell_1$-regularization parameter in NOTEARS
Formulating testable hypotheses by combining causal reasoning with domain expertise
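
To make the missing-data point concrete, here is a minimal EM sketch for a fixed two-node binary network A -> B in which some values of A are unobserved. It is an illustration under stated assumptions, not the authors' implementation: the structure is fixed rather than learned, the variables are binary, and the function name `em_two_node` and the toy data are hypothetical.

```python
def em_two_node(data, n_iter=50):
    """EM parameter estimation for a binary network A -> B.

    `data` is a list of (a, b) pairs with b in {0, 1} and a in {0, 1, None},
    where None marks a missing value of A (assumed missing at random).
    Returns (P(A=1), {a: P(B=1 | A=a)}).
    """
    p_a = 0.5                  # initial guess for P(A=1)
    p_b = {0: 0.4, 1: 0.6}     # initial guess for P(B=1 | A=a)
    for _ in range(n_iter):
        n_a = {0: 0.0, 1: 0.0}   # expected count of A=a
        n_ab = {0: 0.0, 1: 0.0}  # expected count of (A=a, B=1)
        # E-step: soft-impute missing A via its posterior given B.
        for a, b in data:
            if a is None:
                like1 = p_a * (p_b[1] if b else 1 - p_b[1])
                like0 = (1 - p_a) * (p_b[0] if b else 1 - p_b[0])
                w1 = like1 / (like1 + like0)
                weights = {0: 1 - w1, 1: w1}
            else:
                weights = {a: 1.0, 1 - a: 0.0}
            for value, w in weights.items():
                n_a[value] += w
                n_ab[value] += w * b
        # M-step: re-estimate parameters from the expected counts.
        p_a = n_a[1] / len(data)
        p_b = {v: n_ab[v] / n_a[v] for v in (0, 1)}
    return p_a, p_b


# Toy data: 80 complete rows plus 20 rows where A is missing.
data = (
    [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 30
    + [(None, 1)] * 10 + [(None, 0)] * 10
)
p_a, p_b = em_two_node(data)
print(p_a, p_b)
```

The E-step replaces each missing value with a probability-weighted imputation, and the M-step refits the conditional probability tables, which is the sense in which imputation and estimation happen jointly.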
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal inference is used to reconstruct the chronology of genetic regulation
An EM algorithm jointly imputes missing values and estimates the Bayesian network
Stability selection chooses the $\ell_1$-regularization parameter in NOTEARS
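
The general idea behind stability selection can be sketched as follows: refit a sparse learner on many random half-samples and keep only the edges that are selected in a large fraction of them. This is a hedged illustration, not the paper's procedure. In particular, `toy_edge_learner` is a hypothetical stand-in (thresholded correlation) for a real structure learner such as NOTEARS, and the threshold and subsample count are arbitrary choices.

```python
import random
from collections import Counter
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def toy_edge_learner(rows, lam):
    """Stand-in for a sparse structure learner (assumption, not NOTEARS).

    Selects the variable pairs whose absolute correlation exceeds `lam`,
    so a larger `lam` gives a sparser edge set.
    """
    n_vars = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(n_vars)]
    return {
        (i, j)
        for i, j in combinations(range(n_vars), 2)
        if abs(pearson(cols[i], cols[j])) > lam
    }


def stability_selection(rows, lam, n_subsamples=100, threshold=0.8, seed=0):
    """Return edges selected in at least `threshold` of the half-subsamples."""
    rng = random.Random(seed)
    half = len(rows) // 2
    counts = Counter()
    for _ in range(n_subsamples):
        counts.update(toy_edge_learner(rng.sample(rows, half), lam))
    freqs = {edge: c / n_subsamples for edge, c in counts.items()}
    stable = {edge for edge, f in freqs.items() if f >= threshold}
    return stable, freqs


# Synthetic data: variable 1 depends on variable 0; variable 2 is independent.
data_rng = random.Random(1)
rows = []
for _ in range(200):
    x0 = data_rng.gauss(0, 1)
    rows.append((x0, x0 + 0.1 * data_rng.gauss(0, 1), data_rng.gauss(0, 1)))

stable, freqs = stability_selection(rows, lam=0.3)
print(stable)
```

To use this for choosing a regularization strength, one would typically repeat the procedure over a grid of `lam` values and compare how the set of stable edges changes.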