Identifying bubble-like subgraphs in linear-time via a unified SPQR-tree framework

📅 2026-04-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a unified framework for efficiently identifying bubble-like subgraphs (superbubbles, snarls, and ultrabubbles), which correspond to regions of genetic variation, in both directed and bidirected graphs. Building on the SPQR-tree decomposition, which encodes all 2-separators of a biconnected graph, the method combines dynamic-programming-style tree traversals with the computation of feedback arcs to achieve the first linear-time algorithms for identifying all snarls and all ultrabubbles, along with a linear-size representation of all snarls. Notably, the study proves that all feedback arcs can be computed in linear time on tipless bidirected graphs, whereas on general bidirected graphs the problem is at least as hard as matrix multiplication, assuming the k-Clique Conjecture. The same framework also yields a new linear-time algorithm for finding all superbubbles, so all three subgraph types are handled within one approach.
📝 Abstract
A fundamental algorithmic problem in computational biology is to find all subgraphs of a given type (superbubbles, snarls, and ultrabubbles) in a directed or bidirected input graph. These correspond to regions of genetic variation and are useful in analyzing collections of genomes. We present the first linear-time algorithms for identifying all snarls and all ultrabubbles, resolving problems open since 2018. The algorithm for snarls relies on a new linear-size representation of all snarls with respect to the number of vertices in the graph. We employ the well-known SPQR-tree decomposition, which encodes all 2-separators of a biconnected graph. After several dynamic-programming-style traversals of this tree, we maintain key properties (such as acyclicity) that allow us to decide whether a given 2-separator defines a subgraph to be reported. A crucial ingredient for linear-time complexity is that acyclicity of linearly many subgraphs can be tested simultaneously via the problem of computing all arcs in a directed graph whose removal renders it acyclic (so-called feedback arcs). As such, we prove a fundamental result for bidirected graphs that may be of independent interest: all feedback arcs can be computed in linear time for tipless bidirected graphs, while in general this is at least as hard as matrix multiplication, assuming the k-Clique Conjecture. Our results form a unified framework that also yields a completely different linear-time algorithm for finding all superbubbles. Although some of the results are technically involved, the underlying ideas are conceptually simple and may extend to other bubble-like subgraphs. More broadly, our work contributes to the theoretical foundations of computational biology and advances a growing line of research that uses SPQR-tree decompositions as a general tool for designing efficient algorithms, beyond their traditional role in graph drawing.
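To make the notion of feedback arcs concrete, here is a minimal sketch for ordinary directed graphs. It is a brute-force illustration only and not the paper's linear-time method: it removes each arc in turn and re-tests acyclicity, so it takes O(m(n + m)) time. The function names and the integer vertex labelling are assumptions made for this example, and it does not handle bidirected graphs.

```python
from collections import defaultdict


def is_acyclic(num_vertices, arcs):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = [0] * num_vertices
    successors = defaultdict(list)
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    # Repeatedly peel off vertices that have no remaining incoming arcs.
    stack = [v for v in range(num_vertices) if indegree[v] == 0]
    peeled = 0
    while stack:
        u = stack.pop()
        peeled += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                stack.append(v)
    # Every vertex can be peeled exactly when there is no cycle.
    return peeled == num_vertices


def feedback_arcs_bruteforce(num_vertices, arcs):
    """Return every arc whose individual removal leaves an acyclic graph.

    Hypothetical helper for illustration: one acyclicity test per arc.
    """
    result = []
    for i, arc in enumerate(arcs):
        remaining = arcs[:i] + arcs[i + 1:]
        if is_acyclic(num_vertices, remaining):
            result.append(arc)
    return result


# Two cycles, 0->1->2->0 and 1->2->3->1, share only the arc (1, 2).
example_arcs = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 1)]
print(feedback_arcs_bruteforce(4, example_arcs))
```

In the example, only the arc shared by both cycles is reported. Under this definition, every arc of an already acyclic graph also qualifies, since removing it leaves the graph acyclic.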
Problem

Research questions and friction points this paper is trying to address.

superbubbles
snarls
ultrabubbles
computational biology
directed graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

SPQR-tree
linear-time algorithm
bubble-like subgraphs
feedback arcs
computational biology
Francisco Sena
Department of Computer Science, University of Helsinki, Helsinki, Finland
Aleksandr Politov
Department of Computer Science, University of Helsinki, Helsinki, Finland
Corentin Moumard
ENS Lyon, Lyon, France
Massimo Cairo
Department of Computer Science, University of Helsinki, Helsinki, Finland
Romeo Rizzi
Associate Professor of Operations Research, Verona University
combinatorial optimization, algorithms, computational complexity, operations research, biologia computazion
Manuel Cáceres
Aalto University
Graph Algorithms, Algorithmic Bioinformatics, Compressed Data Structures
Sebastian Schmidt
Department of Computer Science, University of Helsinki, Helsinki, Finland
Juha Harviainen
Postdoctoral Researcher, University of Helsinki
Randomized Algorithms, Perfect Sampling, Probabilistic Graphical Models, Parameterized Algorithms
Alexandru I. Tomescu
Department of Computer Science, University of Helsinki, Helsinki, Finland