StructVizor: Interactive Profiling of Semi-Structured Textual Data

📅 2025-03-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Semi-structured text data lacks efficient, automated analysis methods. Method: This paper proposes a structure-aware interactive profiling and transformation framework that integrates rule-enhanced text parsing, unsupervised structural pattern mining, scalable graph visualization, and multi-granularity interactions—including template dragging and semantic remapping. Contribution/Results: It is the first approach to deeply couple structural visualization with profiling-driven cleaning and transformation. A user study (n=12) demonstrates that, compared to conventional tools, our framework reduces task completion time by 47% on average and improves structural understanding accuracy by 3.2×, significantly enhancing both exploratory analysis and data transformation efficiency.

Technology Category

Application Category

📝 Abstract
Data profiling plays a critical role in understanding the structure of complex datasets and supporting numerous downstream tasks, such as social media analytics and financial fraud detection. While existing research predominantly focuses on structured data formats, a substantial portion of semi-structured textual data still requires ad-hoc and arduous manual profiling to extract and comprehend its internal structures. In this work, we propose StructVizor, an interactive profiling system that facilitates sensemaking and transformation of semi-structured textual data. Our tool mainly addresses two challenges: a) extracting and visualizing the diverse structural patterns within data, such as how information is organized or related, and b) enabling users to efficiently perform various wrangling operations on textual data. Through automatic data parsing and structure mining, StructVizor enables visual analytics of structural patterns, while incorporating novel interactions to enable profile-based data wrangling. A comparative user study involving 12 participants demonstrates the system's usability and its effectiveness in supporting exploratory data analysis and transformation tasks.
Problem

Research questions and friction points this paper is trying to address.

Extracting and visualizing structural patterns in semi-structured textual data
Enabling efficient data wrangling operations on textual data
Supporting exploratory data analysis and transformation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interactive profiling system for semi-structured textual data
Automatic data parsing and structure mining
Visual analytics with profile-based data wrangling
🔎 Similar Papers
No similar papers found.