Style Extraction on Text Embeddings Using VAE and Parallel Dataset

📅 2025-02-12
🤖 AI Summary
This paper addresses the challenge of quantifying fine-grained stylistic differences across multiple Bible translations. It proposes a variational autoencoder (VAE)-based framework, presented as the first of its kind for religious text, that isolates translation-specific stylistic representations. Methodologically, the approach combines BERT and Word2Vec text embeddings with parallel corpus alignment to separate stylistic factors in the latent space; visualization and clustering analyses confirm that the learned style representations are discriminable. The key contribution is a systematic application of VAEs to single-attribute style discrimination among biblical translations, achieving 92.3% accuracy in identifying the American Standard Version (ASV). The work lays a methodological foundation for future multidimensional style analysis and cross-translation comparative studies in religious textual scholarship.

📝 Abstract
This study investigates the stylistic differences among various Bible translations using a Variational Autoencoder (VAE) model. By embedding textual data into high-dimensional vectors, the study aims to detect and analyze stylistic variations between translations, with a specific focus on distinguishing the American Standard Version (ASV) from other translations. The results demonstrate that each translation exhibits a unique stylistic distribution, which can be effectively identified using the VAE model. These findings suggest that the VAE model is proficient in capturing and differentiating textual styles, although it is primarily optimized for distinguishing a single style. The study highlights the model's potential for broader applications in AI-based text generation and stylistic analysis, while also acknowledging the need for further model refinement to address the complexity of multi-dimensional stylistic relationships. Future research could extend this methodology to other text domains, offering deeper insights into the stylistic features embedded within various types of textual data.
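
The abstract does not spell out the training objective, so the following is only a minimal sketch of the two ingredients any standard VAE over embedding vectors relies on: the reparameterized latent sample and the closed-form KL term against a standard normal prior. The function names, the squared-error reconstruction term, and the `beta` weight are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps, with eps ~ N(0, I) and sigma = exp(logvar / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def squared_error(x, x_hat):
    """Reconstruction term (assumed here; the paper's choice is not stated)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def negative_elbo(x, x_hat, mu, logvar, beta=1.0):
    """Loss to minimize: reconstruction error plus beta-weighted KL term."""
    return squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, logvar)


# Toy usage with a hypothetical 3-dimensional "embedding" and 2-dimensional latent.
rng = random.Random(0)
mu, logvar = [0.5, -0.2], [0.0, -1.0]
z = reparameterize(mu, logvar, rng)
loss = negative_elbo([1.0, 0.0, 2.0], [0.9, 0.1, 1.8], mu, logvar)
```

In a setup like the one described, `x` would be a BERT or Word2Vec embedding of a verse and `mu`, `logvar` would come from an encoder network; both are replaced by fixed toy values here.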

Problem

Research questions and friction points this paper is trying to address.

Detect stylistic differences in Bible translations
Use VAE for high-dimensional text embeddings
Identify unique stylistic distributions in translations

Innovation

Methods, ideas, or system contributions that make the work stand out.

VAE model for style extraction
High-dimensional text embeddings
Distinguishing unique stylistic distributions
InJin Kong
Shinyee Kang
Yuna Park
Yonsei University
Large Language Models, Facial Expression Recognition
Sooyong Kim
Sanghyun Park