Quantized-TinyLLaVA: a new multimodal foundation model enables efficient split learning

📅 2025-11-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high communication overhead and privacy challenges of split learning for large models across distributed devices, this paper proposes an efficient split-learning framework tailored to multimodal foundation models. The method introduces (1) a learnable embedding quantization scheme grounded in entropy coding theory, which maps high-dimensional embeddings to low-bit integer representations, and (2) joint end-to-end optimization of the discrete representation levels together with the multimodal model architecture, which (3) substantially reduces cross-device communication while preserving model accuracy. Experiments show the approach cuts communication overhead by over 60% in resource-constrained edge settings while strictly adhering to data-localization privacy constraints. This work provides a practical pathway for collaborative training of multimodal large models at the edge.
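A minimal sketch of the learnable quantization idea from point (1), written in PyTorch. It assumes a straight-through estimator and a single learnable step size; the class name and all details are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LearnableQuantizer(nn.Module):
    """Hypothetical sketch: map continuous embeddings to low-bit integer
    codes with a learnable step size. A straight-through estimator keeps
    the rounding step differentiable for end-to-end training."""
    def __init__(self, num_levels: int = 16):        # 16 levels -> 4-bit codes
        super().__init__()
        self.half = num_levels // 2                  # symmetric code range
        self.step = nn.Parameter(torch.tensor(0.1))  # learnable step size

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        scaled = z / self.step
        q = scaled.round().clamp(-self.half, self.half - 1)  # integer codes
        # Straight-through estimator: quantized value on the forward pass,
        # identity gradient on the backward pass.
        q = scaled + (q - scaled).detach()
        return q * self.step                         # dequantize for next partition
```

Only the integer codes (plus the scalar step size) would need to cross the network between partitions, which is where the transmission savings come from.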

📝 Abstract
Split learning is a well-known approach to data privacy: a model is trained across distributed devices so that raw data never needs to be shared. However, high network communication costs remain an impediment to split learning, especially for large foundation models that must transmit large amounts of high-dimensional data. To resolve this issue, we present a new multimodal model structure that incorporates a learning-based data compression method, compressing model embeddings into low-bit integers while preserving the model's performance and greatly reducing transmission costs between partitions. We then determine the optimal number of discrete representation levels based on a solid theoretical foundation from entropy coding.
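
The abstract's last sentence appeals to entropy coding: by Shannon's source-coding theorem, the empirical entropy of the quantized codes lower-bounds the average bits any lossless coder needs per transmitted symbol, so it can guide the choice of the number of levels. A small sketch of estimating that bound (the function below is an illustration, not code from the paper):

```python
import torch

def entropy_bits(codes: torch.Tensor, num_levels: int) -> float:
    """Empirical entropy (bits per symbol) of integer embedding codes;
    a lower bound on the average rate of any lossless entropy coder."""
    # Shift symmetric codes in [-L/2, L/2) to non-negative bin indices.
    idx = codes.flatten().long() + num_levels // 2
    hist = torch.bincount(idx, minlength=num_levels).float()
    p = hist / hist.sum()
    p = p[p > 0]                        # drop empty bins (0 * log 0 = 0)
    return float(-(p * p.log2()).sum())

# Sweeping num_levels and comparing entropy_bits against task accuracy
# traces the rate-accuracy trade-off that determines the optimal levels.
codes = torch.randn(10_000).mul(2).round().clamp(-8, 7)
print(entropy_bits(codes, num_levels=16))  # well below 4 bits for peaked data
```
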
Problem

Research questions and friction points this paper is trying to address.

Reducing network communication costs in split learning
Compressing model embeddings into low-bit integers efficiently
Determining optimal discrete representation levels using entropy coding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal model structure enables efficient split learning
Learning-based compression reduces embedding transmission costs (see the sketch after this list)
Optimal discrete levels determined using entropy coding theory
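
To make the transmission-cost bullet concrete, here is a hypothetical sketch of the client-to-server hop in such a split; the function names and the int8 carrier format are assumptions, not the paper's actual protocol.

```python
import torch

def client_encode(z: torch.Tensor, step: float, half: int):
    """Client side: quantize the cut-layer embedding to small integer codes.
    Only `codes` (int8) and the scalar `step` cross the network."""
    codes = (z / step).round().clamp(-half, half - 1).to(torch.int8)
    return codes, step

def server_decode(codes: torch.Tensor, step: float) -> torch.Tensor:
    """Server side: reconstruct embeddings and run the remaining partition."""
    return codes.float() * step

z = torch.randn(1, 256)                         # client-side embedding
codes, step = client_encode(z, step=0.1, half=8)
z_hat = server_decode(codes, step)
print(codes.element_size() * codes.nelement())  # 256 bytes on the wire
print(z.element_size() * z.nelement())          # 1024 bytes as raw fp32
```

An int8 carrier alone is a 4x saving over fp32; packing two 4-bit codes per byte or entropy coding the codes pushes the reduction further, consistent with the over-60% figure reported above.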
Jiajun Guo
Department of Statistics, University of Michigan, Ann Arbor, MI 48109
Xin Luo
University of Science and Technology of China
Jie Liu
Gilbert S. Omenn Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109