HHFT: Hierarchical Heterogeneous Feature Transformer for Recommendation Systems

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
In industrial-scale CTR prediction, conventional DNNs have limited capacity to model heterogeneous features (e.g., user profiles, item attributes, behavioral sequences). To address this, we propose HHFT, a Hierarchical Heterogeneous Feature Transformer. Its key designs are: (1) semantic feature partitioning, which groups features into semantically coherent blocks; (2) a heterogeneous Transformer encoder with block-specific QKV projections and FFNs, which prevents semantic interference across feature types; and (3) a Hiformer layer that explicitly captures high-order cross-type interactions. HHFT departs from homogeneous modeling paradigms by keeping feature types semantically separate before fusing them. It achieves a +0.4% improvement in CTR AUC over DNN baselines at scale and, deployed on Taobao's production platform, a +0.6% increase in GMV.

📝 Abstract
We propose HHFT (Hierarchical Heterogeneous Feature Transformer), a Transformer-based architecture tailored for industrial CTR prediction. HHFT addresses the limitations of DNNs through three key designs: (1) Semantic Feature Partitioning: grouping heterogeneous features (e.g., user profile, item information, behavior sequences) into semantically coherent blocks to preserve domain-specific information; (2) Heterogeneous Transformer Encoder: adopting block-specific QKV projections and FFNs to avoid semantic confusion between distinct feature types; (3) Hiformer Layer: capturing high-order interactions across features. Our findings reveal that Transformers significantly outperform DNN baselines, achieving a +0.4% improvement in CTR AUC at scale. We have successfully deployed the model on Taobao's production platform, observing a significant uplift in key business metrics, including a +0.6% increase in Gross Merchandise Value (GMV).
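The idea of block-specific QKV projections can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes each semantic block has already been embedded into a single token vector, uses a single attention head, and omits the block-specific FFNs. The function names, block names, and dimensions are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a (d_out x d_in) matrix by a length-d_in vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, d_out, d_in):
    """Small random matrix standing in for learned weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(d_in)] for _ in range(d_out)]


def heterogeneous_attention(tokens, params):
    """Single-head attention in which every block uses its own Q/K/V matrices.

    tokens: dict mapping block name -> token vector
    params: dict mapping block name -> {"q": ..., "k": ..., "v": ...}
    Returns (outputs, weights), both keyed by block name.
    """
    names = list(tokens)
    # Each block is projected with its own parameters, never shared ones.
    q = {n: matvec(params[n]["q"], tokens[n]) for n in names}
    k = {n: matvec(params[n]["k"], tokens[n]) for n in names}
    v = {n: matvec(params[n]["v"], tokens[n]) for n in names}
    d = len(q[names[0]])

    outputs, weights = {}, {}
    for n in names:
        # Scaled dot-product scores of block n against every block.
        scores = [
            sum(a * b for a, b in zip(q[n], k[m])) / math.sqrt(d) for m in names
        ]
        weights[n] = softmax(scores)
        # Weighted sum of the value vectors of all blocks.
        outputs[n] = [
            sum(w * v[m][i] for w, m in zip(weights[n], names)) for i in range(d)
        ]
    return outputs, weights


rng = random.Random(0)
d_model = 4  # illustrative embedding size
blocks = ["user_profile", "item_info", "behavior_sequence"]
tokens = {b: [rng.uniform(-1.0, 1.0) for _ in range(d_model)] for b in blocks}
params = {
    b: {p: random_matrix(rng, d_model, d_model) for p in ("q", "k", "v")}
    for b in blocks
}
attended, attn = heterogeneous_attention(tokens, params)
```

A homogeneous Transformer would apply one shared set of Q, K, and V matrices to all three tokens. Here the only change is that `params` is indexed by block name, which is the property the abstract credits with avoiding semantic confusion between feature types.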

Problem

Research questions and friction points this paper is trying to address.

Addressing DNN limitations in CTR prediction systems
Handling heterogeneous feature integration without semantic confusion
Capturing high-order interactions across diverse feature types

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Transformer for heterogeneous feature processing
Semantic Feature Partitioning to group domain-specific blocks
Block-specific QKV projections to prevent semantic confusion
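
Semantic feature partitioning can be pictured as a lookup from each raw feature field to the block it belongs to. The sketch below is an assumption-laden illustration, not the paper's pipeline: the field names, the block names, and the `partition_features` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from raw feature fields to semantic blocks.
FIELD_TO_BLOCK = {
    "user_age": "user_profile",
    "user_city": "user_profile",
    "item_category": "item_info",
    "item_price": "item_info",
    "clicked_item_ids": "behavior_sequence",
}


def partition_features(sample, field_to_block):
    """Group the fields of one sample into semantically coherent blocks."""
    blocks = defaultdict(dict)
    for field, value in sample.items():
        block = field_to_block.get(field)
        if block is None:
            # Fail loudly rather than silently mixing an unknown field
            # into an unrelated block.
            raise KeyError(f"no semantic block assigned to field {field!r}")
        blocks[block][field] = value
    return dict(blocks)


sample = {
    "user_age": 31,
    "user_city": "Hangzhou",
    "item_category": "shoes",
    "item_price": 199.0,
    "clicked_item_ids": [101, 205, 333],
}
partitioned = partition_features(sample, FIELD_TO_BLOCK)
```

Each resulting block would then be embedded and handed to its own projections, so that user, item, and behavior features are not forced through shared parameters.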
Liren Yu
Taobao&Tmall Group of Alibaba, Hangzhou, China
Wenming Zhang
Taobao&Tmall Group of Alibaba, Hangzhou, China
Silu Zhou
Taobao&Tmall Group of Alibaba, Hangzhou, China
Zhixuan Zhang
Alibaba Group
natural language processing
Dan Ou
Taobao&Tmall Group of Alibaba, Hangzhou, China