🤖 AI Summary
This study investigates whether promotional Twitter/X bots form stable behavioral families and how such families evolve over time.
Method: We propose a two-layer "behavioral family + mutation spectrum" modeling framework: (1) encoding each bot's posts as a time-ordered symbolic sequence built from seven categorical post-level features, constructing frequency vectors over non-overlapping blocks (k=7), and applying hierarchical clustering based on cosine similarity to identify behavioral families; (2) performing multiple sequence alignment within each family and labeling events as insertions, deletions, substitutions, alterations, or identities, in order to characterize family-specific mutation rates, mutation hotspots, and partly predictable responses to external events (e.g., holidays).
Contribution/Results: We identify four coherent behavioral families with distinct life-cycle dynamics. Bots within the same family share mutations more often than bots from different families, which is consistent with family-dependent behavioral evolution. The framework supports fine-grained, dynamic modeling of how bot behavior adapts, offering insight into the structural regularity and evolutionary mechanisms of social media automation.
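The first layer of the method (block-frequency vectors, cosine similarity, hierarchical clustering) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. It assumes each post has already been mapped to a single character, treats a block as k consecutive symbols, and uses average linkage, which the summary does not specify; the helper names `block_counts`, `cosine`, and `average_linkage` and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def block_counts(seq, k=7):
    """Count non-overlapping blocks of length k (a trailing partial block is dropped)."""
    return Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, k))


def cosine(a, b):
    """Cosine similarity between two sparse block-count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_linkage(vectors, n_clusters):
    """Naive agglomerative clustering on cosine distance (assumed: average linkage)."""
    clusters = [[i] for i in range(len(vectors))]

    def dist(i, j):
        return 1.0 - cosine(vectors[i], vectors[j])

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                d = sum(dist(i, j) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the closest pair of clusters.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy sequences (hypothetical): two bots dominated by one block, two by its reverse.
toy = [
    "ABCDEFG" * 3,
    "ABCDEFG" * 2 + "GFEDCBA",
    "GFEDCBA" * 3,
    "GFEDCBA" * 2 + "ABCDEFG",
]
vectors = [block_counts(s) for s in toy]
families = average_linkage(vectors, n_clusters=2)
print(families)
```

On real data a library routine such as SciPy's hierarchical clustering would replace the quadratic loop; the naive version is used here only to keep the sketch dependency-free.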
📝 Abstract
This paper asks whether promotional Twitter/X bots form behavioural families and whether members evolve similarly. We analyse 2,798,672 tweets from 2,615 ground-truth promotional bot accounts (2006-2021), focusing on complete years 2009 to 2020. Each bot is encoded as a sequence of symbolic blocks ("digital DNA") from seven categorical post-level behavioural features (posting action, URL, media, text duplication, hashtags, emojis, sentiment), preserving temporal order only. Using non-overlapping blocks (k=7), cosine similarity over block-frequency vectors, and hierarchical clustering, we obtain four coherent families: Unique Tweeters, Duplicators with URLs, Content Multipliers, and Informed Contributors. Families share behavioural cores but differ systematically in engagement strategies and life-cycle dynamics (beginning/middle/end). We then model behavioural change as mutations. Within each family we align sequences via multiple sequence alignment (MSA) and label events as insertions, deletions, substitutions, alterations, and identity. This quantifies mutation rates, change-prone blocks/features, and mutation hotspots. Deletions and substitutions dominate, insertions are rare, and mutation profiles differ by family, with hotspots early for some families and dispersed for others. Finally, we test predictive value: bots within the same family share mutations more often than bots across families; closer bots share and propagate mutations more than distant ones; and responses to external triggers (e.g., Christmas, Halloween) follow family-specific, partly predictable patterns. Overall, sequence-based family modelling plus mutation analysis provides a fine-grained account of how promotional bot behaviour adapts over time.
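The mutation-labelling step can be sketched in the same spirit. The example below assumes two rows of an already computed alignment, expressed as lists of block identifiers with "-" marking a gap, and labels each column relative to a reference row. The abstract does not define how "alterations" differ from substitutions, so that label is left out here; the function names and the toy rows are hypothetical, and this is an illustration rather than the paper's procedure.

```python
GAP = "-"


def label_columns(reference, variant):
    """Label each aligned column of `variant` relative to `reference`."""
    if len(reference) != len(variant):
        raise ValueError("aligned rows must have equal length")
    labels = []
    for ref, var in zip(reference, variant):
        if ref == GAP and var == GAP:
            continue  # column is a gap in both rows; nothing to label
        if ref == GAP:
            labels.append("insertion")
        elif var == GAP:
            labels.append("deletion")
        elif ref == var:
            labels.append("identity")
        else:
            labels.append("substitution")
    return labels


def mutation_rate(labels):
    """Share of labelled columns that are not identities."""
    if not labels:
        return 0.0
    return sum(label != "identity" for label in labels) / len(labels)


# Toy aligned rows (hypothetical block identifiers).
reference = ["B1", "B2", "-", "B3", "B4"]
variant = ["B1", "-", "B9", "B5", "B4"]

labels = label_columns(reference, variant)
print(labels)
print(mutation_rate(labels))
```

Counting labels per alignment column across all members of a family would give a simple view of where changes concentrate, which is one way to read the "hotspot" notion in the abstract.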