AI Summary
This work proposes FaceTell, a novel system that demonstrates for the first time that human faces in video conferencing can act as an effective optical side channel by unintentionally reflecting screen content. By integrating computer vision, optical modeling, and machine learning techniques, FaceTell extracts faint facial reflection signals from ordinary webcam footage to accurately infer the user's currently active application. Evaluated across 13 real-world indoor environments, 24 participants, and four major video conferencing platforms, FaceTell achieves a 99.32% accuracy rate in identifying 28 popular applications, thereby establishing the practical feasibility and severity of this cross-platform side-channel attack under realistic conditions.
Abstract
In video conferencing, human faces serve as the primary visual focal points, playing multifaceted roles that enhance visual communication and emotional connection. However, we argue that a human face is also a side channel, which can unwittingly leak on-screen information through online video feeds. To demonstrate this, we conduct feasibility studies, which reveal that, illuminated by both ambient light and light emitted from displays, the human face can reflect optical variations of different on-screen content. We then propose FaceTell, a novel side-channel attack system that eavesdrops on fine-grained application activities from pervasive yet subtle facial reflections during video conferencing. We implement FaceTell in a real-world testbed with three different brands of laptops and four mainstream video conferencing platforms. FaceTell is then evaluated with 24 human subjects across 13 unique indoor environments. With more than 12 hours of video data, FaceTell achieves a high accuracy of 99.32% for eavesdropping on 28 popular applications and is resilient to many practical impact factors. Finally, potential countermeasures are proposed to mitigate this new attack.
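To make the feasibility observation concrete, the toy sketch below shows how a faint screen-dependent component could, in principle, be separated from static skin tone and ambient light. It is not FaceTell's pipeline, which the abstract does not detail. The function names, the fixed face bounding box, the 5% reflection gain, and the synthetic frames are all assumptions made for illustration.

```python
from statistics import mean

def roi_mean_rgb(frame, box):
    """Average (R, G, B) over a rectangular face region.

    frame: list of rows, each a list of (r, g, b) tuples.
    box:   (top, left, bottom, right), with bottom/right exclusive.
    """
    top, left, bottom, right = box
    pixels = [px for row in frame[top:bottom] for px in row[left:right]]
    return tuple(mean(px[c] for px in pixels) for c in range(3))

def reflection_trace(frames, box):
    """Per-frame ROI colour with the temporal mean removed.

    Subtracting the mean over time discards the static part (skin tone,
    constant ambient light) and keeps the part that varies with the screen.
    """
    raw = [roi_mean_rgb(f, box) for f in frames]
    baseline = tuple(mean(v[c] for v in raw) for c in range(3))
    return [tuple(v[c] - baseline[c] for c in range(3)) for v in raw]

def synthetic_frame(screen_rgb, size=8, skin=(150, 110, 95), gain=0.05):
    """Toy frame (assumed model): uniform skin colour plus a faint
    share of the current screen colour."""
    px = tuple(skin[c] + gain * screen_rgb[c] for c in range(3))
    return [[px] * size for _ in range(size)]

# Hypothetical scenario: a bright document view followed by a dark editor.
frames = [synthetic_frame((240, 240, 240))] * 3 + [synthetic_frame((30, 30, 40))] * 3
trace = reflection_trace(frames, box=(2, 2, 6, 6))
```

In this toy setup, the trace is positive while the bright content is shown and negative once the dark content appears. A real attack would additionally need face tracking, compensation for video compression and varying ambient light, and a trained classifier over such traces, none of which is modeled here.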