AI Summary
This work addresses the challenge of modeling operating-system-level GUI dynamics end-to-end for pixel-level screen frame generation from raw user inputs (mouse/keyboard events). We propose the first end-to-end neural simulation framework that jointly models GUI state evolution, using a recurrent neural network, and high-fidelity image synthesis, via a diffusion-based neural renderer, trained on large-scale real-world Ubuntu XFCE screen recordings. Our contributions are threefold: (1) unified modeling of GUI state transitions and their corresponding visual outputs; (2) integration of AI agents to synthesize high-quality interactive trajectories, mitigating the scarcity of human-annotated interaction data; and (3) superior performance over baselines on complex state-transition tasks such as application launching and window switching. Experiments demonstrate photorealistic reproduction of fine-grained interaction details, including cursor motion and animated transitions, advancing the frontier of generative human-computer interface modeling.
Abstract
We introduce NeuralOS, a neural framework that simulates graphical user interfaces (GUIs) of operating systems by directly predicting screen frames in response to user inputs such as mouse movements, clicks, and keyboard events. NeuralOS combines a recurrent neural network (RNN), which tracks computer state, with a diffusion-based neural renderer that generates screen images. The model is trained on a large-scale dataset of Ubuntu XFCE recordings, which include both randomly generated interactions and realistic interactions produced by AI agents. Experiments show that NeuralOS successfully renders realistic GUI sequences, accurately captures mouse interactions, and reliably predicts state transitions like application launches. Although modeling fine-grained keyboard interactions precisely remains challenging, NeuralOS offers a step toward creating fully adaptive, generative neural interfaces for future human-computer interaction systems.
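The two-stage design described above, a recurrent state tracker whose hidden state conditions a diffusion-based frame renderer, can be sketched at the interface level. This is a minimal illustrative sketch, not NeuralOS's actual implementation: the class names, dimensions, and the single linear "denoising" update are all assumptions standing in for the learned U-Net and real event encodings.

```python
import numpy as np

rng = np.random.default_rng(0)

class StateRNN:
    """Illustrative GRU-style tracker: folds a stream of encoded
    mouse/keyboard events into a hidden state vector (hypothetical API)."""
    def __init__(self, event_dim, hidden_dim):
        self.Wz = rng.normal(0, 0.1, (hidden_dim, event_dim + hidden_dim))
        self.Wh = rng.normal(0, 0.1, (hidden_dim, event_dim + hidden_dim))

    def step(self, h, event):
        x = np.concatenate([event, h])
        z = 1.0 / (1.0 + np.exp(-self.Wz @ x))  # update gate
        h_cand = np.tanh(self.Wh @ x)           # candidate state
        return (1 - z) * h + z * h_cand

class DiffusionRenderer:
    """Stub conditional diffusion renderer: iteratively denoises a frame
    conditioned on the RNN state. A real renderer would use a learned
    denoising network; one hand-written update stands in here."""
    def __init__(self, hidden_dim, frame_pixels, steps=4):
        self.cond_proj = rng.normal(0, 0.1, (frame_pixels, hidden_dim))
        self.steps = steps
        self.frame_pixels = frame_pixels

    def render(self, h):
        frame = rng.normal(size=self.frame_pixels)  # start from pure noise
        cond = np.tanh(self.cond_proj @ h)          # conditioning signal
        for _ in range(self.steps):
            frame = frame - 0.5 * (frame - cond)    # step toward conditioned estimate
        return frame

# One simulated interaction step: user event -> state update -> rendered frame
rnn = StateRNN(event_dim=8, hidden_dim=16)
renderer = DiffusionRenderer(hidden_dim=16, frame_pixels=32)
h = np.zeros(16)
click_event = np.zeros(8)
click_event[0] = 1.0  # e.g. a one-hot "mouse click" event (illustrative encoding)
h = rnn.step(h, click_event)
frame = renderer.render(h)
print(frame.shape)
```

The key design point mirrored here is that the renderer never sees raw input events: all interaction history is compressed into the recurrent state, which is the sole conditioning signal for frame generation.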