Research interests include generative modeling, world models, and embodied AI, with a particular focus on building generative models that can be controlled by natural language, vision, and action signals and used to power world models and embodied AI applications.