Resume (English only)
Academic Achievements
Published papers on discrete flow matching, improved LLM code generation, simple and controllable music generation, textually pretrained speech language models, self-supervised speech resynthesis with visual input, visually driven prosody for text-to-speech, on-screen sound separation, robust direct speech-to-speech translation, unsupervised audio-visual separation, image segmentation, image denoising, deep shape correspondence, and point cloud sparse coding, among other topics.
Research Experience
Worked at Facebook AI Research (FAIR) and Google Research, leading projects that included on-device audio-visual speech separation and advancements in LLMs.
Education
Holds a PhD; no further education details provided.
Background
An AI and machine learning researcher with a PhD. Research interests include large language models (LLMs), optimization, visual perception, computational photography, and audio-visual methods for speech enhancement. Focuses on applications of LLMs to audio and music generation, as well as text and code generation. Recently explored techniques such as flow matching and diffusion in latent text embeddings.
Miscellany
No personal interests or other related information provided.