Research Scientist Intern, Multimodal Generative AI and Robotics (PhD)

About the job

The Meta Reality Labs Research Team brings together a world-class team of researchers, developers, and engineers to create the future of contextual AI and robotics. The Surreal Vision group at RL Research is seeking exceptional Research Scientists to research and help build the egocentric machine perception functionalities that will underpin future contextual AI-enabled devices. The research intern will work on cutting-edge research problems to develop novel computer vision and machine learning techniques.

Responsibilities

Plan and execute cutting-edge research and development to advance the state of the art in machine learning and large-scale training.

Collaborate with other researchers and engineers across machine perception teams at Meta to develop experiments, prototypes, and concepts that advance the state of the art in contextual AI and robotic systems.

Work with the team to help design, set up, and run practical experiments and prototype systems related to large-scale, high-quality sensing and machine reasoning.

Qualifications

Minimum

Currently has, or is in the process of obtaining, a PhD degree in computer vision, computer graphics, 3D machine perception, or deep learning

Knowledge of deep learning, computer vision, graphics, generative modeling, LLMs, and VLMs

Hands-on experience implementing deep learning algorithms, large-scale training, benchmarking, and evaluation

Experience working in Python with frameworks such as PyTorch

Experience working in a Unix environment

Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Preferred

Preference for a 24-week, full-time internship

Intent to return to a degree program after the completion of the internship

Proven track record of achieving significant results, as demonstrated by grants, fellowships, patents, and first-authored publications at top-tier conferences such as CVPR, ECCV, ICCV, SIGGRAPH, ICLR, and NeurIPS

Strong track record of published research in fields such as LLMs, VLMs, video generation, world modeling, VLA, human motion modeling, policy learning, and generative modeling

Strong programming experience using Python and PyTorch

Demonstrated software engineering experience via an internship