Research Intern - AI Safety & Reliability for LLM Systems

Microsoft
San Francisco Bay Area / New York City metropolitan area
2026-02-10
Onsite

About the job

Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment. This Research Internship focuses on improving the reliability and trustworthiness of artificial intelligence (AI) systems that support complex, real-world decision-making. The Research Intern will study how large language model (LLM)–based assistants behave when relevant information is incomplete or unevenly available, and will explore methods for detecting such gaps and adapting system responses accordingly. The work emphasizes uncertainty awareness, responsible reasoning, and robustness, contributing to safer and more dependable AI systems in enterprise settings.

Responsibilities

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting strides in research and development. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research and are offered year-round, though they typically begin in the summer.

Qualifications

Minimum

Currently enrolled in a PhD program in computer science, machine learning, statistics, human-computer interaction, or a related field.

Preferred

Proficiency in Python and experience with common ML and data processing libraries.
Experience with large language models and/or retrieval-augmented generation (RAG) or related approaches.
Prior research experience in machine learning, NLP, or human-centered AI, demonstrated through publications, preprints, or substantial projects suitable for peer-reviewed venues such as NeurIPS, ICML, FAccT, AIES, CHI, or CSCW.
Proficient written and verbal communication skills for presenting and documenting research.
Interest in AI reliability, robustness, safety, or responsible AI research.