About the job
As a Research Intern, you’ll contribute to cutting-edge research in multimodal AI, focusing on the synergy between vision and language. You’ll explore Large Language Models (LLMs), Small Language Models (SLMs), and Vision-Language Models (VLMs) to tackle problems like video understanding, document layout analysis, chart interpretation, and multi-page question answering. This role offers hands-on experience in leveraging modern LLMs for document understanding, grounding, and retrieval-based generation. You’ll prototype, experiment, and publish impactful research under the guidance of Microsoft mentors—gaining exposure to both fundamental and applied research in a dynamic environment.
Responsibilities
Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but also contribute to exciting advances in research and development. During the 12-week internship, Research Interns are paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research and are offered year-round, though they typically begin in the summer.
Qualifications
Minimum
Currently enrolled in a PhD program in Computer Vision, Natural Language Processing, Deep Learning, Machine Learning, AI, or a related field.
At least 1 year of research experience in LLMs, NLP, computer vision, deep learning, or multimodal AI, including hands-on deep learning work.
Other Requirements: Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship. In addition to the qualifications below, you’ll need to submit a minimum of two reference letters for this position, as well as a cover letter and any relevant work or research samples. After you submit your application, a request for letters may be sent to your list of references on your behalf. Note that reference letters cannot be requested until after you have submitted your application, and that they might not be automatically requested for all candidates. You may wish to alert your letter writers in advance so they will be ready to submit your letter.
Preferred
Proficiency in algorithmic problem solving and software development (Python, C/C++, etc.).
Experience with open-source tools such as PyTorch.
Publication(s) in top-tier conferences or journals in related fields (e.g., ACL, CVPR, ECCV, ICCV, EMNLP, NAACL, NeurIPS, ICML, ICLR, IJCV, PAMI).