About the job
The successful candidate will develop new methods for context discovery, retrieval, filtering, prioritization, multi-modal data representation, advanced reasoning, tool calling, and reasoning trace validation in conversational, deep research, and agentic AI workflows. The role also covers the capture, management, search, enhancement, and interpretation of metadata and lineage for AI pipelines, enabling reproducibility, reuse, and optimization of those pipelines; the discovery, selection, and use of relevant high-quality data for trustworthy AI outcomes across multiple AI applications; the development, evaluation, and testing of Foundation AI models for different modalities, including Natural Language Processing (NLP), Large Language Models (LLMs), Time Series Analysis, Computer Vision, and AI for Science; and the augmentation of AI models with structured knowledge (i.e., knowledge-infused learning).
Responsibilities
Develop new methods for context discovery, retrieval, filtering, prioritization, multi-modal data representation, advanced reasoning, tool calling, and reasoning trace validation in conversational, deep research, and agentic AI workflows.
Develop methods to capture, manage, search, enhance, and interpret metadata and lineage for AI pipelines, enabling reproducibility, reuse, and optimization of those pipelines.
Discover, select, and use relevant high-quality data for trustworthy AI outcomes across multiple AI applications.
Develop, evaluate, and test Foundation AI models for different modalities: Natural Language Processing (NLP), Large Language Models (LLMs), Time Series Analysis, Computer Vision, AI for Science, etc.
Augment AI models with structured knowledge (i.e., knowledge-infused learning).
Qualifications
Minimum
PhD in Computer Science or a related field with a focus on data engineering and data science, in particular Machine Learning, Deep Learning, and/or data management for AI, plus 3 years of relevant industry experience
Research experience in Generative AI, Deep Learning, and Machine Learning
Experience with advanced AI model architectures: LLMs, Time Series Foundation Models, Diffusion Models, etc.
Expertise with end-to-end pipelines for AI and Machine Learning, in particular the data layer underlying the pipeline
Preferred
Strong programming skills in Python, with high proficiency in data structures and algorithms, as well as C/C++ skills
Experience with CI/CD code development
Outstanding analytical and problem-solving skills
Experience with hybrid AI-HPC workflows (e.g., AI surrogate modeling, computational steering of experiments)
Experience with knowledge graphs and knowledge-infused learning
Research expertise in data and workflow management systems
Experience in system software performance and scalability optimization
Experience with multi-threaded programming, parallel processing, object-oriented design and programming (OOD/OOP), and distributed programming
Experience with containerized development and orchestration tools (e.g., Kubernetes, Ezmeral)