Principal Data Scientist - Remote · Optum / UnitedHealth Group

About the job

As a Principal Data Scientist, you will serve as a senior individual contributor responsible for designing and delivering advanced AI/ML and Generative AI systems that address complex healthcare and operational challenges. You will lead the development of large-scale machine learning solutions, architect GenAI and agent-based systems, and drive technical innovation across high-impact AI initiatives. This role requires deep hands-on expertise in machine learning, LLM systems, distributed AI architectures, and production-grade ML platforms.

Responsibilities

Lead development of advanced AI/ML systems using techniques such as deep learning, representation learning, time-series modeling, survival analysis, and probabilistic modeling to solve complex healthcare problems

Develop Generative AI and LLM-powered solutions including retrieval-augmented generation (RAG) pipelines, domain-adapted LLMs, and AI copilots for enterprise workflows

Architect scalable AI and ML systems including feature engineering pipelines, feature stores, model training workflows, model serving infrastructure, monitoring systems, and automated retraining pipelines

Build agentic AI and autonomous workflows using agent frameworks, agentic skills, Model Context Protocol (MCP) integrations, and agent-to-agent (A2A) communication patterns

Advance model evaluation, reliability, and monitoring strategies including offline metrics, LLM benchmarking, safety testing, hallucination mitigation, and drift detection

Drive responsible AI practices including explainability, interpretability, bias detection, fairness evaluation, and governance aligned with enterprise and regulatory standards

Serve as a senior technical authority in AI/ML by mentoring data scientists and reviewing complex modeling approaches and architectures

Collaborate with engineering, platform, and product teams to operationalize scalable AI/ML and GenAI systems within enterprise platforms

Qualifications

Minimum

10+ years of experience in machine learning, artificial intelligence, or applied data science, with 7+ years of experience designing and deploying production machine learning systems

8+ years of experience in Python-based machine learning development using frameworks such as PyTorch, TensorFlow, or equivalent, along with solid SQL skills

5+ years of experience deploying production ML systems including model serving, monitoring, ML lifecycle management, and collaboration with engineering teams

3+ years of experience developing Generative AI or LLM-based applications including prompt engineering, RAG pipelines, LLM evaluation, and safety guardrails

1+ years of experience building or evaluating agentic AI systems including AI agents, agentic skills, Model Context Protocol (MCP), agent-to-agent (A2A) interaction patterns, or autonomous workflows

Preferred

Experience working with distributed ML systems and large-scale data platforms such as Spark, Databricks, Ray, or Kubernetes-based ML systems

Experience deploying AI solutions on cloud platforms such as AWS, Azure, or GCP

Experience working with healthcare datasets and standards such as claims, EHR, ICD, CPT, SNOMED, FHIR, or HL7

External contributions such as publications, patents, or open-source projects in machine learning or generative AI