Data Scientist 2 · Optum / UnitedHealth Group

About the job

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.

Responsibilities

Participate in the design, development, and deployment of applied machine learning solutions addressing complex business and clinical problems using large-scale healthcare data

Drive the end-to-end model lifecycle: problem framing, feature engineering, model development, evaluation, validation, explainability, deployment, and post-production monitoring

Develop and review production-grade Python code following software engineering best practices (testing, modularization, version control, CI/CD)

Support the architecture of scalable data science workflows using Python, SQL, and distributed data processing frameworks in cloud or enterprise environments

Apply and advance classical ML, deep learning, time-series modeling, and survival analysis techniques based on business needs

Ensure models are interpretable, explainable, and compliant with enterprise governance, regulatory, and ethical standards (e.g., bias, fairness, auditability)

Partner with engineering, product, clinical, and business stakeholders to translate ambiguous problems into actionable analytical solutions

Review and approve modeling approaches, assumptions, and results; influence architectural and methodological decisions across teams

Communicate insights, risks, and tradeoffs clearly to technical and executive audiences

Stay current with emerging methods in applied ML, healthcare analytics, and MLOps, and drive adoption of best practices

Qualifications

Minimum

4+ years of experience building production-quality, maintainable, and testable code

4+ years of experience with machine learning and statistical modeling fundamentals, including:

Feature engineering and selection

Model training, tuning, and evaluation

Model interpretability and explainability (e.g., SHAP, feature attribution)

4+ years of hands-on experience applying deep learning architectures where appropriate

4+ years of experience with time-series analysis and survival analysis

4+ years of experience with vibe coding tools, such as Cursor, Claude Code, and Windsurf

4+ years of experience working with healthcare data, including:

Claims, EHR, lab, and pharmacy data

Coding systems such as ICD, CPT, NDC, SNOMED, and LOINC

Interoperability standards such as FHIR and HL7

Reasoning about data quality, missingness, bias, and confounding in healthcare datasets

4+ years of experience as a contributor in complex applied data science initiatives from concept to production

4+ years of experience working in cross-functional environments with engineering, product, and business teams

4+ years of experience balancing model sophistication, interpretability, scalability, and business impact

Advanced proficiency in Python for data science and ML (Pandas, NumPy, scikit-learn, PyTorch or equivalent)

Advanced SQL skills for complex data transformations and analytical workflows

Preferred

Experience with MLOps practices (deployment, monitoring, retraining, drift detection)

Prior experience in regulated or highly governed environments

Familiarity with cloud platforms and distributed computing (e.g., Spark, Databricks, AWS, GCP, Azure)