Principal Data Scientist – R&D DSDH – Preclinical Sciences & Translational Safety (PSTS)

Johnson & Johnson
Spring House, PA, USA / Titusville, NJ, USA / Cambridge, MA, USA
2026-05-08
Full time

About the job

The R&D Data Science organization is seeking a Principal Data Scientist to apply advanced machine learning, robust data engineering, and domain expertise to inform decisions and generate actionable insights within the Preclinical Sciences & Translational Safety (PSTS) organization. In this role, you will work closely with multidisciplinary teams, including toxicologists, PK/PD specialists, in vivo researchers, and safety professionals, to create AI-ready datasets, develop predictive models, and deliver analytical solutions that improve safety evaluations and facilitate translational research.

Responsibilities

Develop and deploy ML/AI models to support safety signal detection, dose selection, PK/PD modeling, toxicology insights, and translational interpretation.

Implement representation‑learning, predictive modeling, and multivariate analytics for datasets spanning in vivo studies, in vitro assays, exposure‑response data, and pathology information.

Partner with scientific SMEs to design modeling strategies aligned with PSTS decision points.

Apply model governance, versioning, and validation standards consistent with R&D AI practices.

Build and maintain scalable data pipelines that integrate PSTS‑relevant data sources (e.g., toxicology studies, PK/PD datasets, biomarker readouts, animal study repositories).

Transform raw experimental outputs into standardized, analysis‑ready, AI‑ready datasets using Python, R, and cloud‑native services.

Contribute to harmonized scientific data models in collaboration with data engineering and ontology teams.

Work directly with toxicology, DMPK, and safety stakeholders to interpret scientific context and translate study designs into computational requirements.

Apply understanding of mechanism‑based toxicology, exposure‑response concepts, and in vivo study structures to guide data transformations and modeling strategies.

Enhance cross‑study comparability via standardized terminologies, metadata practices, and quality checks.

Collaborate with PSTS functional experts, R&D Data Science teams, and platform architects to ensure high-quality, scalable data solutions.

Qualifications

Minimum

Advanced degree (MS or PhD) in Data Science, Computational Biology, Toxicology, Pharmacology, Biomedical Engineering, Computer Science, or a related field.

3+ years of experience applying machine learning and/or data engineering to scientific or biomedical datasets.

Proficiency with Python and/or R, SQL, and modern data engineering tooling (cloud computing, workflow orchestration, version control).

Experience with ML model development, evaluation, and deployment pipelines.

Experience working with biological, toxicology, PK/PD, or in vivo datasets.

Preferred

Experience in safety sciences, ADME/DMPK, toxicogenomics, or biomarker analytics.

Familiarity with scientific data formats (e.g., assay outputs, histopathology data, PK time-course datasets).

Exposure to ontologies, semantic technologies, or knowledge graph integration for scientific domains.

Experience with cloud‑based data architectures (e.g., AWS S3, Snowflake, Amazon Redshift).

Understanding of regulatory data standards (e.g., CDISC SEND).