Member of Technical Staff - Data Platform

About the job

If you are excited by the challenge of designing distributed systems that process petabytes of data for the world's most advanced AI models, this is your team. We are not looking for someone to just write queries or maintain legacy pipelines. We are looking for Systems Builders—engineers who understand the internals of distributed compute, who treat data infrastructure as a product, and who want to architect the backbone of Microsoft Copilot.

Responsibilities

Core Platform Engineering: Design and build the underlying frameworks (based on Spark/Databricks) that allow internal teams to process massive datasets efficiently, abstracting away the complexity of "ETL" into self-service infrastructure.

Distributed Systems Architecture: Modernize our data stack by moving from batch-heavy patterns to event-driven, streaming architectures that reduce data latency for AI inference.

Unstructured AI Data Pipelines: Architect high-throughput pipelines capable of processing complex, non-tabular data (documents, code repositories, chat logs) for LLM pre-training, fine-tuning, and evaluation datasets.

AI Feedback Loops: Engineer the high-throughput telemetry systems that capture user interactions with Copilot, creating the critical data loops required for Reinforcement Learning and model evaluation.

Infrastructure as Code: Treat the data platform as software. Define and deploy all storage, compute, and networking resources using IaC (Bicep/Terraform) rather than manual configuration.

Data Reliability Engineering: Move beyond simple "validation checks" to build automated governance and observability systems that detect anomalies in the data mesh before they impact downstream models.

Compute Optimization: Deep-dive into query execution plans and cluster performance. Optimize shuffle operations, partition strategies, and resource allocation to ensure our platform is as cost-efficient as it is fast.

Qualifications

Minimum

Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 3+ years experience in business analytics, data science, software development, data modeling, or data engineering
OR Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4+ years experience in business analytics, data science, software development, data modeling, or data engineering
OR equivalent experience.

Preferred

Bachelor's or Master's Degree in Computer Science, Software Engineering, or related technical field.

4+ years of experience in Software Engineering or Data Infrastructure.

Proficiency in Python, Scala, Java, or Go. You write production-grade application code with unit tests, CI/CD, and modular design.

Deep Distributed Systems Knowledge: Demonstrated technical understanding of massive-scale compute engines (e.g., Apache Spark, Flink, Ray, Trino, or Snowflake). You should understand internals like query planning, memory management, and distributed consistency.

Experience architecting Lakehouse environments at scale (using Delta Lake, Iceberg, or Hudi).

Experience building internal developer platforms or "Data-as-a-Service" APIs.

Strong background in streaming technologies (Kafka, Azure EventHubs, Pulsar) and stateful stream processing.

Experience with container orchestration (Kubernetes) for deploying data applications.

Experience enabling AI/ML workloads (Feature Stores, Vector Databases).