About the job
We are looking for an ambitious Systems / Platform Engineer to join a team at the intersection of SRE and low-latency distributed systems. This team will help power Pinterest’s next generation of real-time ML and measurement infrastructure, with a focus on sub-millisecond decisioning, high-throughput data access, and tight integration with Pinterest’s core tech stack.
Responsibilities
Scale the decision-making tools for the tvScientific AI team, from our workflows to our training infrastructure to our Kubernetes deployments
Improve the developer experience for the data science team
Upgrade our observability tooling
Make every deployment smooth as our infrastructure evolves
Qualifications
Minimum
Deep understanding of Linux
Excellent writing skills
A systems-oriented mindset
Experience with high-performance software (e.g., real-time bidding (RTB), high-frequency trading (HFT))
Software engineering experience and reliability expertise (e.g., CI/CD)
Strong observability instincts
Demonstrated ability to use AI to improve speed and quality in your day-to-day workflow for relevant outputs
Strong track record of critical evaluation and verification of AI-assisted work (e.g., testing, source-checking, data validation, peer review)
High integrity and ownership: you protect sensitive data, avoid over-reliance on AI, and remain accountable for final decisions and deliverables
Preferred
Reverse-engineering experience
Terraform, EKS, or MLOps experience
Python, Scala, or Zig experience
NixOS experience
Adtech or CTV experience
Experience deploying a distributed system across multiple clouds
Experience in hard real-time, low-latency (<10 ms) environments