About the job
It takes powerful technology to connect our brands and partners with an audience of hundreds of millions of people. Whether you're looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process trillions of data points a day, what you do here will have a huge impact on our business and the world.
Our Tooling and Reliability Platforms team operates as a foundational pillar of the Central Technology Organization. We provide the paved road for Yahoo's diverse verticals, enabling them to ship world-class products at a global scale. Our mission is to build the modern, secure, and highly efficient platforms that power all of Yahoo's brands, with a relentless focus on Engineered Resilience.
The Tooling and Reliability team is looking for a Software Dev Engineer I (IC2) focused on building and maintaining the vital services that support our primary incident management platform. You will be at the forefront of our AIOps initiative, building new AI services designed to provide service teams with instant context and automated triaging capabilities during high-pressure incidents. In this role, you will be expected to utilize AI-augmented workflows to increase engineering velocity and ensure the resilience of our global platforms.
Responsibilities
Build, deploy, and maintain high-availability services that integrate our core Incident Management tools with Yahoo's internal ecosystem.
Develop new AI services and agents that ingest massive amounts of telemetry to provide on-call engineers with real-time summaries, historical context, and automated root-cause hypotheses.
Use Infrastructure as Code (Terraform/CloudFormation) to manage serverless infrastructure, ensuring reliability tools are as resilient as the services they monitor.
Identify and implement AI-driven efficiencies in your day-to-day development, replacing manual, repetitive tasks with automated or AI-assisted workflows.
Leverage AI pair-programming tools to accelerate code reviews and ensure high unit-test coverage for all new reliability services.
Verify and validate AI-generated code and infrastructure outputs to ensure they meet Yahoo's security and resilience standards.
Qualifications
Minimum
Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience).
Proficiency in at least one programming language (Python or Go preferred).
Working knowledge of AI-assisted development tools (e.g., GitHub Copilot, Amazon CodeWhisperer, or Cursor).
Understanding of Linux/Unix environments and basic networking concepts.
Familiarity with Git and version control.
Preferred
Experience with cloud providers (AWS, GCP).
Familiarity with Docker, Kubernetes, or serverless architectures.
Demonstrated experience using prompt engineering to assist in debugging complex system failures or generating technical documentation.
Interest in machine learning applications for DevOps and site reliability.