Remote Contract
PUBLISHED
Nov 14, 2025
EPAM Systems is seeking a seasoned Lead Site Reliability Engineer with expertise in Azure to drive reliability and scalability for our cloud-based infrastructure. In this role, you will lead a team in optimizing system performance, automating deployments, and ensuring high availability for mission-critical applications.
Join EPAM Systems, a global leader in digital platform engineering and software development, as a Lead Site Reliability Engineer specializing in Azure. In this pivotal role, you will spearhead efforts to enhance the reliability, scalability, and efficiency of our clients' cloud infrastructure on the Microsoft Azure platform.
As the lead for SRE practices, you will architect and implement robust monitoring solutions, automate infrastructure provisioning, and optimize application performance to minimize downtime. You will collaborate closely with development, operations, and security teams to foster a culture of DevOps excellence, driving continuous improvement through data-driven insights and proactive problem resolution.
Key responsibilities include leading on-call rotations, conducting root cause analyses for incidents, and defining SLIs and SLOs that align with business objectives. You will also mentor team members, contribute to technical roadmaps, and stay abreast of emerging Azure technologies to improve our service delivery.
If you thrive in dynamic, fast-paced environments and have a passion for building resilient systems, this opportunity at EPAM offers the chance to make a significant impact on high-profile projects across industries.