Remote Contract
PUBLISHED
Nov 21, 2025
Join Rapid7's innovative team as a Staff Site Reliability Engineer, where you'll drive the reliability, scalability, and performance of our cloud-based security platforms. Collaborate with cross-functional teams to implement robust monitoring, automation, and incident response strategies that keep our services running smoothly 24/7.
Rapid7 is a leading cybersecurity company dedicated to empowering organizations to proactively secure their digital environments. As a Staff Site Reliability Engineer (SRE), you will play a pivotal role in maintaining the uptime, performance, and resilience of the global cloud infrastructure that powers our threat detection, vulnerability management, and incident response solutions.
In this senior position, you will lead efforts to design, implement, and optimize systems that ensure our services meet stringent SLAs. You'll work closely with software engineers to embed reliability practices into the development lifecycle, automate operational tasks, and conduct post-incident analyses to prevent future disruptions. Your expertise in distributed systems, fault-tolerant architectures, and chaos engineering will be crucial in scaling our platforms to handle increasing workloads securely and efficiently.
Key responsibilities include proactively monitoring system health, developing runbooks for common issues, participating in on-call rotations, and mentoring junior team members. You'll use tools such as PagerDuty for alerting and Splunk for logging, together with advanced metrics, to drive data-informed decisions. At Rapid7, we value innovation and work-life balance, offering you the chance to contribute to cutting-edge security technologies while growing your career in a dynamic, collaborative environment.