We are looking for an Engineering and Operations Leader to help create the Region Reliability organization. Region Reliability is the cornerstone of continuous engineering improvement by driving efficiencies across AWS through innovation, automation, and operational excellence. This new organization was created to drive consistency in operational practices across AWS, reducing duplicative efforts and refining and accelerating operational excellence and best practices for AWS Amazon Dedicated Cloud Regions to provide a simpler experience for our customers and AWS Builders.
Do you like helping U.S. Intelligence Community and Defense agencies implement innovative cloud computing solutions and solve technical problems? Would you like to do this using the latest cloud computing technologies? Are you the type of person that works with all teams to make operations a better place through simplification of process, creation of automation or building scrappy tools? Would you like to drive the systems operations of the world’s largest scale cloud compute platform? Then this is the job for you. Joining Region Reliability empowers you to drive operational improvements across AWS to delight customers.
Our services operate at large scale with workloads critical to national security. These mission-critical cloud computing solutions require a relentless focus on operational excellence. Given the national security implications of our services, a deep passion for delivering reliable, secure, and high-performing infrastructure is essential. You will share big ideas and execute to deliver the next big innovations at a rapid pace. You are a believer in the DevOps model of service engineering, and you are excited to run operations in an environment where developers don’t toss problems over the wall but solve them. You are happiest when you are working with empowered, world-class engineers to meet world-class challenges. Finally, with your strong ownership bias, you have an infectious desire to continually improve how things are done.
At Amazon, we hire hands-on managers at all levels. This leader must be able to dive deep into the details of business, operations, and engineering and identify how to deliver outcomes where the solution isn’t understood yet. As a Region Reliability Engineering Manager, you will be familiar with the technical implementation of creating, securing, deploying, maintaining, simplifying, and safely deprecating software through development and production environments. With this perspective, you will identify operational excellence improvements and drive their adoption across the organization. You will excel at hiring other Amazon Dedicated Cloud Engineers, developing their technical skills, and reviewing and improving their work. You will grow leaders in your organization to identify manual actions and relentlessly drive improvements through elimination of process, creation of automation, or escalation through owning teams to drive a better experience for our Builders.
We need a technical, detail-oriented operations leader focused on operational excellence to drive best operational practices throughout the business. You will excel at hiring and developing systems-level engineers. You will grow leaders in your organization and lead a charter that grows as your organization grows.
This position requires that the candidate selected must currently possess and maintain an active TS/SCI security clearance. The position further requires that, after start, the selected candidate obtain and maintain an active TS/SCI security clearance with polygraph or commensurate clearance for each government agency for which they perform AWS work.
For inquiries, please reach out to Josh Sacks at sacksjo@amazon.com
Key job responsibilities
In this role you will have the opportunity to:
- Own the continual improvement of operations at AWS.
- Identify Builder operational pain points from disparate data and drive actions to resolve them. Examples include the creation, improvement, simplification, and elimination of processes, and knowing when to create a new solution versus improving existing tools.
- Hire and Develop the Best leaders at AWS by leveraging your technical ability to identify outcomes, design project plans, and drive the right actions to build the technical acumen on the team.
- Collaborate with and learn from world-class leaders to meet world-class challenges, every day.
- Work across Region Reliability to root cause operational challenges and
- Hire, train, and grow new Region Reliability Engineers.
- Drive continual improvement in systems operations through tool building and automation.
- Drive root cause analyses in collaboration with software development teams, and influence local development to improve operational performance.
- Report on the health of these services at an executive level.
- Meet with internal and external customers to develop relationships and clarify requirements and schedules.
- Work in an environment where operational excellence is truly the first priority, and where the degree of automation is above bar.
A day in the life
RRE line managers manage a team of Region Reliability Engineers. A typical day includes working with the team in the SCIF to assign and unblock workflows, working with partner teams to reduce manual tasks through automation, driving root cause analysis for complex issues, and ensuring team SLAs are consistently met. You will mentor and grow your team and drive strategic initiatives to generate efficiencies across operational practices.
Why AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Utility Computing (UC)
AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
About the team
Region Reliability is a new organization reinventing the way AWS operates services on isolated networks. As a technical leader on the team, you will be responsible for operational excellence initiatives to remove manual actions, create automation and onboard services to it, and work with AWS teams to streamline activities.
Your customers are teams across AWS. Region Reliability collects or defines best-in-class solutions and applies them across the company. You will work across teams to apply these best practices, which requires you to influence others effectively.
Leaders are also expected to be proficient with Amazon's and Region Reliability's tooling and processes to deliver for our customers. This enables them to fully understand the work and drive improvements through their team and partner teams.
BASIC QUALIFICATIONS
- Associate's degree, or Cloud+ or GICSP (Global Industrial Cyber Security Professional) or GSEC (GIAC Security Essentials) or SSCP (Systems Security Certified Practitioner)
- 7+ years of relevant hands-on systems engineering and administrative experience in networking, storage systems, and operating systems
- 3+ years of experience as the systems engineering and operations leader for an Internet service or leading-edge IT organization operating in a 24x7 environment
- Current, active US Government Security Clearance of Top Secret or above
PREFERRED QUALIFICATIONS
- Demonstrated success building and leading teams
- Strong systems engineering fundamentals (networking, storage, operating systems)
- Development experience with a high-level language like Python, Ruby, or Java
- Experience leading development life cycle processes and best practices, especially in the areas of deployment automation and monitoring
- Agile engineering practices (Kanban, continuous delivery, etc.)
- Mentoring/training systems engineers and systems development engineers
- Experience with distributed systems at scale
- AWS service usage on commercial or government cloud
- Proven ability to influence teams outside of their organization to drive change
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.