Escalation Engineer
You have the opportunity to make a global impact through your daily work and to be part of a close-knit, supportive team that advocates for the best customer experience.
About Us
The Escalation and Event Management (E2M) team, part of the Amazon Web Services (AWS) Support organization, is dedicated to managing critical escalations, customer-facing communications, and large-scale customer-impacting events.
E2M's purpose is to drive operational excellence and improvements to the overall customer support experience.
Amazon has built a reputation for excellence with a mission to be Earth's most customer-centric company.
AWS continues in that tradition, leading the world in cloud technologies.
About You
E2M is looking for people who are detail-oriented, analytical thinkers as well as creative problem solvers.
You confidently act as an advocate on behalf of AWS customers, maintaining composure and leadership in dynamic, high-pressure situations.
You are excited about owning critical infrastructure services that serve global customers 24x7, and you relish the opportunity to work on technical initiatives that drive continuous improvement in the support experience of those customers.
You do all this while collaborating with some of the smartest people in the industry.
About The Role
As part of the E2M 'Event Management' team, we work to identify widespread and/or systemic customer-facing problems for Amazon Web Services.
We are responsible for monitoring internal tools to identify customer-impacting issues.
When an issue is identified, we ensure that the appropriate business and technology leaders and their teams are engaged to drive the restoration of disrupted AWS services. We also act as advocates for the customer, both reporting on and managing the customer experience.
Because of our unique role as Escalation Engineers, we have broad exposure to all things AWS, including numerous leading-edge technologies.
Key Job Responsibilities:
* Providing critical incident response and management (including leading calls with internal and external participants) for customers' critical workloads and AWS Service Teams
* Providing concise and timely communication on developing and progressing issues to AWS Support customers, as well as internal stakeholders
* Working to improve important metrics such as 'mean time to engagement' and 'mean time to communication' for all incident types
* Facilitating Root Cause Analysis and Post Event Reviews after each event to minimize recurrence
* Working with key stakeholders across AWS as advocates on behalf of customers to drive improvements in their AWS experience and develop mechanisms that support and improve E2M's ability to deliver on that objective
* Analyzing data trends on internal tickets, customer contacts, social media, and network and infrastructure monitoring to identify potential issues
* Building a broad understanding of AWS architecture and service inter-dependencies
* Designing, building, or collaborating on solutions using automation and self-repair rather than relying on human intervention
BASIC QUALIFICATIONS
* 3+ years of network and operating system support experience
* 3+ years of technical support experience
* 3+ years of information security and compliance experience
* 4+ years of distributed systems experience
PREFERRED QUALIFICATIONS
* Knowledge of distributed computing environments
* Knowledge of security best practices
* Experience with network troubleshooting tools (telnet, test-netconnection, tracert, tracetcp, iperf, ntttcp, dig, and packet capture tools)
* Experience with incident management