TechOps - Site Reliability Engineer, Azure, Cloud Platform
Job Description
Join us as we pursue our exciting vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people passionate about our product and seeking to deliver the best experience for our customers. At Splunk, we’re committed to our work and customers, having fun and, most significantly, to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!
Role:
Splunk is looking for a TechOps Engineer with the ability to provide day-to-day technical expertise for our Splunk Cloud Azure TechOps team and the Splunk organization. This position is responsible for making key technical decisions that help drive the operational infrastructure that delivers Splunk’s customer-facing SaaS systems. As a TechOps Engineer, you will collaborate with other cross-functional leaders on critical initiatives. You will partner with senior engineers to solve difficult problems. You will help grow and mentor the broader operational team and interact with senior leadership to propose solutions. We're looking for someone to bring a fresh approach to problems of all shapes and sizes and help us build a top-notch Splunk Cloud TechOps team. This is a remote role available in Sydney, Australia, with an option to support FedRAMP Moderate environments.
You will:
1. Own Splunk Cloud in Microsoft Azure environments and AWS FedRAMP environments.
2. Work across the organization to deliver quality products that delight Splunk's passionate users.
3. Lead teams of tight-knit engineers who are building an innovative, cloud-based environment for massive-scale data processing.
4. Mentor and help new engineers to achieve more than they thought possible. You enjoy making other teams successful and are fulfilled through the success of others.
5. Attain the Splunk Cloud Certified Architect certification within the first 12 months of your hire date.
Qualifications:
6. You have experience or an interest in working with regulated computing environments such as FISMA and/or FedRAMP and are enthusiastic about doing it better.
7. Experience working within an Azure environment.
8. Experience working in a fully remote position and team.
9. You are passionate about building and running distributed systems at scale in production. You understand the challenges and trade-offs involved in building and deploying systems to production.
10. You constantly consider, "How can I automate this process?"
11. Knowledge of best practices related to security, performance, and disaster recovery.
12. Skilled in identifying performance bottlenecks, spotting anomalous system behaviour, and resolving the root causes of incidents.
13. Experience monitoring cloud environments using tools like Splunk, VictorOps, and Nagios.
14. You care about good documentation and appreciate how it allows a distributed team to function.
15. Ability to take on complex problems, resolve operational issues, and interact with vendors to find solutions.
16. Comfortable working with critical, customer-facing issues and able to prioritize quickly when escalations happen.
17. Deep understanding of Linux systems (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.), or equivalent certification.
18. Must hold the AZ-900 Azure Fundamentals certification or, preferably, the AZ-104 Azure Administrator certification.
19. You've demonstrated the skills to effectively collaborate across teams and functions to influence the design, operations, and deployment of highly available software.
20. You are interested in working hard to make the users of Splunk's products happier every day.
21. Ability to work nights, weekends, on-call rotations, and 4x10 shifts.
Preferred skills:
22. Experience monitoring cloud environments with Splunk.
23. Experience with at least one programming language, preferably Go (Golang) or Python. Knowledge of working with and automating Linux system tasks using this language, including working with configuration files and system services. Knowledge of common data structures and algorithms, as well as their performance characteristics, is required.
24. Experience with large-scale distributed cloud service development, infrastructure, traffic management and architecture.
25. Experience with distributed architectures/systems with optimized and scalable software that operates on a large number of nodes.
26. Familiarity with GitLab, Puppet, Jenkins, clustering, web apps, and YAML.
27. Ability to support FedRAMP Moderate environments.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying. For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.
Note: Splunk provides flexibility and choice in the working arrangement for most roles, including remote and/or in-office roles. We have a market-based pay structure which varies by location. Please note that the base pay range is a guideline and for candidates who receive an offer, the base pay will vary based on factors such as work location as set out below, as well as the knowledge, skills and experience of the candidate. In addition to base pay, this role is eligible for incentive compensation and may be eligible for equity or long-term cash awards.
Benefits are an important part of Splunk's Total Rewards package. This role is eligible for a comprehensive, competitive benefits package which may include healthcare and retirement plans, paid time off, wellbeing expense reimbursement, and much more! Learn more about our comprehensive benefits and wellbeing offering.
Base Pay Range
Australia
Base Pay: AUD 117,200.00 - 161,150.00 per year