We are looking for a highly skilled Site Reliability Engineer (SRE) with expertise in Data Engineering and PySpark to join our dynamic team. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our data infrastructure and applications.
Key Responsibilities
- Monitor & Maintain the health, performance, and availability of data processing systems and infrastructure.
- Collaborate with data engineers, software developers, and stakeholders to ensure seamless integration and deployment of data solutions.
- Automate & Optimize system reliability and efficiency through scripting and tooling.
- Troubleshoot & Resolve issues related to data processing, infrastructure, and application performance.
- Implement Best Practices for data security, retention, backup, and recovery.
- Provide Production Support, including smoke checks, incident management, and change control.
- Follow the ITIL Framework to ensure adherence to IT key controls and compliance measures.
- Conduct Root Cause Analysis to identify and implement corrective actions for recurring issues.
- Maintain Documentation, runbooks, and troubleshooting guides for data systems and processes.
- Support Data Projects such as system migrations, upgrades, and expansions.
- Participate in On-Call Rotations, including weekend and holiday support, as needed.

Key Focus Areas
- Automate failure detection & handling to minimise downtime.
- Enhance system reliability through proactive monitoring and mitigation strategies.
- Reduce manual effort by identifying opportunities for automation.
- Ensure disaster recovery readiness with well-defined plans.
- Support & optimise Spark platforms for performance and scalability.

What do I need?

Education
- Bachelor's degree in Computer Science, Data Science, or a related field.
Experience
- Proven track record as an SRE or in a similar role within Data Engineering, particularly in managing Spark platforms.
Technical Expertise
- Proficiency in PySpark and related big data technologies (e.g., Spark, Hadoop, Hive).
- Strong understanding of data pipelines built using PySpark.
- Debugging expertise for Spark issues at both the platform and application levels.
- Experience with data processing orchestration and scheduling tools.
- Good knowledge of the Spark ecosystem and distributed computing principles.
- Strong fundamentals in Linux, networking, CPU, memory, and storage.

Soft Skills
- Excellent problem-solving skills with a keen eye for detail.
- Strong communication and collaboration abilities.
- Ability to work efficiently under pressure in a fast-paced environment.

What's it like to work there?
We are a collaborative team of passionate people with a shared ambition to make a difference for our customers, our communities and each other. At Westpac, making a difference means creating impact, unlocking our own and each other's passions, and building transformative success stories to create better futures together.
As well as competitive remuneration and a great culture, joining the Westpac family gives you access to a wide range of employee benefits to help you manage your priorities - whether that means family life, work/life balance, ambition to grow or all the little perks in between.
We'll empower you to shape your career path. Through personalised upskilling, mentoring, and training opportunities, you're in control of where you start and how you'll grow.
As an equal opportunity employer, we are proud to have created a culture and work environment that values diversity and flexibility – and champions inclusion.
We invite candidates of all ages, genders, sexual orientation, cultural backgrounds, people with disability, neurodiverse individuals, veterans and reservists, and Indigenous Australians to apply.
Do you need reasonable adjustments during the recruitment process?