Service Reliability Engineer

  • Australia
  • Sydney
  • Permanent
  • AU$180,000 - AU$190,000 per annum

Location: Sydney (4 days in office, 1 day WFH)
Reports to: Technical Operations Director, APAC
Department: Global Technical Operations

The Opportunity:

A leading music organisation is growing its Global Technical Operations hub in Sydney and is looking for a Service Reliability Engineer (SRE) to join the team.

This is more than a traditional ops role – it’s an opportunity to bring a software engineering mindset to reliability, automation, and scalability in a global, high-impact environment.

What You’ll Do:

You’ll join a collaborative, hands-on team responsible for the stability, performance, and scalability of global platforms. Working closely with development, infrastructure, and security teams, you’ll help build a resilient environment that keeps music flowing – from studio tools to streaming systems.

  • Design and maintain high-availability, high-performance systems for global applications.
  • Automate everything – from infrastructure provisioning to deployment and scaling – using tools like Terraform, Ansible, and Python.
  • Build robust monitoring and observability frameworks with AWS CloudWatch, Dynatrace, Prometheus, Grafana, or Splunk.
  • Optimise CI/CD pipelines to improve reliability and deployment speed.
  • Participate in on-call rotations, troubleshoot incidents, and lead post-incident reviews.
  • Champion SRE principles – embed SLOs, SLIs, and error budgets into everyday engineering.
  • Collaborate across Dev, Infra, and Security teams to create a culture of continuous improvement and reliability.

About You:

You’re a technically strong and level-headed engineer who loves automation, thrives in complex environments, and knows how to balance pragmatism with perfection.

  • Background in systems administration (Linux/Windows) in a large-scale environment.
  • Proficient in at least one programming language (Python, Go, or Java).
  • Hands-on experience with AWS (GCP or Azure is a bonus).
  • Deep understanding of networking, containers (Docker/Kubernetes), and Infrastructure as Code (Terraform, Ansible).
  • Experience with monitoring and observability tools such as Dynatrace, Prometheus, Grafana, or Datadog.
  • Calm, collaborative communicator with strong analytical and problem-solving skills.

Bonus Points For:

  • Experience with ServiceNow or ITIL processes.
  • Knowledge of chaos engineering, resilience testing, or advanced capacity planning.
  • Previous experience managing distributed, global systems in production.

Culture & Perks:

  • Early Friday finish (1pm)
  • Annual bonus
  • Optional 1% additional super with MLC
  • Global collaboration and career growth opportunities

Interested?
Apply now or contact Sophia Parrelli at Talent International for a confidential chat.
