Lead Infrastructure Engineer

NexGen Cloud
Full Time | Australia (Remote) | Posted 1 week ago

Job Overview

This role leads the technical direction for OpenStack and Kubernetes platforms optimized for GPU workloads at a fast-growing cloud infrastructure company. You will own the performance, reliability, and evolution of these platforms while managing a small team of engineers and ensuring the platforms scale to meet global demand for AI, ML, and HPC applications.

Responsibilities

  • Own and drive the design, deployment, and operation of OpenStack and Kubernetes clusters optimized for GPU workloads
  • Lead and develop a team of 4–5 infrastructure engineers, setting clear direction and standards
  • Build and improve infrastructure through automation (IaC, GitOps, CI/CD pipelines)
  • Ensure platform reliability through strong monitoring, observability, and incident management practices
  • Collaborate closely with DevOps, Product, and Support teams to align infrastructure with real-world customer needs
  • Identify opportunities to simplify, standardize, and scale systems as the platform grows
  • Take ownership of operational governance including incident, problem, and change management
  • Communicate clearly with leadership on platform performance, risks, and improvements

Qualifications

  • Strong hands-on experience operating OpenStack in production environments
  • Experience running production-grade Kubernetes clusters (ideally bare metal or private cloud)
  • Solid Linux, networking, and storage fundamentals with a pragmatic troubleshooting approach
  • Experience with infrastructure automation, CI/CD, and Git-based workflows
  • Ability to work in a fast-moving, scale-up environment
  • Proven leadership or mentoring experience within infrastructure/platform teams
  • Experience managing incidents and coordinating response during critical service events
  • Strong communication skills, particularly translating technical issues to non-technical stakeholders