Site Reliability Engineer
Nectar Inc
Job Overview
This is a short-term, remote contract engagement to validate a practitioner-level skills assessment in Fundamentals of Site Reliability Engineering. Participants complete the assessment and a brief post-assessment survey, which together take approximately 1-2 hours and must be finished within a 5-business-day window. No teaching, content creation, or ongoing advisory work is involved.
Responsibilities
- Complete a practitioner-level skills assessment for validation and standard-setting purposes.
- Complete a short post-assessment survey providing feedback on the experience.
Qualifications
- Current practitioner with applied, real-world experience in Site Reliability Engineering fundamentals.
- Ability to explain the origins, core tenets, and cultural requirements of SRE.
- Understanding of why SRE implements DevOps principles.
- Skills in designing user-centric Service Level Indicators (SLIs) and achievable Service Level Objectives (SLOs).
- Experience defining and managing Error Budgets and understanding Service Level Agreements (SLAs).
- Familiarity with implementing the Four Golden Signals for monitoring.
- Ability to design vendor-agnostic monitoring and alerting systems.
- Experience using distributed tracing in microservices and unified telemetry stacks.
- Experience implementing incident response using the Incident Command System (ICS).
- Experience conducting blameless post-mortems that lead to systemic improvements.
- Experience managing toil through automation, using Infrastructure as Code and CI/CD.
- Experience applying SRE principles to change management and to structuring SRE functions.
- Experience implementing best practices for system reliability, including self-healing systems.
- Experience managing the human impact of SRE roles, and ability to explain the organizational benefits of SRE.