Rackspace Technology is a leading provider of expertise and managed services across all the major public and private cloud technologies. We’ve evolved Fanatical Support to encompass the entire customer journey, providing Fanatical Experience™ from first consultation to daily operations. Our passionate experts combine proactive, always-on service and expertise with best-in-class tools and automation to deliver technology when and how our customers need it.
We are seeking a dynamic and experienced Operational Lead – Day-to-Day (D2D) Operations with a strong focus on OpenStack, Kubernetes, and Cloud Management Platforms (CMPs). This role requires a hands-on technical leader to ensure seamless operations, service reliability, and high availability of cloud-native and virtualization platforms. The ideal candidate will possess in-depth knowledge of private cloud infrastructure, container orchestration, and enterprise-grade cloud management solutions, along with strong team leadership and incident resolution capabilities.
Responsibilities
- Lead daily operations and health checks of OpenStack environments and Kubernetes clusters.
- Manage infrastructure lifecycle, incident response, and routine maintenance of CMP tools and platforms.
- Ensure system availability, reliability, and performance across the private and hybrid cloud ecosystem.
- Coordinate with platform, network, and security teams for infrastructure integration and operational support.
- Proactively monitor environments, manage alerts, and troubleshoot platform issues end-to-end.
- Implement and refine SOPs, automation scripts, and best practices for Kubernetes and OpenStack operations.
- Drive RCA (Root Cause Analysis) and problem management for recurring issues and major incidents.
- Collaborate with DevOps and CI/CD teams to align infrastructure and application deployment pipelines.
- Maintain and report on service metrics, availability, and SLA/KPI adherence.
- Guide and mentor platform operations engineers to ensure continuous skills development and knowledge sharing.
- Participate in on-call rotations and provide escalation support for critical issues.
Requirements
- Bachelor’s degree in Computer Science, Information Technology, or a related discipline.
- 8+ years of IT infrastructure experience, including 3+ years in a cloud operations or platform engineering leadership role.
- Strong hands-on experience with OpenStack administration and operations.
- Proficiency with Kubernetes (K8s) cluster operations, deployments, upgrades, and security.
- In-depth understanding of CMP tools (e.g., Morpheus, VMware Aria, Red Hat CloudForms, OpenNebula).
- Knowledge of Linux system administration, networking fundamentals, and storage integrations.
- Familiarity with container technologies (Docker, CRI-O, containerd) and container runtime management.
- Working knowledge of service mesh, ingress controllers, and cloud-native observability (Prometheus, Grafana, Fluentd, etc.).
- Strong experience with automation and infrastructure-as-code (Ansible, Terraform, Helm).
- Solid grasp of ITIL processes, incident and change management, and high-availability design principles.
- Excellent problem-solving skills and communication abilities.
Additional Skills (Good to have)
- Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD)
- OpenStack certification or equivalent enterprise experience
- Experience with hybrid cloud integration (e.g., OpenStack with Azure/AWS/GCP)
- Hands-on experience with GitOps, ArgoCD, or Flux
- Experience working in DevSecOps environments
- Familiarity with security controls and compliance in containerized environments
- ITIL Foundation certification