Cloud Operations

Posted on: May 2, 2025

Job description

Job Summary

As a Cloud Operations Engineer with a focus on AWS, you will be at the core of Hapana’s infrastructure, ensuring our cloud environments are scalable, secure and high-performing while driving automation and operational excellence. You’ll collaborate closely with engineering teams to embed DevOps best practices, streamline CI/CD pipelines, enhance system monitoring and observability, and optimise cloud architecture for peak efficiency using the full suite of AWS-native services. Your expertise in automation, performance tuning and security compliance will help shape a robust, resilient and high-availability infrastructure that supports rapid product innovation.

Responsibilities

Key Responsibilities

  • Design, deploy and manage AWS infrastructure to support a scalable SaaS environment, leveraging services such as EC2, S3, RDS, Lambda, ECS/EKS, IAM, and VPC.
  • Implement and maintain Infrastructure as Code (IaC) using tools like Terraform or AWS CloudFormation to enable repeatable, automated infrastructure provisioning.
  • Performance Optimisation: continuously improve system reliability, scalability and cost efficiency through automation, infrastructure improvements and proactive performance tuning, so that resources are well utilised and the user experience stays seamless. Use AWS tools like Trusted Advisor, Compute Optimizer, and CloudWatch for performance insights.
  • Monitoring & Reliability: implement observability tools for logging, monitoring and alerting, such as Amazon CloudWatch, AWS X-Ray, or third-party tools like Datadog or New Relic, to provide real-time visibility into system health, proactive issue detection and rapid incident resolution, maintaining high availability and performance.
  • CI/CD & Automation: design, implement and optimise CI/CD pipelines to streamline development workflows, accelerate deployment cycles and ensure seamless integration, automated testing and reliable software delivery, using tools like AWS CodePipeline, CodeBuild, and CodeDeploy.
  • Security & Compliance: enforce best practices in infrastructure security, access control, vulnerability management and compliance, safeguarding systems against threats while ensuring regulatory adherence and data integrity. This includes using IAM policies, AWS Config, GuardDuty, and Security Hub.
  • Work closely with development teams to optimise applications for cloud-native environments, particularly for containerised deployments using ECS or EKS.
  • Incident Response & Troubleshooting: lead incident management and troubleshooting efforts, ensuring rapid diagnosis, swift resolution and minimal downtime, while continuously improving resilience and response strategies using tools like CloudWatch Alarms and SNS.
  • Establish and maintain backup, disaster recovery and high availability strategies, leveraging services like AWS Backup, Multi-AZ RDS, and Cross-Region Replication.
  • Evaluate and implement new AWS services and technologies to improve efficiency, observability, security posture, and scalability.

Job requirements

Qualifications & Requirements

  • 5+ years of experience in System Administration, DevOps or Cloud Engineering.
  • Strong expertise in cloud platforms (AWS, GCP, or Azure) and container orchestration (Docker, Kubernetes).
  • Strong expertise in AWS services, including EC2, S3, RDS, Lambda, ECS/EKS, and VPC networking.
  • Proficiency in CI/CD tools (Bitbucket Pipelines, GitHub Actions, Jenkins, GitLab CI/CD, etc.).
  • Experience with Infrastructure as Code (Terraform, CloudFormation, or similar).
  • Strong scripting skills in Bash, Python, or similar languages.
  • Deep understanding of networking, security best practices, and system monitoring.
  • Experience with logging and observability tools (New Relic, ELK, Prometheus, Grafana, Datadog, CloudWatch, etc.).
  • Strong troubleshooting and problem-solving skills in high-availability production environments.
  • Knowledge of DevSecOps, compliance frameworks, and cloud security best practices is a plus.