DevJobs

Senior DevOps Engineer

Skills
  • Bash
  • Python
  • Go
  • Redis
  • MySQL
  • MongoDB
  • Linux
  • GitHub Actions
  • Jenkins
  • AWS
  • GCP
  • Kubernetes ꞏ 4y
  • Terraform
  • Grafana
  • EKS ꞏ 4y
  • Datadog
  • Prometheus
  • Crossplane
  • CloudWatch
  • ClickHouse
Description

We are looking for a Senior DevOps Engineer to play a critical role in maintaining and evolving our highly available, high-traffic platform. In this role, you will ensure our systems are robust, scalable, and efficient, able to absorb significant traffic spikes while meeting current and future business challenges. Your expertise in cloud infrastructure, container orchestration, automation, monitoring, and performance optimization will be essential.

As a Senior DevOps Engineer, you will be an integral part of our team responsible for managing production and development environments, designing CI/CD pipelines, and enforcing best practices in reliability, security, and scalability.

Responsibilities

  • Design and build efficient, resilient cloud architectures using EKS and associated AWS services to address evolving business needs.
  • Develop, manage, and maintain production, staging, and development environments that are resilient, secure, and scalable while improving service-level agreements (SLAs) and performance.
  • Manage, monitor, and scale our distributed, containerized applications to ensure high availability and optimal performance.
  • Work closely with engineering and R&D teams to translate product requirements into scalable, secure technical solutions.
  • Mentor team members and foster an environment of continuous learning and technical excellence.
  • Establish, manage, and optimize CI/CD pipelines using tools such as Jenkins, GitHub Actions, Helm charts, Terraform, and Crossplane.
  • Implement automation strategies to streamline deployments and infrastructure management, ensuring rapid yet reliable release cycles.
  • Implement robust monitoring and logging solutions (e.g., Prometheus, Grafana, CloudWatch, Datadog) to proactively identify and address performance bottlenecks.
  • Lead incident response efforts, conduct root-cause analysis, and drive long-term solutions to prevent recurring issues.

Requirements

  • 7+ years of experience as a DevOps Engineer or Site Reliability Engineer in high-scale, fast-paced environments.
  • 4+ years of hands-on experience with Kubernetes, with a strong emphasis on EKS.
  • Proven experience with cloud infrastructure platforms, particularly AWS (experience with GCP is a plus).
  • Proficiency with Infrastructure as Code (IaC) tools such as Terraform and Crossplane.
  • Strong knowledge of CI/CD systems such as GitHub Actions and Jenkins, and related automation tools.
  • Knowledge of database technologies such as MySQL, MongoDB, and Redis.
  • Extensive hands-on experience with Linux operating systems, Bash scripting, and networking fundamentals.
  • Deep understanding of container orchestration, microservices architecture, and distributed systems.
  • Experience with observability and monitoring tools such as Prometheus, Grafana, CloudWatch, and Datadog.
  • Deep understanding of security best practices and experience leading security initiatives within cloud environments.

Advantages

  • Experience with ClickHouse.
  • Proficiency in Golang and Python.
  • Background in building and optimizing ad-serving or similarly high-traffic platforms.
Minute Media