
We are seeking a Site Reliability Engineer (SRE) to join the team responsible for our critical infrastructure.
Key Responsibilities
Operate and manage the Kubernetes (RKE2) platform
Set up and manage CI/CD pipelines using tools such as Jenkins and Argo CD
Implement Infrastructure as Code (IaC) and infrastructure automation
Design and maintain monitoring systems using Prometheus, Grafana and Elastic
Develop internal monitoring and quality control tools
Analyze incidents, trends and availability using SQL
Build self-service capabilities for development teams
Participate in on-call rotations, handle production incidents, and lead documented post-mortems
Write and maintain operational documentation
Requirements
Familiarity with TCP/IP and communication protocols
Advanced networking knowledge: deep understanding of TCP/IP, network protocols (UDP, HTTP, HTTPS, DNS, DHCP), and network security
Experience with Kubernetes in a production environment
Experience with Argo CD, Jenkins and other CI/CD tools
Experience with monitoring and logging tools (Prometheus, Grafana, Elastic)
Development experience in Python and knowledge of SQL
Strong understanding of infrastructure and troubleshooting skills in complex environments