Lead Site Reliability Engineer (SRE) Chennai, India

Description

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multinational teams, contribute to a myriad of innovative projects that deliver creative, cutting-edge solutions, and have the opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are seeking a talented and motivated Lead Site Reliability Engineer (SRE) to join our organization.

The Lead SRE will play a crucial role in ensuring the reliability, scalability, capacity planning, and performance of our infrastructure and applications. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.

Responsibilities

  • Design, build, and maintain scalable and reliable cloud infrastructure and services on platforms such as AWS, Azure, or Google Cloud
  • Automate manual work using scripting/programming languages like Python, Bash, or PowerShell, particularly within cloud environments
  • Utilize automation tools like Jenkins, GitLab, and Ansible/Chef to streamline deployment, monitoring, and management of systems and applications in the cloud
  • Monitor system performance, proactively troubleshoot issues, and ensure high availability using observability tools like Prometheus, Grafana, or the ELK stack
  • Participate in capacity planning and scalability assessments to support business growth, focusing on cloud resource optimization
  • Implement containerization and orchestration technologies such as Docker and Kubernetes, particularly in cloud-native environments
  • Ensure compliance with security best practices and standards to safeguard data and systems in the cloud
  • Continuously evaluate and recommend new technologies and practices to improve system reliability, performance, and efficiency in the cloud
  • Document processes, procedures, and configurations to maintain system integrity and facilitate knowledge sharing
  • Provide on-call support and participate in incident management and response activities as needed

Requirements

  • 8-13 years of experience in a similar role
  • Prior leadership experience or team management skills
  • Experience with cloud platforms like AWS, Azure, or Google Cloud
  • Proficiency in scripting/programming languages such as Python, Bash, or PowerShell
  • Experience with automation tools like Jenkins, GitLab, and Ansible/Chef
  • Strong communication and collaboration skills
  • Experience with observability tools such as Prometheus, Grafana, the ELK stack, or similar
  • Hands-on experience with Docker, Kubernetes, or similar technologies
  • Knowledge of security practices and standards in cloud environments
  • Experience with SLI, SLO, SLA, and error budget concepts
  • Strong problem-solving skills and ability to troubleshoot complex issues under pressure
  • Familiarity with Agile methodologies and DevOps practices
  • Excellent documentation skills

Nice to have

  • Certifications in cloud technologies (AWS, Azure, Google Cloud)
  • Contributions to open-source projects

We offer

  • Opportunity to work on technical challenges with potential impact across geographies
  • Vast opportunities for self-development: online university, global knowledge sharing, and learning through external certifications
  • Opportunity to share your ideas on international platforms
  • Sponsored Tech Talks & Hackathons
  • Unlimited access to LinkedIn Learning solutions
  • Possibility to relocate to any EPAM office for short- and long-term projects
  • Focused individual development
  • Benefit package:
    • Health benefits
    • Retirement benefits
    • Paid time off
    • Flexible benefits
  • Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)
