5670 Wilshire Boulevard, 22nd Floor
Job Category: DevOps
Job Number: 19387
Site Reliability Engineer
You will be responsible for:
- Handling multiple initiatives, balancing your time between them
- Learning new technologies, with a passion for open source and the wisdom to keep the bleeding edge out of production
- Collaborating with development teams to meet each other’s requirements in an agile, rapidly growing infrastructure.
- Improving automated deployments, monitoring, management, and incident response.
- Taking action to get our HA production environments to "just work" without manual intervention or midnight alerts.
You bring to the table your:
- 3+ years of experience as a Systems or DevOps Engineer
- Solid understanding of Linux Engineering / Administration
- Excellent understanding of modern DevOps technologies, methodologies, and processes.
- Experience with a majority of the following: OpenStack or similar, LAMP stack, Go, Python, or Ruby, Sensu or Nagios, Logstash (ELK), F5 products, MySQL or MongoDB.
- Experience with Chef (creating recipes and cookbooks from scratch)
- Working knowledge of Continuous Delivery / Release Automation models.
- Working knowledge of AWS services
- Unwillingness to let technical problems go unsolved.
- Experience working in 24/7 operational environments, with rotating on-call responsibilities.
- Prior experience managing servers in a single HA environment and/or multiple geographically separated server clusters.
Additional experience that will set you apart but is not required:
- Experience with Kubernetes, Docker, and Helm in a production environment.