DrFirst

Site Reliability Engineer

Job ID: 2018-1468
Number of Openings: 2
Job Location: US-MD-Rockville
Posted Date: 2/27/2018
Category: Software Development

Overview

Life is short, work somewhere awesome.

The SRE team is the in-house authority on building reliable and maintainable systems. It plans infrastructure capacity to meet high-availability and uptime goals for all DrFirst products.

The DevOps/Site Reliability team eliminates inefficiencies and incompatibilities that jeopardize service availability, so that DrFirst’s clients receive a reliable and scalable software service. Key aspects of this role include automation, configuration management, and tools development, along with collaborating with the engineering team on projects and products as the expert on reliability, performance, and efficiency.

Responsibilities

As a part of the Systems team, you will:

  • Periodically assess all monitoring requirements and implement necessary enhancements to meet changing/growing business needs
  • Enhance current automation processes for managing capacity, safely deploying software, and mitigating failures
  • Tune and troubleshoot full-stack software applications using OOP, Java, web services, Oracle DB, MongoDB, networking concepts, and virtualization techniques
  • Proactively review, recommend and implement changes to the live infrastructure after ensuring the right validation has been carried out
  • Assist in the rollout and deployment of new product features and installations to facilitate rapid iteration
  • Confidently make informed, data-driven decisions in a fast-paced environment with competing priorities
  • Create and maintain Chef recipes for instance configuration management
  • Participate in a 24/7 on-call rotation and after-hours deployments

Qualifications

To be successful in this role, you must have:

  • Bachelor’s degree in Computer Science or a related discipline (Master’s preferred)
  • At least 3 years’ coding experience (with Java, JavaScript, Ruby on Rails, or Python)
  • Experience with production releases, maintenance, and monitoring
  • Experience with build tools, orchestration tools, and virtual machine frameworks
  • Skills with log analysis and troubleshooting
  • Strong background in Unix/Linux administration
  • Experience with automation/configuration management using Chef, Puppet, or equivalent
  • Ability to use a wide variety of open source technologies and cloud services (AWS required)
  • Understanding of coding and scripting with Shell, PowerShell, Python, Ruby and/or Perl
  • Strong experience with SQL and MySQL (NoSQL is a big plus)
  • Knowledge of best practices and IT operations in a 24/7 environment
  • Self-motivated and technically curious
  • Ability to work independently and prioritize effectively

Nice to Have:

  • Ability to perform merging, branching, and configuration management in SCM systems
  • 2+ years’ experience with Maven and/or Gradle
  • Experience with source control management tools such as SVN and Git

Benefits

We offer comprehensive benefits to keep you healthy and happy as you grow in your life and career, and your merit-based compensation will reflect the impact your work has on the company and our customers.