Site Reliability Engineer (100% remote)

Lyndhurst, NJ · Information Technology

Full Time / Direct-Hire (remote)

Salary: Open


Summary:

Our client offers mentorship and career guidance, a competitive salary, a remote-friendly workspace, unlimited vacation time, and continuing education support (conferences, books, online resources).

Our client is looking for a Site Reliability Engineer to join an expanding engineering team. The group follows an Agile methodology, working in small software teams to consistently deliver high-quality software. The stack includes Ruby, Rails, Angular, TypeScript, Node, RabbitMQ, Solr, Postgres, Redis, Puppet, and Hubot. Infrastructure is declared as code and provisioned on AWS.

In this role, you will gather and analyze metrics from operating systems and applications to assist in performance tuning and fault finding. You will partner with development teams to improve services through rigorous testing and release procedures, and participate in system design consulting, platform management, and capacity planning. You will create sustainable systems and services through automation, balance feature development speed and reliability with well-defined service level objectives, and participate in blameless postmortems to identify resilience and reliability improvements.

Requirements:
  • 2-4 years of software development experience.
  • Experience supporting Linux systems hosted in a cloud environment. The team uses AWS (specifically EC2, CloudFormation, RDS, ElastiCache, and S3, to name a few).
  • Experience with web programming languages (Ruby on Rails a definite plus).
  • Familiarity with Puppet or equivalent infrastructure management and automation tooling.
  • A strong desire to understand complex systems and how to make them highly available.
  • A collaborative spirit and enjoyment of working with a team to build things.
  • A desire to continually improve, and an appreciation for giving and receiving constant, constructive feedback.

Responsibilities:
  • Gathering and analyzing metrics from both operating systems and applications to assist in performance tuning and fault finding.
  • Partnering with development teams to improve services through rigorous testing and release procedures.
  • Participating in system design consulting, platform management, and capacity planning.
  • Creating sustainable systems and services through automation.
  • Balancing feature development speed and reliability with well-defined service level objectives.
  • Participating in blameless postmortems to identify resilience and reliability improvements.
  • Oversight and optimization of AWS infrastructure using configuration management and infrastructure-as-code best practices.
  • Triaging, routing, and resolution of issues and incidents identified by both internal and external stakeholders.
  • Advising and guiding other organizational teams with a focus on automation, maintainability, reliability, performance, and security.
  • Leading, advising, and analyzing load and performance testing exercises to identify performance bottlenecks and breakpoints, and determine infrastructure needs accordingly.
  • Measurement, monitoring, and reporting of availability, latency, and overall system health based on SLIs/SLOs/SLAs.
  • Engagement in capacity planning, demand forecasting, software performance analysis, and systems tuning.
  • Managing the CI/CD pipeline and migration of client software releases through QA, UAT, and production environments to ensure high-quality, on-time delivery of all dependencies.
  • Documentation of tribal knowledge to reduce knowledge silos and reliance on institutional memory to support and maintain reliable systems.
  • Triaging and troubleshooting production issues related to the product.
  • Researching and implementing ways to automate the management of our infrastructure and reduce toil.
  • Supporting deployments across our growing development, UAT, and production environments.
  • Participating in blameless postmortems for incidents.
  • Taking part in on-call rotation for production support.

Covenant Consulting strives to attract, cultivate and retain exceptional talent. If you feel you are a match for the position and are interested in a great growth opportunity, we encourage you to contact toliver@covenant-consulting.com. 

Covenant Consulting is a Technology Services Provider offering project-based IT consulting, IT staffing, and IT recruiting services. Every partnership reflects our uncompromising commitment to quality and integrity. We have extensive experience and capabilities in project-based consulting, short- and long-term staff augmentation, and permanent recruitment. We work with companies of every size, across many industries, and have the flexibility to scale solutions to meet our clients' specific needs.

 


 