Sr. SRE Consultant Job at Cloud BC Labs, Seattle, WA

  • Cloud BC Labs
  • Seattle, WA

Job Description

Role: Sr. SRE (Very Strong Technical SRE)

Location: Seattle, WA

WFO: Mandatory (3 days/week)

Short JD:

  1. Job Summary/Role Description (General Information)
  • As a Senior Site Reliability Engineer, you will play a critical role in supporting application developers and operations personnel by providing expert guidance on application and infrastructure best practices from a reliability perspective.
  • Your primary focus will be observability, toil reduction through automation, and improving reliability, with an emphasis on solving operations issues.
  • Must have at least 5 years of SRE experience in large programs with a focus on toil reduction, implementation of full-stack observability, and reduction of MTTD and MTTR.
  • Must have a good understanding of Site Reliability Engineering (SRE) principles and practices.
  • Should be a strong team player who enjoys collaborating with different teams, shares knowledge, and strives for continuous improvement of self and team.

  2. Core Skills/Technical Requirements
  • Experience with scripting in at least one of Python, PowerShell, Bash, Shell, or Perl.
  • Strong experience with one or more observability tools such as Splunk, AppDynamics, Dynatrace, or Datadog.
  • Experience in observability dashboard creation, Synthetic Monitoring, and Real User Monitoring (RUM).
  • Experience working with tools like Remedy, ServiceNow, Confluence, and Jira.
  • Experience in ITSM processes, including Incident, Problem, and Change management.
  • Experience in setting up service maps/distributed traces in the observability tool. (good to have)
  • Knowledge of operating systems like Linux/Windows, including an understanding of networking. (good to have)
  • Experience in software architecture, distributed systems, and development languages like Java or .NET. (good to have)

  3. Soft Skills/Other Requirements
  • Should possess strong analytical, troubleshooting, and problem-solving skills.
  • Excellent communication and leadership skills.

  4. Key Responsibilities/Duties
  • Drive the reliability and performance of the client's critical services.
  • Drive system reliability and stability through proactive monitoring and automation.
  • Implement observability frameworks and SRE best practices, including the setup of SLOs/SLIs.
  • Define error budgets based on the SLOs.
  • Drive a metrics-driven culture using data to measure overall system quality and reliability.
  • Provide primary operational support and engineering for the client's critical services.
  • Manage and participate in on-call incidents.
  • Work with users/Ops team to understand issues, develop root cause analysis, and work with the development team for permanent fixes.
  • Set up service maps/distributed traces to visualize the entire workflow and analyze the cause of problems/incidents.
  • Define, evangelize, and maintain SRE best practices.
  • Improve automation, including the system's self-healing capability.

Cloud BC Labs Inc is a digital transformation organization aimed at creating seamless solutions for clients to effectively manage their business operations. The company specializes in Business and Management Consulting, AI/ML, Data Analytics & Visualization, Cloud Data Warehouse Migration, Snowflake Implementation, Informatica Implementation & Upgrade, Staffing Services, and Data Management Solutions.

Job Tags

Permanent employment, 3 days per week
