Sr. SRE Consultant Job at Cloud BC Labs, Seattle, WA


Job Description

Role: Sr. SRE (Very Strong Technical SRE)

Location: Seattle, WA

WFO: Mandatory (3 days/week)

Short JD:

  1. Job Summary/Role Description (General Information)
  • As a Senior Site Reliability Engineer, you will play a critical role in supporting application developers and operations personnel by providing expert guidance on application and infrastructure best practices from a reliability perspective.
  • Your primary focus will be observability, toil reduction through automation, and improving reliability, with an emphasis on solving operations issues.
  • Must have at least 5 years of SRE experience in large programs, with a focus on toil reduction, implementation of full-stack observability, and reduction of MTTD and MTTR.
  • Must have a good understanding of Site Reliability Engineering (SRE) principles and practices.
  • Should be a strong team player who enjoys collaborating with different teams, shares knowledge, and strives for continuous improvement of self and team.

  2. Core Skills/Technical Requirements
  • Experience with scripting in at least one of Python, PowerShell, Bash, Shell, or Perl.
  • Strong experience with one or more observability tools such as Splunk, AppDynamics, Dynatrace, or Datadog.
  • Experience in Observability Dashboard creation, Synthetic Monitoring, and Real User Monitoring (RUM).
  • Experience working with tools such as Remedy, ServiceNow, Confluence, and Jira.
  • Experience in ITSM processes, including Incident, Problem, and Change management.
  • Experience in setting up service maps/distributed traces in the observability tool (good to have).
  • Knowledge of operating systems such as Linux and Windows, including an understanding of networking (good to have).
  • Experience in software architecture, distributed systems, and development languages such as Java or .NET (good to have).

  3. Soft Skills/Other Requirements
  • Should possess strong analytical, troubleshooting, and problem-solving skills.
  • Excellent communication and leadership skills.

  4. Key Responsibilities/Duties
  • Drive the reliability and performance of the client's critical services.
  • Drive system reliability and stability through proactive monitoring and automation.
  • Implement observability frameworks and SRE best practices, including the setup of SLOs/SLIs.
  • Define error budgets based on the SLOs.
  • Drive a metrics-driven culture using data to measure overall system quality and reliability.
  • Provide primary operational support and engineering for the client's critical services.
  • Manage and participate in on-call incidents.
  • Work with users/Ops team to understand issues, develop root cause analysis, and work with the development team for permanent fixes.
  • Set up service maps/distributed traces to visualize the entire workflow and analyze the cause of problems and incidents.
  • Define, evangelize, and maintain SRE best practices.
  • Improve automation, including the system's self-healing capability.

Cloud BC Labs Inc is a digital transformation organization that creates seamless solutions to help clients manage their business operations effectively. The company specializes in Business and Management Consulting, AI/ML, Data Analytics & Visualization, Cloud Data Warehouse Migration, Snowflake Implementation, Informatica Implementation & Upgrade, Staffing Services, and Data Management Solutions.

Job Tags

Permanent employment, 3 days per week
