Senior Site Reliability Engineer

Overview

Azure Cosmos DB is Microsoft’s next-generation globally distributed, massively scalable, multi-model cloud database service. It is designed to enable developers to build planet-scale applications. Azure Cosmos DB is one of the fastest-growing Azure services. Joining the Azure Cosmos DB team is a fantastic opportunity to work with highly talented engineers operating like a startup, and to deliver on our next set of big challenges.

As a Senior Site Reliability Engineer, you will identify and deliver software improvements using your expertise in software development, complexity analysis, and scalable system design to ensure services and systems are highly stable, performant, and meet the expectations of our customers. You will work closely with other engineering teams and provide a holistic view of our cloud service.

Responsibilities

  • Identify opportunities and drive the design and implementation of end-to-end telemetry, alerting, self-healing and automation capabilities to improve service health, manageability, and reliability. 
  • Participate in on-call rotations and own, triage, investigate and resolve service issues with an emphasis on broad communications, learning & teaching throughout the process. 
  • Interact with customers / support representatives and communicate on a deeply technical level with product engineering and product management teams to evolve services.  
  • Own availability, performance, and supportability targets for the service. 
  • Author functional and technical documentation and remain current on relevant technologies and procedures. 

Qualifications

Knowledge, experience and skills required: 

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent industry experience.
  • 6+ years of experience writing tools, automation, and scripts (PowerShell, Python, or similar), programming (C++, C#, or equivalent), and enhancing subcomponents within and around services/products to deliver and manage software in production. Experience that demonstrates an understanding of distributed systems and networking is preferred.
  • 6+ years of troubleshooting/debugging experience: telemetry-based analysis (KQL or equivalent preferred), troubleshooting skills across network, hardware, and distributed service layers, with demonstrated ability to debug, fix, and optimize code. 
  • Good communication skills, both verbal and written.

Company

Microsoft

Job Posted

2 years ago

Work Mode

Hybrid

Experience Level

3-7 years

Locations

Bangalore Urban, Karnataka, India

Qualification

Bachelor
