Job details

Site Reliability Engineer

Overview

The Site Reliability Engineer is responsible for ensuring the reliability, availability, and performance of the company’s information technology systems and infrastructure.

 

This highly skilled role bridges the gap between development and operations to optimize system performance. It requires a proactive mindset to drive innovation and to collaborate on strategy for system-wide improvements.

 

In this role, you will work with the development and operations teams to design, build, and maintain scalable and robust infrastructure, automate processes, and troubleshoot and resolve incidents while providing long-term solutions.

 

This role is a strategic partner, capable of recognizing and analyzing trends, identifying opportunities, and aligning initiatives with organizational goals. The work directly impacts the stability and efficiency of the company’s production environment, driving continuous improvement and resilience across the organization.

Responsibilities

• System Reliability: Ensure high availability, performance, and scalability of production systems and infrastructure.
• Monitoring & Alerting: Design, implement, and maintain monitoring tools, alerts, and dashboards to increase visibility of system performance and to proactively detect and resolve issues before they impact users.
• Strategic Partner: Champion forward-looking strategies that anticipate industry trends and position the company for long-term success. Translate high-level vision into actionable roadmaps and measurable outcomes. Collaborate with cross-functional teams to define and establish service level objectives (SLOs) and service level agreements (SLAs) for critical systems.
• Performance Optimization: Identify bottlenecks and optimize systems and services to improve latency, throughput, and resource usage. Perform capacity planning and resource allocation to ensure optimal system performance and scalability. Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability and performance standards.
• Automation & Tooling: Develop automation for routine tasks, deployments, and infrastructure management to reduce manual work and improve reliability.
• Troubleshooting & Diagnostics: Analyze and resolve critical incidents and problems, including system failures, performance issues, and security breaches.
• Incident Management: Respond to Level 3 system outages and performance issues; lead post-incident reviews and implement preventative measures.
• Root Cause Analysis: Perform in-depth analysis of recurring issues and provide permanent preventative solutions to reduce future incidents.
• Documentation: Create and maintain technical documentation, including troubleshooting guides, procedures, and knowledge base articles, and champion the transition to Operations teams.
• Continuous Learning: Stay up to date with industry best practices, new technologies, and emerging trends in site reliability engineering.

Qualifications

Minimum Education Required:
• Bachelor’s degree in IT, Computer Science, or a related field; or comparable work experience.

Minimum Experience Required:
• 10+ years of progressive experience in IT support roles, including hands-on experience in Level 2 support or system/network administration, with significant involvement in incident response, root cause analysis, and driving improvements to system resilience and observability.
• 3+ years of experience in Site Reliability Engineering, DevOps, or a related role, with a strong focus on supporting production systems, and experience with monitoring and logging tools such as Azure Monitor, Datadog, and PRTG or equivalent.
• Significant experience supporting infrastructure technologies such as Windows Server, Active Directory, Microsoft 365, networking, virtualization (e.g., VMware/Hyper-V), and/or Azure or AWS cloud platforms.
• Experience leading and coordinating tasks with multiple teams/departments and with multiple users.
• Experience in trend analysis, identifying process improvement opportunities, and providing recommendations in alignment with business goals.
• Experience implementing security measures in a production environment.

Preferred:
• Proven experience as a Site Reliability Engineer or in an equivalent role.
• Experience with agile and iterative development processes.
• HIT experience.

Knowledge, Skills, and Abilities
• Strong problem-solving and troubleshooting skills, with the ability to analyze and resolve complex technical issues with a focus on continuous improvement and automation.
• Excellent communication and collaboration skills to work effectively with cross-functional teams.
• Proficiency in scripting languages such as PowerShell.
• Solid understanding of software development methodologies and DevOps principles.
• Advanced understanding of networking principles and protocols.
• Expertise in monitoring and logging tools such as Datadog and PRTG.
• Knowledge of containerization technologies and orchestration tools.
• Knowledge of security best practices.
• Ability to use and educate on a variety of processes, including documentation, automation, change management, and standardization.
• Ability to use and educate on a variety of technologies, including cloud services (Azure and/or AWS), Active Directory, Office 365, MS Teams, Azure SSO, file servers, clustering and network administration, and other current technologies.
• Familiarity with ITSM processes and methodologies.
• Working knowledge of multi-tier architectures: load balancers, caching, web servers, application servers, and databases.
• Ability to effectively prioritize and execute tasks with strong attention to detail in a high-pressure environment.
• Skilled at handling multiple projects simultaneously, at times working independently and at times within a team-oriented, collaborative environment.
• Ability to translate requirements to technical needs.

Licenses and Certifications

Required:
• N/A

Preferred:
• ITIL 4 certification.
• Certification in relevant technologies or frameworks is a plus (e.g., AWS Certified DevOps Engineer).

Virtual Employee?

Yes

Salary Range

$110,000 - $150,000

Location/Org Data : Dept Number

CORPIL


EMPLOYMENT TYPE
Full-time, remote
DATE POSTED
July 22, 2025