Senior Site Reliability Engineer

GOVX is seeking an experienced Senior Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our production systems through automation, observability, and operational excellence. This position is remote, but candidates must be located in one of the following states: California, Washington, Texas, Tennessee, Florida, Colorado, or New York.

The Senior Site Reliability Engineer (SRE) plays a key role in maintaining resilient infrastructure, monitoring critical services, and improving deployment and recovery processes across environments. The Senior Site Reliability Engineer works under the direction of the Director of Engineering and collaborates closely with Site Reliability Engineers, Automation Engineers, and other members of the engineering organization. 

This position will report to the Director of Engineering. 

Responsibilities 

  • Maintain scalable, secure, and reliable cloud services that operate within Service Level Objectives.
  • Implement and manage monitoring, alerting, and observability systems using Prometheus, Grafana, and Azure Monitor to proactively identify and resolve issues. 
  • Develop and maintain automation scripts and tools in PowerShell, Bash, and C# to improve deployment efficiency, system reliability, and developer productivity. 
  • Create, refine, and maintain detailed runbooks for production systems to ensure consistent operational procedures and effective incident response. 
  • Define and manage Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to measure and maintain system reliability. 
  • Collaborate with software engineers and automation engineers to integrate reliability practices into CI/CD pipelines using Azure DevOps. 
  • Design and implement intelligent alerting strategies that ensure high signal-to-noise ratios and enable rapid triage of critical issues. 
  • Participate in incident response, post-incident reviews, and blameless root cause analysis to drive continuous improvement of system reliability and uptime. 
  • Contribute to deployment strategy evolution, including blue-green and canary deployments, to minimize downtime and release risk. 
  • Collaborate closely with Automation Engineers to enhance automated validation and testing of production environments. 
  • Monitor system health, capacity, and performance, providing data-driven insights and recommendations for optimization. 
  • Conduct chaos engineering experiments and resilience testing to proactively identify and address system weaknesses. 
  • Develop and maintain disaster recovery and business continuity plans, including regular failover testing. 
  • Participate in the on-call rotation for platform services, ensuring high availability and rapid incident resolution. 
  • Proactively monitor and respond to production support tickets and alerts within established SLA timeframes, delivering first-level diagnosis, troubleshooting, and escalation as needed to maintain system reliability.
  • Continuously improve incident response playbooks and reduce Mean Time to Recovery (MTTR). 
  • Participate in sprint planning, stand-ups, and retrospectives to ensure alignment with development and operational objectives. 
  • Identify opportunities to improve resiliency, reduce toil, and strengthen the reliability culture across the engineering organization. 
  • Collaborate with security and compliance teams to ensure infrastructure meets regulatory and security standards. 
  • Support cost optimization efforts by monitoring cloud resource usage and recommending efficiency improvements. 
  • Explore and integrate AI/ML-based observability tools for predictive monitoring and anomaly detection. 

Qualifications

  • 8+ years of professional experience in site reliability, infrastructure, or systems engineering roles.
  • Proficiency with Azure cloud infrastructure, services, and resource management.
  • Experience with operating systems, network concepts, protocols, and architecture, including Microsoft and Linux operating systems, Active Directory, and the OSI model.
  • Technical ability in Node.js and .NET/C#, and knowledge of both current and legacy architecture, software development practices, and conventions.
  • Strong experience with REST APIs.
  • Hands-on experience with containerization and orchestration using Kubernetes and microservices architecture. 
  • Strong automation and scripting skills in PowerShell and Bash.
  • Experience with Infrastructure as Code tools for provisioning and configuration management. 
  • Deep understanding of CI/CD processes and tools, preferably using Azure DevOps. 
  • Experience implementing and managing observability solutions, including Azure Monitor, Application Insights, Log Analytics Workspaces, Prometheus, and Grafana.
  • Strong problem-solving, analytical, and troubleshooting abilities in distributed systems and cloud environments. 
  • Ability to write, maintain, and execute operational runbooks and automation for incident management and recovery. 
  • Ability to work in a self-directed manner and to plan and execute projects involving multiple technical resources and stakeholders.
  • Excellent communication and collaboration skills, with the ability to work across software development, infrastructure, and operations teams. 

Preferred Education and Experience 

  • Bachelor’s degree in Computer Science, Engineering, or related technical field. 
  • Experience working in Agile/Scrum delivery environments. 
  • Experience supporting .NET applications and microservices in a production environment. 
  • Experience supporting SQL Server and Cosmos DB applications in production environments. 
  • Knowledge of network fundamentals, load balancing, and high-availability architectures. 

Supervisory Responsibility 

This position does not include supervisory responsibilities but provides mentorship and technical guidance to the Site Reliability team members. 

Travel Requirements 

Yearly travel to the San Diego office headquarters is expected for this position. 

Work Environment 

This job operates in a professional office environment. This role routinely uses standard office equipment such as computers, phones, photocopiers, filing cabinets, and fax machines. This role occasionally must lift and carry office equipment. 

 
Physical/Mental Demands 

  • Physical – This is largely a sedentary role. 
  • Mental – Problem-solving, making decisions, interpreting data, organizing, reading/writing. 
  • Reasonable accommodation may be made to enable individuals with disabilities to perform the essential functions. 

Work Location 

Due to state law and tax implications, remote work candidates must live and work in one of the following states: California, Washington, Texas, Tennessee, Florida, Colorado, or New York. 

Benefits

  • Paid Time Off, Paid Sick Leave, Paid Holidays
  • Competitive Medical, Dental, Vision, and Life Insurance
  • 401(k) plan with discretionary match available
  • Flexible Spending Account (FSA), Health Savings Account (HSA)
  • Voluntary benefits including Critical Illness, Group Accident, and Voluntary Life
  • Employee Referral Program
  • Exposure to a growing ecommerce company
  • Discounts on the GOVX website

Salary Range
$165,000 - $175,000 Annually

AAP/EEO Statement
EOE. Veterans/Disabled.  Reasonable accommodation may be made to enable individuals with disabilities to perform the essential functions. 

Position will require successful completion of a background check and drug testing prior to starting employment.


About GOVX, Inc.


Savings for Those Who Serve

GOVX was founded in 2011 to offer exclusive benefits to those who serve our country. GOVX membership comprises current and former members of the United States military, law enforcement, firefighting, and medical services, as well as government personnel. We are dedicated to supporting these communities and to offering unique value to our members, while delivering an authentic platform for brands to reach our growing customer base. As the largest and fastest-growing digital platform serving this deserving audience, we are committed to stretching the limits of ecommerce to deliver the best assortment for our members’ on-duty and off-duty needs.

Employment Type
Full-time, remote

Date Posted
October 30, 2025