Job details

Site Reliability Engineer (SRE)

Company Overview

Procurement Sciences AI (PSci.AI) is at the forefront of generative artificial intelligence, transforming the government contracting sector as a Series A rocketship, proudly backed by Battery Ventures, a top 1% global technology venture capital firm. As a venture-backed B2B SaaS company, we are dedicated to revolutionizing federal, state, and local business development with disruptive AI capabilities. Our “Win More Bids” platform delivers unparalleled operational efficiencies for our clients and drives new revenue streams. By harnessing the power of generative AI tailored for the government contracting domain, we provide a unique competitive advantage and redefine what is possible for our customers.

Job Title: Site Reliability Engineer (SRE)

Location: Washington, DC metro area; Salt Lake City, UT; or Remote

Job Description

We are seeking an experienced and driven Site Reliability Engineer (SRE) to help ensure the reliability, performance, and scalability of our cloud-based AI solutions. The ideal candidate has a track record of diagnosing root causes, building automation, optimizing observability, and managing reliability in complex SaaS environments. Experience with Kubernetes, Helm, modern observability platforms, and major public cloud providers (Azure, AWS, Google Cloud Platform) is key. You will play a central role in defining and monitoring key reliability metrics, strengthening operational excellence, and championing DevOps culture across our rapidly growing organization.

Key Responsibilities

  • Identify and resolve system and application issues through in-depth root cause analysis, working closely with development teams and stakeholders.

  • Design, develop, and implement comprehensive automated testing to ensure ongoing system reliability and performance.

  • Build and maintain robust observability and monitoring solutions using Datadog, Prometheus, Grafana, ELK Stack, or similar platforms.

  • Define and monitor service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs) across services to meet operational commitments and improve reliability.

  • Collaborate with developers and operations staff to enhance system reliability, deployment agility, and overall developer experience (DevEx).

  • Develop and continuously improve monitoring and alerting systems to proactively address potential issues.

  • Lead and implement best practices for incident management, disaster recovery, and business continuity.

  • Manage high-impact incident response, facilitate post-mortem analyses, and drive remediation to prevent future occurrences.

  • Plan for capacity upgrades and scaling to support company growth and system performance requirements.

  • Automate operational tasks and infrastructure management using Infrastructure as Code (IaC) tools and related technologies.

  • Ensure all systems and processes comply with security, privacy, and regulatory requirements relevant to GovCon customers.

  • Continually assess and drive improvements in system architecture, operational processes, and documentation for systems and incidents.

Technical Requirements

  • Proficient in Kubernetes, Helm, and troubleshooting in secure and regulated environments.

  • Deep experience with observability and monitoring tools such as Prometheus, Grafana, ELK Stack, Datadog, or similar.

  • Hands-on expertise with major public cloud providers: Azure, Azure Gov, AWS, AWS GovCloud, and Google Cloud Platform (GCP).

  • Strong grasp of microservices architecture, cloud-native technologies, Postgres, and AI/ML systems.

  • Expertise in automated testing frameworks and practices (integration, synthetic, load testing, etc.).

  • Proficiency in tracking and analyzing reliability metrics (SLIs, SLAs, SLOs).

  • Excellent problem-solving skills and attention to detail, with the ability to operate independently and collaboratively.

  • Strong programming skills in TypeScript and Python.

  • Solid scripting abilities in Bash, PowerShell, or similar languages.

  • Demonstrated experience with Infrastructure as Code (IaC) tools such as Azure Bicep, AWS CDK, or Terraform.

  • Awareness of core networking principles, along with advanced troubleshooting skills.

  • Effective communicator, able to work with both technical and business personnel.

Preferred Qualifications

  • Experience in the GovCon sector and/or holding a security clearance.

  • Familiarity with GitOps principles and tools; experience with FluxCD is a plus.

  • Proven experience in designing, building, and maintaining CI/CD pipelines.

  • Experience managing reliability in multi-cloud or hybrid cloud environments.

  • Knowledge of security and compliance standards applicable to government SaaS and cloud systems.

  • Previous success operating in dynamic, high-growth SaaS companies.

  • Demonstrated expertise in operationalizing new development workloads across cross-functional teams.

Compensation and Benefits

  • Competitive salary and performance-based incentives, including stock options.

  • Comprehensive health plan for employees and their families.

  • Flexible remote-first work environment, with options to work from the DC metro area or Salt Lake City, UT.

  • Wide-ranging opportunities for professional development, technical advancement, and career growth.

To Apply:

Please submit your resume and a brief cover letter describing your experience with cloud-native reliability, Kubernetes, and observability in complex SaaS environments.

Notice: Background Check Required

As part of our employment process, a background check (including, but not limited to, credit history, criminal records, and employment verification) may be required in compliance with applicable law. By applying, you acknowledge and consent to this process.

Procurement Sciences AI is proud to be an equal opportunity employer committed to diversity and inclusion at all levels of the organization.

Average salary estimate

$170,000 / year (est.)
Min: $140,000
Max: $200,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

Procurement Sciences is a trusted partner for GovCon, aerospace, defense, education, and other government-oriented businesses, offering a transformative platform powered by breakthrough advancements in generative AI. Procurement Sciences turns dat...

Employment type: Full-time, remote
Date posted: August 18, 2025