Production Reliability Jobs

Browse 14 Production Reliability jobs hiring now, at companies such as Crusoe, MLabs, and Fiddler AI, in locations including Minneapolis, Honolulu, and Riverside.

Production Engineer

Crusoe Hybrid No location specified

Posted 12 hours ago

Work on Crusoe’s fleet operations to automate server provisioning, troubleshoot GPU hardware, and help transition infrastructure to Kubernetes for large-scale deployments.

Senior Backend Engineer

MLabs Hybrid No location specified

Posted yesterday

Senior Backend Engineer needed to build and operate high-reliability TypeScript backend services for a mission-driven stablecoin infrastructure startup operating on an EST remote schedule.

Staff AI Software Engineer

Fiddler AI Hybrid No location specified

Posted 7 days ago

Fiddler is hiring a Staff AI Software Engineer to architect and build scalable observability and evaluation systems for LLMs, GenAI, and agentic applications while mentoring engineering teams.

Senior AI Infrastructure Engineer, Cloud Partnerships - DGX Cloud

NVIDIA Hybrid US, CA, Santa Clara

Posted 10 days ago

Customer-Centric

Mission Driven

Inclusive & Diverse

Rise from Within

Diversity of Opinions

Work/Life Harmony

Growth & Learning

Transparent & Candid

Medical Insurance

Paid Time-Off

Maternity Leave

Mental Health Resources

Equity

Child Care stipend

Paternity Leave

WFH Reimbursements

Flex-Friendly

Dental Insurance

Vision Insurance

Life insurance

Health Savings Account (HSA)

Flexible Spending Account (FSA)

401K Matching

Military leave

Lead the integration of third-party infrastructure providers into NVIDIA’s operational systems and shape robustness for DGX Cloud as a Senior AI Infrastructure Engineer focused on cloud partnerships.

Senior Staff Production Engineer

Lightspark Hybrid No location specified

Posted 14 days ago

Lead the architecture and operational strategy for Lightspark’s global production infrastructure to ensure secure, reliable, and scalable payment systems.

Staff Software Engineer - RaptorDB Provisioning

ServiceNow Hybrid 4810 Eastgate Mall, San Diego, California, United States

Posted 16 days ago

Inclusive & Diverse

Mission Driven

Rise from Within

Diversity of Opinions

Work/Life Harmony

Empathetic

Feedback Forward

Take Risks

Collaboration over Competition

Medical Insurance

Dental Insurance

Vision Insurance

Mental Health Resources

Life insurance

Disability Insurance

Health Savings Account (HSA)

Flexible Spending Account (FSA)

Conferences Stipend

Paid Time-Off

Maternity Leave

Equity

ServiceNow is hiring a Staff Software Engineer to build and operate production-grade database provisioning tools that ensure reliable, scalable database operations across global data centers.

Production Support Engineer (Remote - US)

Jobgether Hybrid No location specified

Posted 17 days ago

Serve as the primary production support engineer for a remote US team, managing incidents, optimizing system performance, and working with cross-functional teams to improve reliability and scalability.

Founding Site Reliability Engineer (Remote - US)

Jobgether Hybrid No location specified

Posted 20 days ago

Lead SRE efforts as the founding Site Reliability Engineer at a fast-growing AI company, building scalable, secure, and observable AWS infrastructure and processes.

Founding Engineer

Applied Labs Hybrid New York City

Posted 21 days ago

Applied Labs seeks a Founding Engineer in New York City to design, ship, and own production-grade AI agent systems focused on reliability and real-world performance.

Site Reliability Engineer (L4) - CORE

Netflix Hybrid USA - Remote

Posted 24 days ago

Inclusive & Diverse

Rise from Within

Mission Driven

Diversity of Opinions

Work/Life Harmony

Customer-Centric

Fast-Paced

Growth & Learning

Medical Insurance

Dental Insurance

401K Matching

Paid Time-Off

Maternity Leave

Paternity Leave

Mental Health Resources

Flex-Friendly

Netflix is looking for a Site Reliability Engineer (L4) to enhance resilience, automation, and incident response for its streaming infrastructure in a remote role.

Systems Development Engineer L4

Netflix Hybrid USA - Remote

Posted 24 days ago

Inclusive & Diverse

Rise from Within

Mission Driven

Diversity of Opinions

Work/Life Harmony

Customer-Centric

Fast-Paced

Growth & Learning

Medical Insurance

Dental Insurance

401K Matching

Paid Time-Off

Maternity Leave

Paternity Leave

Mental Health Resources

Flex-Friendly

Netflix seeks a Systems Development Engineer to design, automate, and operate scalable infrastructure and tooling that powers global content production workflows.

Senior Engineer, Site Reliability Engineering, Digital Banking

Forbright Bank Hybrid Remote

Posted 24 days ago

Drive platform reliability and observability at Forbright as a Senior SRE, building automated, resilient cloud systems that support the bank's digital banking and commercial lending services.

Member of Technical Staff - Model Engineer

MLabs Hybrid No location specified

Posted 26 days ago

Lead end-to-end model development and production deployment for fault prediction and autonomous repair as a Model Engineer on the founding AI team of a Series C networking startup in San Francisco.

Staff Production Service Engineer (SRE) - Cloud Operations - Federal

ServiceNow Hybrid 12900 Science Drive Suite 100, Orlando, Florida, United States

Posted 26 days ago

Inclusive & Diverse

Mission Driven

Rise from Within

Diversity of Opinions

Work/Life Harmony

Empathetic

Feedback Forward

Take Risks

Collaboration over Competition

Medical Insurance

Dental Insurance

Vision Insurance

Mental Health Resources

Life insurance

Disability Insurance

Health Savings Account (HSA)

Flexible Spending Account (FSA)

Conferences Stipend

Paid Time-Off

Maternity Leave

Equity

Lead reliability and automation efforts for ServiceNow cloud operations in a Staff Production Service Engineer role supporting US Public Sector customers from the Orlando office.
