Browse 111 Reliability jobs hiring now. Companies hiring include Jobgether, Rokt, and Mainstay Labs Inc., with roles in Sacramento, Mesa, and Indianapolis.
Lead and scale global, mission-critical SaaS support operations as a Senior Manager focused on operational rigor, cross-functional collaboration, and customer excellence.
Rokt is hiring an experienced Engineering Manager (SRE) to lead production engineering, harden cloud infrastructure at scale, and develop a high-performing SRE team.
Mainstay is hiring a Staff Software Engineer to lead full-stack development of scalable, production-ready systems and user experiences using Python, TypeScript, React, and AWS.
Evolv Technology seeks a Quality Analytics Intern in Waltham to build automated reporting, analyze failure trends, and drive improvements in product reliability across multiple platforms.
Experienced cloud-focused Senior Software Engineer wanted to build and operate scalable infrastructure and developer tools across AWS, Kubernetes, and Cloudflare for enterprise platforms.
Salesforce’s Platform Orchestration team seeks a Senior Software Engineer to lead development of scalable orchestration and CI/CD systems that improve reliability, compliance, and developer velocity across Slack’s cloud infrastructure.
Build production-grade orchestration and integration software that enables fleet robotics and enterprise workflows on Nimble's San Francisco robotics engineering team.
Lead a core product engineering team at Wise in Austin, driving technical direction, roadmap delivery, and engineering growth for products used by millions globally.
Lead Mechanical Engineer at Kimberly‑Clark responsible for improving asset performance, leading technical projects, and supporting high‑speed tissue manufacturing operations in Marinette, WI.
Lead the architecture and operation of NVIDIA's global observability platform to ensure reliable, high-performance telemetry for large-scale AI and data systems.
Tyk is hiring a hands-on Technical Lead (EMEA, remote) to define architecture and build scalable non-functional engineering capabilities (observability, CI/CD, testing, performance) while mentoring teams and delivering measurable impact.
Lead DevOps Engineer needed to architect and modernize CI/CD and cloud infrastructure for large-scale enterprise applications in Dallas, TX.
Evolv Technology is hiring a Quality Analytics Intern to analyze quality data, build automated dashboards, and support reliability improvements for fielded product platforms.
Kalshi is hiring a Site Reliability Engineer to strengthen observability, automate operations, and scale reliable production services for its fast-growing prediction markets platform.
At Campfire, this in-office DevOps Engineer role owns AWS infrastructure, Terraform automation, observability, and production reliability to support a fast-growing accounting SaaS product.
WEX seeks an experienced Senior Staff SRE to define and execute enterprise reliability strategy, build resilient systems, and lead cross-functional initiatives that improve scale, observability, and operational excellence.
Sequen AI seeks a Staff Software Engineer (Infrastructure) to own and scale high‑performance cloud and ML infrastructure supporting training, research, and serving of frontier ranking models.
SpaceX Starshield is hiring a Senior Site Reliability Engineer to build and operate secure, highly available infrastructure supporting national-security satellite and communications systems.
Experienced verification engineer needed to develop and execute design verification and reliability test methods for robotic surgical systems and instruments at a leading medical device company in Sunnyvale.
Seasoned platform engineering leader to define and deliver cloud-first platform strategy and modernization at T. Rowe Price, ensuring reliability, scalability, and operational excellence across global infrastructure.
Senior cloud engineering leader to oversee AWS-based platform, SRE, and systems teams, driving FinOps, observability, and large-scale infrastructure modernization in a remote-first setting.
Senior infrastructure engineer needed to drive resiliency, observability, and scalable real-time systems for Orb's billing platform in a hybrid San Francisco office environment.
NBCUniversal is hiring a Site Reliability Engineer to build, operate, and enhance monitoring and control systems for its IP video distribution and on-air broadcast environments.
Medtronic is seeking a Sr. Engineering Manager to lead electrical and firmware reliability efforts for cardiac implantable devices, ensuring safety, compliance, and high product reliability while growing a technical team.
Experienced SRE leader needed to architect, automate, and operate cloud-native infrastructure to deliver reliable, scalable services across regulated environments.
Lead architecture, integration, and production hardening of event-driven, containerized digital systems to scale manufacturing and maritime autonomy capabilities at Anduril.
Lead Marvell's foundry technology organization to drive NPI, yield enhancement, reliability and high-volume production ramp for cutting-edge SoC products.
OnePay seeks an experienced Site Reliability Engineer to improve platform reliability and observability for a high-scale consumer fintech platform serving millions of users.
Lead the mobilization and operational readiness of new and transitioning data center sites for T5, ensuring seamless handoff to operations and full compliance with company standards.
Zeta is hiring a Data Reliability Engineer II in Basking Ridge to ensure the performance, security, and reliability of cloud databases and data pipelines for a high-scale banking platform.
Broadcom is hiring a hands-on Reliability Engineer to lead planning and execution of product qualifications for advanced packages within the GO Q&R team in Fort Collins, CO.
Work on NVIDIA's DGX Cloud team to design and operate large-scale Kubernetes-based GPU clusters that power cutting-edge AI workloads.
Experienced engineering leader sought to manage and grow an SRE team that ensures reliability, scalability, and operational excellence for cloud-native production systems.
Design and ship fault-tolerant backend services at Junction to process millions of device and lab data points and power clinical insights.
Build and scale the compute and infrastructure that powers Chai Discovery's next-generation AI drug design platform as a Software Engineer, Infrastructure.
Senior SRE leader needed to shape reliability practices, mentor engineers, and deliver resilient, scalable cloud infrastructure for a high‑throughput fintech platform.
Experienced electrical-focused Plant Engineering Manager needed to lead maintenance, capital projects, and reliability initiatives at Signode's Florence, KY manufacturing facility.
Help build the core compute delivery platform for a San Francisco startup creating a liquid market for GPU offtake as a Software Engineer focused on cloud and systems programming.
Valinor is looking for an Infrastructure & Security Engineer to design, operate, and secure CI/CD pipelines and cloud/edge infrastructure for defense-focused products across its portfolio.
Lead Peacock's SRE and DevSecOps efforts as Manager, guiding cloud architecture and engineering teams to deliver secure, scalable streaming services for millions of users.
Senior Software Engineer - Reliability (remote, CA) to help build foundational SRE practices, observability, and infrastructure automation for secure, compliant cloud production systems.
Help architect and operate cloud-native, AI-powered platforms as a Software Engineer (SRE) focused on reliability, automation, and scalable microservices.
Help operate and scale a high-performance GPU cluster used by cutting-edge ML research and production teams as a Senior Site Reliability Engineer.
Experienced cloud-native engineer needed to lead design and automation of scalable Kubernetes platforms across AWS and OCI, driving reliability, cost optimization, and developer experience.
Lead the design and build-out of network operations and reliability for Fluidstack's distributed datacenter fabric, owning Tier 2+ incident response, observability, automation, and team development.
Trunk is hiring a Forward Deployed Engineer to lead end-to-end private and on-premises deployments, collaborate with enterprise IT, and ensure secure, reliable operation of its CI Reliability Platform.
Boeing seeks Experienced or Lead Systems Engineers in Hazelwood, MO to define, integrate, verify, and validate complex aerospace systems across multiple programs; U.S. citizenship and security-clearance eligibility are required.
Join Mosaic's Asset Management Strategy team in Riverview, FL to develop and implement reliability programs that reduce downtime and improve safety and asset performance.
Experienced Reliability Engineer needed to strengthen equipment performance and maintenance programs at a regulated pharmaceutical manufacturing site represented by QRC Group.
Lead reliability engineering for LinkedIn's massive streaming platform—designing, coding, and operating pub/sub infrastructure to ensure scalable, highly available data flow across the company.
Below 50k*: 1 | 50k-100k*: 4 | Over 100k*: 30