This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Software Engineer - Reliability (Remote) in California (USA).
We are seeking a Senior Software Engineer specializing in Reliability to help design, implement, and operate systems that ensure cloud-based production environments remain secure, compliant, and highly available. In this role, you will be a foundational member of a new Site Reliability Engineering (SRE) team, building processes and infrastructure to support mission-critical workloads in regulated environments. You will collaborate with engineering, product, and operational teams to define service-level objectives, develop monitoring and automation, and improve overall system reliability. The ideal candidate is experienced in cloud infrastructure, automation, and observability, and enjoys solving complex distributed system challenges. This role offers the opportunity to shape the SRE culture and practices from the ground up, while contributing to high-impact projects that support regulated and commercial operations.
· Design and implement observability practices including metrics, traces, dashboards, logs, and alerting for production systems.
· Partner with engineering, product, and lab teams to define SLIs/SLOs, error budgets, and incident response procedures.
· Develop and maintain operational playbooks and runbooks for reliability and compliance.
· Participate in on-call rotations, championing automation and self-healing for production systems.
· Contribute to deployment processes and infrastructure automation using Infrastructure as Code (IaC).
· Collaborate on incident reviews, postmortems, and disaster recovery exercises to improve system reliability.
· Mentor peers, promote best practices, and help establish the SRE culture and strategy.
· Bachelor’s degree in Computer Science or Engineering, or equivalent experience.
· 5+ years of experience in software engineering, SRE, or DevOps roles (Python or Go preferred).
· Hands-on experience deploying and operating production workloads in cloud environments (AWS, GCP, or Azure).
· Expertise in Infrastructure as Code (Terraform, Pulumi, Bicep/ARM).
· Experience with incident management platforms (e.g., Incident.io, ServiceNow, Opsgenie, PagerDuty).
· Strong knowledge of Kubernetes (AKS, GKE, EKS) and cloud networking.
· Proficiency with observability tools such as Datadog, Prometheus/Grafana, or OpenTelemetry.
· Excellent troubleshooting, root-cause analysis, and automation skills.
· Ability to work autonomously and collaborate effectively with cross-functional teams.
· Experience in regulated environments (healthcare, biotech) and familiarity with compliance-driven change management are a plus.
· Competitive salary: $131,325–$201,000 USD, with potential for pre-IPO equity and cash bonuses.
· Comprehensive medical, dental, and vision coverage.
· Paid time off and holidays.
· Remote work flexibility.
· Opportunities for professional growth, mentorship, and leadership in a foundational SRE team.
· Participation in shaping processes for high-reliability systems in regulated environments.
Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.
The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role.
Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or additional assessments) are then made by their internal hiring team.
Thank you for your interest!