This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Sr. Reliability Engineer in the United States.
This role offers the opportunity to lead the development of highly reliable, scalable SaaS systems while integrating cutting-edge AI into operational workflows. You will design and implement autonomous reliability operations, predictive monitoring, and self-healing infrastructure to ensure high availability across distributed cloud environments. Collaborating closely with engineering teams, you will embed AI-driven reliability practices into CI/CD pipelines, optimize incident response, and mentor peers on next-generation SRE practices. The ideal candidate has deep expertise in large-scale SaaS systems, cloud infrastructure, observability, and AI-assisted automation. This is a highly visible role where your work directly impacts system resilience, operational efficiency, and customer satisfaction, in a forward-thinking, AI-first environment.
· Design, implement, and scale reliable SaaS systems with a focus on autonomous and AI-driven operations.
· Build AI-enhanced observability, anomaly detection, and predictive monitoring using tools such as Prometheus, Grafana, Loki, Tempo, and OpenTelemetry.
· Develop automated workflows using Agentic AI frameworks, AI Flow tools, or custom AI agents to remediate production incidents.
· Collaborate with development teams to embed AI-assisted reliability feedback loops into CI/CD pipelines.
· Define, track, and optimize SLOs, SLIs, and SLAs to improve system reliability and operational efficiency.
· Mentor engineers on AI-driven SRE practices and contribute to reliability playbooks and process improvements.
· Troubleshoot and optimize production environments leveraging LLM-based diagnostics and pattern recognition across logs, traces, and metrics.
· 5+ years of experience in large-scale SaaS or distributed systems environments.
· Bachelor’s degree in Computer Science, Engineering, or equivalent technical experience.
· Hands-on experience with AI-driven operations, including Agentic AI, AI Flow tools, and AIOps practices.
· Proficiency in software engineering with Go and Python, automation frameworks, and API integrations.
· Expertise in observability and monitoring tools, cloud platforms (AWS, GCP, Azure), and Infrastructure as Code (Terraform or Pulumi).
· Strong knowledge of containerization and orchestration (Kubernetes, Docker), Linux system administration, and cloud networking principles.
· Excellent communication skills for documenting, reporting, and collaborating with cross-functional teams.
· Ability to participate in a rotating on-call schedule for production systems.
· Preferred: Experience integrating AI copilots into production systems, predictive SLO dashboards, or autonomous agent orchestration frameworks.
· Competitive compensation package, including performance-based bonuses.
· Comprehensive health, life, and disability insurance plans.
· Paid time off and parental leave.
· Remote-friendly work environment with occasional on-site collaboration.
· Opportunities for career growth, mentorship, and contribution to AI-first engineering practices.
· Inclusive and diverse workplace culture with employee resource groups.
Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.
When you apply, your profile undergoes an AI-powered screening process designed to identify top talent efficiently and fairly:
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 We automatically shortlist the 3 candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.
The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role. Once the shortlist is complete, it is shared with the hiring company, which handles final decisions and next steps.
Thank you for your interest!