At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses.
The Serve fleet has been delighting merchants, customers, and pedestrians along the way in Los Angeles, Miami, Dallas, Atlanta and Chicago while doing commercial deliveries. We’re looking for talented individuals who will grow robotic deliveries from surprising novelty to efficient ubiquity.
We are tech industry veterans in software, hardware, and design who are pooling our skills to build the future we want to live in. We are solving real-world problems leveraging robotics, machine learning and computer vision, among other disciplines, with a mindful eye towards the end-to-end user experience. Our team is agile, diverse, and driven. We believe that the best way to solve complicated dynamic problems is collaboratively and respectfully.
As a Senior DevOps Engineer on the Machine Learning (ML) Infrastructure team, you will help design, build, and maintain our petabyte-scale data and ML platform that powers data partnerships, ML research, and autonomy engineering. You will play a key role in ensuring reliability, security, scalability, and performance across our internal systems, and maintain a suite of internal tools used by dozens of engineers. Your work will make a significant impact on our autonomous capabilities and act as a catalyst for the entire autonomy team, helping us train our next generation of ML models.
Responsibilities
Deploy and maintain our ML training orchestration system that operates across multiple platforms.
Manage cloud and on-premises environments for large-scale distributed data processing and ML training/inference systems.
Automate deployment pipelines, monitoring, and alerting for ML and data services.
Collaborate closely with data scientists, ML engineers, and autonomy teams to streamline experimentation and model deployment.
Maintain and improve CI/CD systems to support rapid development and testing.
Implement best practices for system security, reliability, and observability.
Optimize infrastructure costs and ensure efficient resource utilization.
Improve internal developer productivity through tooling, documentation, and support.
Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, or equivalent experience.
5+ years of experience as a DevOps, SRE, or Infrastructure Engineer, preferably supporting ML or data-intensive systems.
Strong experience with cloud platforms (AWS, GCP, or Azure) and with containers and container orchestration (Docker, Kubernetes).
Proficiency in infrastructure-as-code tools such as Terraform or Helm.
Solid understanding of CI/CD systems (GitLab CI, Jenkins, ArgoCD, etc.).
Experience with Python and SQL.
Experience with cloud security, IAM (Identity and Access Management), and access control.
Experience analyzing and optimizing hardware performance.
Experience with GPU cluster management.
What Makes You Stand Out
Experience managing large-scale distributed data processing systems.
Experience analyzing and optimizing ML training workloads.
Background in observability stacks (Prometheus, Grafana, ELK, OpenTelemetry).
Contributions to open-source DevOps or ML infrastructure projects.
* Please note: The base salary range listed in this job description reflects compensation for candidates based in the United States. While we prefer candidates located in the U.S., we are also open to qualified talent working remotely across:
Canada - Base salary range (Canada - all locations): $130k - $160k CAD