Nestmed is redefining post-acute healthcare with AI-driven technology that helps clinicians work more efficiently and provide better patient care.
Within one year, we've processed over half a million patient visits, with tens of thousands of clinicians using our product daily. We're now working with 7 of the top 10 post-acute healthcare enterprises in the U.S., helping shape the future of home healthcare delivery.
Founded by Stanford and YC alumni with deep healthcare and AI expertise, our founding team combines years of clinical experience with cutting-edge technical backgrounds from companies like Amazon, Google, Meta, and leading healthcare organizations. Backed by top investors including SciFi VC (Max Levchin, PayPal co-founder) and Mischief Capital (founded by Plaid's co-founder), we're building the next generation of healthcare infrastructure.
As the founding Backend Engineer on our LLM Orchestration team, you'll deploy and manage LLMs at scale, orchestrating them in complex production scenarios that directly impact patient care. You'll rebuild and maintain the core AI inference engine that powers all of Nestmed's intelligent capabilities across several thousand clinical conversations daily.
Our system orchestrates more than a dozen AI models, both fine-tuned in-house models and third-party APIs, with low latency and high availability. You'll work on complex technical challenges like routing requests to the right model based on clinical context, implementing sophisticated fallback strategies across multiple providers, optimizing inference costs through batching and caching, and ensuring clinical accuracy through comprehensive model evaluation pipelines.
This isn't just calling OpenAI APIs. You'll build orchestration logic that selects the optimal model for each clinical task, implements custom retry and circuit-breaker patterns for provider failures, manages rate limits across multiple concurrent workflows, and maintains detailed performance metrics across the entire AI pipeline. You'll start as the solo engineer on this critical infrastructure and grow the effort into a robust core AI engineering team.
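To give a flavor of the kind of orchestration logic involved, here is a minimal sketch of provider fallback with a circuit breaker. This is an illustration only, not Nestmed's actual implementation: the class names, thresholds, and provider callables are all hypothetical.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; half-opens after `cooldown` seconds."""

    def __init__(self, max_failures=3, cooldown=30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let a trial request through once the cooldown has elapsed.
        return time.monotonic() - self.opened_at >= self.cooldown

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()


def call_with_fallback(prompt, providers, breakers):
    """Try providers in priority order, skipping any whose breaker is open.

    `providers` is a list of (name, callable) pairs; `breakers` maps
    each name to its CircuitBreaker. Both are hypothetical stand-ins
    for real provider clients.
    """
    for name, call in providers:
        breaker = breakers[name]
        if not breaker.allow():
            continue  # provider is tripped; skip without paying a timeout
        try:
            result = call(prompt)
            breaker.record_success()
            return name, result
        except Exception:
            breaker.record_failure()
    raise RuntimeError("all providers unavailable")
```

A production version would add per-provider rate limiting, timeouts, and metrics, but the shape is the same: fail fast on known-bad providers and degrade gracefully to the next one.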
What you'll do:
- Build and optimize our core AI inference engine that routes requests across multiple LLM providers based on clinical context, cost, and latency requirements
- Design robust model-serving infrastructure with intelligent load balancing, failover mechanisms, and A/B testing frameworks for evaluating models in production
- Implement production-grade AI pipelines with comprehensive observability, distributed tracing, and real-time performance monitoring for healthcare-critical workloads
- Optimize inference cost and latency through intelligent request batching, response caching, model quantization, and dynamic provider-selection algorithms
- Build custom model fine-tuning and deployment pipelines for healthcare-specific tasks using frameworks like Transformers and vLLM, plus distributed training infrastructure
- Create prompt engineering systems that dynamically optimize prompts based on clinical context and historical model performance data
- Design evaluation frameworks that continuously monitor model accuracy, clinical safety, and regulatory compliance across all deployed models
- Build model versioning and deployment systems that support safe rollouts, instant rollbacks, and controlled experimentation in production healthcare environments
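Response caching, one of the cost levers mentioned above, can be sketched in a few lines. Again this is only illustrative, not our production code, and it assumes deterministic (temperature-0 style) calls where identical requests yield repeatable outputs; the class and parameter names are hypothetical.

```python
from collections import OrderedDict


class ResponseCache:
    """LRU cache for LLM responses, keyed by (model, prompt).

    Only safe when identical requests produce repeatable outputs,
    e.g. temperature-0 extraction or classification calls.
    """

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries
        self._store = OrderedDict()

    def get_or_call(self, model, prompt, call):
        key = (model, prompt)
        if key in self._store:
            self._store.move_to_end(key)  # mark as recently used
            return self._store[key]
        response = call(model, prompt)  # cache miss: hit the provider
        self._store[key] = response
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
        return response
```

In practice the key would also cover sampling parameters and prompt-template version, and the store would be a shared cache rather than in-process memory, but the cost saving comes from the same idea: never pay for the same inference twice.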
What we're looking for:
- 6+ years of backend engineering experience building high-performance distributed systems, with a focus on latency-critical applications and reliability engineering
- Deep production experience with LLMs, including multi-provider orchestration, custom model serving, and building reliable inference infrastructure at scale
- Strong expertise in ML infrastructure, including model-serving frameworks (TensorRT, vLLM, TorchServe), distributed training, and GPU optimization
- Experience with model evaluation and monitoring, including A/B testing frameworks, performance monitoring, and building observability for ML systems
- Proficiency in Python and ML frameworks, with hands-on experience fine-tuning models, engineering prompts, and deploying custom models to production
- A track record of scaling ML systems: optimizing inference costs, managing multiple model providers, and building reliable AI infrastructure
- An understanding of healthcare or other regulated industries, where model accuracy, auditability, and compliance are mission-critical
- Based in San Francisco and excited to work closely with AI researchers to productionize cutting-edge models for healthcare applications
You'll be building the AI infrastructure that processes millions of patient interactions, directly impacting care quality for thousands of patients daily. Every optimization you make reduces healthcare costs, improves clinical accuracy, and enables new AI capabilities that transform patient outcomes.
You'll start as the founding ML infrastructure engineer and build this into a world-class AI platform team. Join us in San Francisco to build the most sophisticated LLM orchestration system in healthcare alongside leading AI researchers and clinical experts.
If you’re passionate about building high-impact products that solve real-world problems, we’d love to hear from you. Apply today!