We believe the future is adaptable, not one-size-fits-all. We will lead in efficient, real-time adaptation that combines algorithms with innovative interface design. Our global team, based in SF and beyond, brings together top talent in AI innovation. Backed by world-class investors, we're building Adaptable Intelligence.
You'll be one of our first engineering hires, working directly with our founders to set up the core inference systems that will power our product, including deployment pipelines, model serving, observability, and cloud infrastructure for large language models. You thrive in zero-to-one environments and enjoy owning the full lifecycle of LLM inference design and implementation. You’re comfortable wearing multiple hats, making pragmatic technical decisions, and laying down scalable, secure foundations for future growth of our LLM inference capabilities.
Build from zero to one: design and implement our entire LLM inference infrastructure, making critical architectural decisions for scalability and performance.
Own the inference stack: deploy, optimize, and maintain high-throughput, low-latency inference systems serving millions of requests.
Framework expertise: leverage frameworks like vLLM, SGLang, or similar to maximize inference efficiency and cost-effectiveness.
Performance optimization: fine-tune model serving configurations, implement batching strategies, and optimize GPU utilization.
Infrastructure scaling: design auto-scaling systems that can handle variable traffic patterns while controlling costs.
Monitoring & reliability: build comprehensive observability into our inference pipeline with proper alerting and incident response.
Cross-functional collaboration: work closely with our ML and product teams to understand requirements and deliver optimal serving solutions.
Proven 0→1 experience: you've previously built LLM inference systems from scratch in a production environment.
Framework proficiency: hands-on experience with modern inference frameworks (vLLM, SGLang, TensorRT-LLM, or similar).
Infrastructure expertise: strong background in distributed systems, containerization (Docker/Kubernetes), and cloud platforms (AWS/GCP/Azure).
Performance mindset: experience optimizing inference latency, throughput, and cost at scale.
Production experience: you've deployed and maintained ML systems serving real users in production.
Experience in a fast-paced startup environment
Contributions to open-source inference tools and frameworks
Experience with model quantization, pruning, or other optimization techniques
Knowledge of CUDA programming and GPU optimization
Experience serving multi-modal models (vision, audio, etc.)
Competitive salary + meaningful equity
Learning and development budget to support your growth as you adapt
Comprehensive medical benefits and generous PTO
Annual travel stipend to explore somewhere new—because building global technology means staying adaptable to new places and perspectives
Mission-driven team shaping the future of intelligence, where you'll enjoy high ownership and the opportunity to make a career-defining impact