About the Role
As a Member of Technical Staff (Inference), you’ll push the limits of inference frameworks, refine our agent architecture, and build the benchmarks that define performance at scale. You’ll own the systems that take our frontier models from the lab into lightning-fast, production-ready services.
This is not a maintenance role: you’ll be experimenting with the latest serving research, optimizing for every millisecond, and shipping infrastructure that our researchers and products depend on daily.
Responsibilities
Architect and optimize high-performance inference infrastructure for large foundation models
Benchmark and improve latency, throughput, and agent responsiveness
Work with researchers to deploy new model architectures and multi-step agent behaviors
Implement caching, batching, and prioritization to handle high-volume requests
Build monitoring and observability into inference pipelines
Qualifications
Strong experience in distributed systems and low-latency ML serving
Skilled with performance optimization tools and techniques, with experience delivering critical performance gains
Hands-on experience with vLLM, SGLang, or equivalent frameworks
Familiarity with GPU optimization, CUDA, and model parallelism
Comfort working in a high-velocity, ambiguity-heavy startup environment
What makes us interesting
Small, elite team of ex-founders, researchers from top AI labs, top CS grads, and engineers from top companies
True ownership: You will not be blocked by bureaucracy, and you will ship meaningful work within weeks rather than months
Serious momentum: We're well funded by top investors, moving fast, and focused on execution
What we do
Ship consumer products powered by cutting-edge AI research
Build infrastructure that supports both research and product
Pursue cutting-edge research that will open up new forms of consumer products
The Details
Full-time, onsite role in Menlo Park
Startup hours apply
Generous salary, with additional benefits to be discussed during the hiring process