Job details

AI Engineer III

Mission

We’re reimagining instant shopping so that technology never stands in your way; instead, it accelerates you exponentially toward your goal by forming a deep connection with your needs and desires. In the Personal Superintelligence Lab, you will lead the design and deployment of agentic AI that reasons over rich, real‑world context and constraints, grounded in up‑to‑the‑minute knowledge and leveraging our unparalleled delivery speed. Your work will push the state of the art in alignment, grounding, and multi‑agent orchestration, while landing breakthroughs safely and at scale in production.

Scope and Impact

As an AI Engineer III, you will be the technical lead across context engineering, RLHF/RLVR, and low‑latency serving. You’ll define the architecture, standards, and evaluation strategy that connect research to real‑world lift. You’ll mentor colleagues, influence cross‑functional roadmaps, and ship systems that deliver measurable improvements to core customer and business outcomes, without disclosing competitive intelligence.

Areas of Leadership and Contribution

Advanced Context & Grounding Research:

- Set the strategy for context engineering to maximize precision/recall of key order metrics across sessions, households, locales, and time.

- Architect multi‑modal context integration (temporal, spatial, behavioral) and real‑time grounding with dynamic constraint satisfaction.

- Establish retrieval freshness, geo/time‑aware constraints, and memory policies; formalize context schemas and data contracts.

- Champion declarative prompt/program compilation (e.g., DSPy) for systematic, testable LLM behavior.

- Design multi‑agent orchestration patterns (e.g., graph‑based agents via LangChain/LangGraph, CrewAI, AutoGen, LlamaIndex) that yield robust emergent reasoning.

Alignment and learning systems:

- Lead supervised reasoning-centered fine‑tuning with rigorous data curation, synthetic data generation, and QA; institute golden sets and rubric/pairwise evals.

- Own the reasoning architecture and evaluation strategy—planning, tool selection, reflection, and uncertainty-aware decision-making—to deliver robust, low-latency, grounded outcomes at scale.

- Drive parameter‑efficient adaptation strategies (LoRA/QLoRA and text-to-LoRA) with clear criteria for when to specialize vs. generalize.

- Architect RLHF and RLVR pipelines; build preference data loops, scalable oversight, and guardrails.

- Own policy optimization strategy: expert use of DPO/PPO/GRPO/GSPO and advancement beyond them (constrained optimization, regularized objectives, KL‑control) with formal safety considerations.

- Ensure robust offline‑to‑online correlation via counterfactual/IPS/DR estimators and stress tests across traffic segments.

Safety, robustness, and privacy:

- Establish interpretability, controllability, and alignment verification practices for agentic systems.

- Develop safeguards against reward hacking and unsafe exploration; enforce distributional robustness and content policy compliance.

- Advance privacy‑preserving methods (data minimization, federated/on‑device learning where appropriate) with privacy‑by‑design.

Systems, serving, and evaluation at scale:

- Architect low‑latency, cost‑efficient inference (quantization, caching, batching, streaming) with resilient fallbacks and red‑teaming.

- Build eval frameworks that tightly couple offline metrics with online performance and safety criteria; define promotion gates.

- Use relevant APIs to perform high‑fidelity data augmentation that strengthens grounding, disambiguation, and availability‑aware suggestions.

Experimentation and cross‑functional impact:

- Partner closely with Engineering and Data Science to design experiments, define success criteria, and iterate quickly from signal to lift.

- Translate ambiguous product goals into crisp technical milestones; maintain clear documentation, incident response, and learning playbooks.

- Mentor colleagues; raise the bar on design quality, reproducibility, and ethical rigor.

Requirements:

PhD in Computer Science, Machine Learning, or equivalent research experience with significant contributions to AI/ML literature.
7+ years of building and shipping large‑scale ML systems with significant ownership; proven impact in production LLM or RL‑driven products.
Mastery of advanced fine-tuning techniques including LoRA/QLoRA, adapter methods, and parameter-efficient transfer learning.
Research experience with agentic AI frameworks, multi-agent systems, and declarative programming approaches (DSPy, LangChain ecosystem).
Strong systems engineering capabilities with PyTorch, distributed training, and cloud-native ML infrastructure.
Track record of publications in top-tier venues (NeurIPS, ICML, ICLR, AAAI) or equivalent industry impact.
Deep expertise in transformer architectures, SFT, and RLHF; hands‑on leadership with RLVR and verifiable reward design.
Mastery of policy optimization (DPO/PPO/GRPO/GSPO) and the ability to extend/regularize policies under safety, latency, and cost constraints.
Strong grounding in offline evaluation, counterfactual estimators, and safe online ramp strategies.
Systems fluency: PyTorch, distributed training, low‑latency serving, observability, and cloud‑native ML infra.
Demonstrated leadership across cross‑functional teams, with clear communication and mentoring track record.
Commitment to responsible AI: privacy, safety, and alignment principles embedded end‑to‑end.

Preferred Qualifications:

Research or applied work in multi‑agent systems, decision theory, or declarative programming (e.g., DSPy).
Experience with formal methods for safety, program synthesis, or automated reasoning.
Contributions to open‑source AI frameworks or foundational model development.
Experience with privacy‑enhancing technologies, federated/on‑device learning, or identity/memory architectures.

Tooling and Stack:

Fine‑tuning: Unsloth for rapid prototyping; TRL for RLHF/RLVR workflows and policy optimization.
Retrieval, evaluation, and orchestration: pragmatic use of graph‑based agent frameworks and vector retrieval systems as appropriate.

What We Offer:

A front‑row seat in redefining instant shopping with personal superintelligence deployed at massive scale.
Deep collaboration with exceptional researchers and engineers; publication support where appropriate.
Access to world‑class compute, datasets, and experimentation infrastructure.
Competitive compensation with meaningful upside tied to breakthrough AI applications.

Compensation:

Gopuff pays employees based on market pricing and pay may vary depending on your location. The salary range below reflects what we’d reasonably expect to pay candidates. A candidate’s starting pay will be determined based on job-related skills, experience, qualifications, interview performance, and market conditions. These ranges may be modified in the future. Exceptions may be made for exceptional individuals. For additional information on this role’s compensation package, please reach out to the designated recruiter for this role.
This role is eligible for a discretionary annual cash bonus and participation in Gopuff’s equity incentive plan.
Base Salary Range: $215,000 - $275,000

Benefits Overview:

Medical/Dental/Vision Insurance
401(k) Retirement Savings Plan
HSA or FSA eligibility
Long and Short-Term Disability Insurance
Mental Health Benefits
Fitness Reimbursement Program
25% employee discount & FAM Membership
Flexible PTO
Group Life Insurance
EAP through AllOne Health (formerly Carebridge)

The only predictable thing about life is that it’s wildly unpredictable. That’s where we come in.

When life does what it does best, customers turn to Gopuff to deliver their everyday essentials and to get them through day and night, workday and weekend.

We’re assembling a team of thinkers, dreamers & risk-takers...the kind of people who know the value of peace of mind in an unpredictable world. (And people who love snacks.)

Like what you’re hearing? Welcome to Gopuff.

The Gopuff Fam is committed to an inclusive workplace where we do not discriminate on the basis of race, sex, gender, national origin, religion, sexual orientation, gender identity, marital or familial status, age, ancestry, disability, genetic information, or any other characteristic protected by applicable laws. We believe in diversity and encourage any qualified individual to apply. We are an equal employment opportunity employer.

Gopuff

Gopuff is a consumer goods and food delivery company headquartered in Philadelphia. As of October 2021, we operate in more than 650 US cities through approximately 500 microfulfillment centers nationwide.
