Mission
We’re reimagining instant shopping so that technology never stands in the way. Instead, it accelerates you exponentially toward your goal by forming a deep connection with your needs and desires. In the Personal Superintelligence Lab, you will lead the design and deployment of agentic AI that reasons over rich, real‑world context and constraints, grounded in up‑to‑the‑minute knowledge and leveraging our unparalleled delivery speed. Your work will push the state of the art in alignment, grounding, and multi‑agent orchestration while landing breakthroughs safely and at scale in production.
Scope and Impact
As an AI Engineer III, you will be the technical lead across context engineering, RLHF/RLVR, and low‑latency serving. You’ll define the architecture, standards, and evaluation strategy that connect research to real‑world lift. You’ll mentor colleagues, influence cross‑functional roadmaps, and ship systems that deliver measurable improvements to core customer and business outcomes, all without disclosing competitive intelligence.
Areas of Leadership and Contribution
Advanced Context & Grounding Research:
- Set the strategy for context engineering to maximize precision/recall of key order metrics across sessions, households, locales, and time.
- Architect multi‑modal context integration (temporal, spatial, behavioral) and real‑time grounding with dynamic constraint satisfaction.
- Establish retrieval freshness, geo/time‑aware constraints, and memory policies; formalize context schemas and data contracts.
- Champion declarative prompt/program compilation (e.g., DSPy) for systematic, testable LLM behavior.
- Design multi‑agent orchestration patterns (e.g., graph‑based agents via LangChain/LangGraph, CrewAI, AutoGen, LlamaIndex) that yield robust emergent reasoning.
Alignment and learning systems:
- Lead supervised reasoning-centered fine‑tuning with rigorous data curation, synthetic data generation, and QA; institute golden sets and rubric/pairwise evals.
- Own the reasoning architecture and evaluation strategy—planning, tool selection, reflection, and uncertainty-aware decision-making—to deliver robust, low-latency, grounded outcomes at scale.
- Drive parameter‑efficient adaptation strategies (LoRA/QLoRA and text-to-LoRA) with clear criteria for when to specialize vs. generalize.
- Architect RLHF and RLVR pipelines; build preference data loops, scalable oversight, and guardrails.
- Own policy optimization strategy: expert use of DPO/PPO/GRPO/GSPO and advancement beyond them (constrained optimization, regularized objectives, KL‑control) with formal safety considerations.
- Ensure robust offline‑to‑online correlation via counterfactual/IPS/DR estimators and stress tests across traffic segments.
Safety, robustness, and privacy:
- Establish interpretability, controllability, and alignment verification practices for agentic systems.
- Develop safeguards against reward hacking and unsafe exploration; enforce distributional robustness and content policy compliance.
- Advance privacy‑preserving methods (data minimization, federated/on‑device learning where appropriate) with privacy‑by‑design.
Systems, serving, and evaluation at scale:
- Architect low‑latency, cost‑efficient inference (quantization, caching, batching, streaming) with resilient fallbacks and red‑teaming.
- Build eval frameworks that tightly couple offline metrics with online performance and safety criteria; define promotion gates.
- Use relevant APIs to perform high‑fidelity data augmentation that strengthens grounding, disambiguation, and availability‑aware suggestions.
Experimentation and cross‑functional impact:
- Partner closely with Engineering and Data Science to design experiments, define success criteria, and iterate quickly from signal to lift.
- Translate ambiguous product goals into crisp technical milestones; maintain clear documentation, incident response, and learning playbooks.
- Mentor colleagues; raise the bar on design quality, reproducibility, and ethical rigor.
The only predictable thing about life is that it’s wildly unpredictable. That’s where we come in.
When life does what it does best, customers turn to Gopuff to deliver their everyday essentials and to get them through day and night, workday and weekend.
We’re assembling a team of thinkers, dreamers & risk-takers...the kind of people who know the value of peace of mind in an unpredictable world. (And people who love snacks.)
Like what you’re hearing? Welcome to Gopuff.
The Gopuff Fam is committed to an inclusive workplace where we do not discriminate on the basis of race, sex, gender, national origin, religion, sexual orientation, gender identity, marital or familial status, age, ancestry, disability, genetic information, or any other characteristic protected by applicable laws. We believe in diversity and encourage any qualified individual to apply. We are an equal employment opportunity employer.
Gopuff is a consumer goods and food delivery company headquartered in Philadelphia. As of October 2021, we operate in more than 650 US cities through approximately 500 microfulfillment centers nationwide.