Applied Research - Evals & Data

Building Open Superintelligence Infrastructure

Prime Intellect is building the open superintelligence stack - from frontier agentic models to the infra that enables anyone to create, train, and deploy them. We aggregate and orchestrate global compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals, and our async RL trainer. We enable researchers, startups, and enterprises to run end-to-end reinforcement learning at frontier scale, adapting models to real tools, workflows, and deployment contexts.

We recently raised $15M in funding ($20M raised in total), led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

Role Impact

This is a customer-facing role at the intersection of cutting-edge RL/post-training methods, applied data, and agent systems. You’ll directly shape how advanced models are aligned, evaluated, deployed, and used in the real world by:

Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale. Working with applied data from real deployments to continuously refine policies, improve reasoning, and enhance reliability and safety.
Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale. Building data capture, processing, and versioning workflows for feedback, model traces, and reward signals.
Bridging Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities. Collaborating closely with RL and eval teams to ensure real-world signals inform model alignment and reward shaping.
Prototyping in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions. Using applied evaluation data to iterate on model performance and discover new capabilities.

Customer-Facing Engineering

Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks.
Prototype agents, data pipelines, and eval harnesses tailored to real use cases, then hand off hardened systems to core teams.
Translate customer insights and evaluation results into roadmap and research direction.

Post-training & Reinforcement Learning

Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks.
Build evaluation harnesses and verifiers to measure reasoning, robustness, and agentic behavior in real-world workflows.
Integrate applied data collection and analytics into the post-training process to surface regressions, emergent skills, and alignment opportunities.
Prototype multi-agent and memory-augmented systems to expand capabilities for customer-facing solutions.

Agent Development & Infrastructure

Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making.
Extend and integrate with agent frameworks to support evolving feature requests and performance requirements.
Architect and maintain distributed training and inference pipelines, ensuring scalability and cost efficiency.
Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments.

Requirements

Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment.
Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines).
Deep expertise in distributed training/inference frameworks (e.g., vLLM, SGLang, Ray, Accelerate).
Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform).
Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL.
Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems.

What We Offer

Competitive Compensation + equity incentives
Flexible Work (remote or San Francisco)
Visa Sponsorship & relocation support
Professional Development budget
Team Off-sites & conference attendance

Growth Opportunity

You’ll join a mission-driven team working at the frontier of open superintelligence infrastructure. In this role, you’ll have the opportunity to:

Shape the evolution of agent-driven, data-informed solutions—from research breakthroughs to production systems used by real customers.
Collaborate with leading researchers, engineers, and partners pushing the boundaries of RL, evaluation, and post-training.
Grow with a fast-moving organization where your contributions directly influence both the technical direction and the broader AI ecosystem.

If you’re excited to move fast, build boldly, and help define how agentic AI is developed and deployed, we’d love to hear from you.

Ready to build the open superintelligence infrastructure of tomorrow?
Apply now to help us make powerful, open AGI accessible to everyone.

Average salary estimate

$240,000 per year (est.); range $180,000 to $300,000

Funding: Seed
Department: Research & Development
Seniority level: Senior Level
