Job details

Research Scientist

Research Scientist / Machine Learning Scientist

Location: SF Bay Area / Hybrid / Remote

Type: Full-Time

About the Role:

The Client is seeking Machine Learning Scientists to help advance how we evaluate and understand AI models. You’ll help design and analyze experiments that uncover what makes models useful, trustworthy, and capable through human preference signals. Your work will contribute to the scientific foundations of understanding AI at scale.

This role is deeply interdisciplinary. You’ll work closely with engineers, product teams, marketing and the broader research community to develop new methods for comparing models, analyzing preference data, and disentangling performance factors like style, reasoning, and robustness. Your work will inform both the public leaderboard and the tools we provide to model developers.

If you’re excited by open-ended questions, rigorous evaluation, and research that’s grounded in real-world impact, you’ll find a meaningful home here.

We’re looking for:

• Hands-on experience training large-scale models, including reward models, preference models, and fine-tuning LLMs with methods like RLHF, DPO, and contrastive learning.

• Strong foundation in ML and statistics, with a track record of designing novel training objectives, evaluation schemes, or statistical frameworks to improve model reliability and alignment.

• Fluent in the full experimental stack, from dataset design and large-batch training to rigorous evaluation and ablation, with an eye for what scales to production.

• Deeply collaborative mindset, working closely with engineers to productionize research insights and iterating with product teams to align modeling goals with user needs.

Responsibilities:

• Design and conduct experiments to evaluate AI model behavior across reasoning, style, robustness, and user preference dimensions

• Develop new metrics, methodologies, and evaluation protocols that go beyond traditional benchmarks

• Analyze large-scale human voting and interaction data to uncover insights into model performance and user preferences

• Collaborate with engineers to implement and scale research findings into production systems

• Prototype and test research ideas rapidly, balancing rigor with iteration speed

• Author internal reports and external publications that contribute to the broader ML research community

• Partner with model providers to shape evaluation questions and support responsible model testing

• Contribute to the scientific integrity and transparency of The Client leaderboard and tools

Who is The Client?

Created by researchers from UC Berkeley’s SkyLab, The Client is an open platform where everyone can easily access, explore, and interact with the world’s leading AI models. By comparing them side by side and casting votes for the better response, the community helps shape a public leaderboard, making AI progress more transparent and grounded in real-world usage.

Why Join Us?

Trusted by organizations like Google, OpenAI, Meta, xAI, and more, The Client is rapidly becoming essential infrastructure for transparent, human-centered AI evaluation at scale. With over one million monthly users and growing developer adoption, our impact is helping guide the next generation of safe, aligned AI systems—grounded in open access and collective feedback.

Our work is regularly referenced by industry leaders pushing the frontier of safe and reliable AI, including Sundar Pichai, Jeff Dean, Elon Musk, and Sam Altman.

• High Impact: Your work will be used daily by the world’s most advanced AI labs.

• Global Reach: Develop data infrastructure that powers millions of real-world evaluations and influences AI reliability across industries.

• Exceptional Team: We are a small team of top talent from Google, DeepMind, Discord, Vercel, UC Berkeley, and Stanford.

Requirements:

• PhD or equivalent research experience in Machine Learning, Natural Language Processing, Statistics, or a related field

• Strong understanding of LLMs and modern deep learning architectures (e.g., Transformers, diffusion models, reinforcement learning from human feedback)

• Proficiency in Python and ML research libraries such as PyTorch, JAX, or TensorFlow

• Demonstrated ability to design and analyze experiments with statistical rigor

• Experience publishing research or working on open-source projects in ML, NLP, or AI evaluation

• Comfortable working with real-world usage data and designing metrics beyond standard benchmarks

• Ability to translate research questions into practical systems and collaborate across engineering and product teams

• Passion for open science, reproducibility, and community-driven research

What we offer:

• The cash compensation for this position has not yet been finalized. Actual compensation will depend on job-related knowledge, skills, experience, and candidate location.

• Competitive salary and meaningful equity

• Comprehensive healthcare coverage (medical, dental, vision)

• The opportunity to work on cutting-edge AI with a small, mission-driven team

• A culture that values transparency, trust, and community impact

Come help build the space where anyone can explore and help shape the future of AI.


Average salary estimate

$210,000 / year (est.)

min: $160,000

max: $260,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.
