Model Evaluation Jobs

Browse 48 Model Evaluation jobs hiring now. Companies hiring include The College Board, The Browser Company, and Danaher, with openings in New Orleans, Providence, and St. Petersburg.

Lead the College Board’s strategy and operational implementation of automated scoring and AI/NL measurement to ensure valid, fair, and scalable solutions across major assessment programs.

Posted 2 days ago

Lead and grow the ML engineering practice at The Browser Company to build and ship LLM-powered, privacy-aware features that personalize and improve Dia’s browser experience.

Lead the design and execution of evaluation frameworks for multimodal AI systems at Danaher to ensure performance, robustness and safety across life-sciences and diagnostics products.

USAA Full-Time BREMERTON, Washington
Sponsored
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Customer-Centric
Fast-Paced
Growth & Learning
Medical Insurance
Dental Insurance
401K Matching
Paid Time-Off
Maternity Leave
Paternity Leave
Mental Health Resources
Flex-Friendly

Netflix is hiring a Machine Learning Scientist (L4) to research and develop generative computer-vision and graphics models that will be integrated into production tools for media creation.

Posted 4 days ago

Join Cartesia as Product Manager, Model Behavior, to define and elevate TTS/STT quality and evaluation, shaping how voice AI sounds, performs, and delights customers.

Posted 4 days ago

Cartesia is looking for a Post-Training Researcher to design and scale preference optimization, evaluation, and feedback-driven learning methods for multimodal foundation models.

Lead and scale a remote ML engineering team to design, deploy, and iterate on LLM-powered features that drive measurable product impact.

Blend seeks an experienced Associate Director of Product Management to lead discovery, specification, and iterative delivery of agentic AI products while aligning cross-functional teams to measurable outcomes.

Posted 4 days ago

Cartesia is hiring a senior Product Manager to define and lead the voice AI agent product area, building enterprise-grade speech-driven agents and evaluation standards using cutting-edge audio models.

Lead the development and scaling of LLM-driven product features as an Engineering Manager focused on ML strategy, team growth, and production-quality infrastructure in a remote-first, high-impact startup.

USAA Full-Time NASHVILLE, Tennessee
Sponsored

Applied Scientist Intern role at Wealth.com focused on building and productionizing LLM-, NLP- and CV-driven legal/financial AI assistants through hands-on model development and evaluation.

Lead the ML effort at Retell AI to build, evaluate, and deploy real-time LLM and audio models powering high-traffic voice agents.

Weekday AI Hybrid No location specified
Posted 6 days ago

Provide expert Biology knowledge to a generative AI research team by designing tasks, authoring guidelines, and evaluating model outputs in a long-term remote contract.

Hippocratic AI is hiring an RN Clinical Product Strategist in Palo Alto to turn nursing expertise into clinical evaluation, safety rails, and product guidance for healthcare LLMs.

Inclusive & Diverse
Mission Driven
Feedback Forward
Fast-Paced
Medical Insurance
Dental Insurance
Vision Insurance
Life insurance
Disability Insurance
Mental Health Resources
Health Savings Account (HSA)
Flexible Spending Account (FSA)
401K Matching
Equity
Maternity Leave
Paternity Leave
Some Meals Provided
Snacks
Social Gatherings

Lyra Health is seeking a Senior Machine Learning Engineer to build production ML and generative-AI tooling, platforms, and services that support clinical mental-health products and scale across the organization.

Comcast Hybrid NY - New York, 105 Wooster Street
Posted 8 days ago

Senior Analyst, GenAI supporting Comcast's Enterprise BI GenAI Center of Excellence to build, evaluate and operationalize generative AI workflows for marketing and creative content.

Posted 9 days ago

Lead and scale an AI engineering organization to deliver production-ready foundation models and LLM-powered products that impact millions of users in a fully remote environment.

Posted 9 days ago

Director of AI Engineering needed to lead a remote team in designing, deploying, and optimizing large-scale, production AI systems and foundational model infrastructure.

Chicago Cubs are hiring Data Scientists to develop and deploy analytical models and data pipelines that inform player evaluation, development, and strategic decisions across Baseball Operations.

Mercor Hybrid San Francisco
Posted 10 days ago

Mercor is hiring an AI Researcher in San Francisco to lead LLM evaluation research, publish benchmark papers, and build dataset and annotation offerings for top AI labs.

USAA Full-Time CHARLOTTE, North Carolina
Sponsored
Mastercard Hybrid New York City, New York
Posted 11 days ago
Inclusive & Diverse
Empathetic
Collaboration over Competition
Growth & Learning
Transparent & Candid

Lead strategy and roadmap for AI solutions in Mastercard’s AI Center of Excellence, translating ML capabilities into clear business outcomes while advising senior leaders.

Writer Hybrid New York City
Posted 11 days ago
Dare to be Different
Diversity of Opinions
Inclusive & Diverse
Collaboration over Competition
Fast-Paced
Growth & Learning

Customer-facing AI engineer focused on designing and operationalizing advanced prompts, prompt chains, and linguistic patterns to produce high-quality, formatted content for WRITER’s enterprise customers.

Lead and grow a hands-on ML systems engineering organization in Basis’s NYC office to build production multi-agent architectures, evaluation pipelines, and observability tooling that power the AI Accountant.

Posted 14 days ago

An acquired GenAI-native tax-document platform seeks an experienced Applied AI/ML Engineer to own and scale production ML systems that transform accounting workflows.

FurtherAI Hybrid San Francisco
Posted 14 days ago

FurtherAI is hiring a hands-on Data Scientist to lead evaluation, LLM tuning, and metrics for production AI systems supporting major insurance customers at our San Francisco headquarters.

Posted 14 days ago

At Rox, an Applied AI Engineer will build and deploy agentic LLM-powered workflows in production to supercharge revenue teams and iterate rapidly with customers and product partners.

Posted 15 days ago
Inclusive & Diverse
Feedback Forward
Collaboration over Competition
Growth & Learning

OpenAI is hiring a San Francisco-based software engineer to build evals, harnesses, and pipelines that drive model improvement and product reliability for advanced AI systems.

Posted 15 days ago

Lead the QA strategy for GenAI-powered e-learning features by designing prompt validation, HITL review workflows, and measurable evaluation protocols to ensure safe, reliable model behavior.

Posted 15 days ago
Inclusive & Diverse
Feedback Forward
Collaboration over Competition
Growth & Learning

Lead development of models, datasets, and evaluations that advance theoretical research in mathematics and related fields while partnering with academia and engineering teams.

Jobgether Hybrid No location specified
Posted 16 days ago

Work remotely as a Data Analyst producing and evaluating high-quality data and feedback to help improve AI systems for a US-based partner.

USAA Full-Time SAN ANTONIO, Texas
Sponsored
Posted 17 days ago

Epoch AI is hiring a remote Researcher to provide fast, rigorous analysis and forecasts across the AI pipeline for external partners and consultations.

Posted 17 days ago

Work remotely across AI projects to create, evaluate, and refine high-quality written data and prompts that improve model performance and user experience.

Posted 17 days ago

WitnessAI seeks a Machine Learning Engineer to design, evaluate, and deploy reliable, interpretable LLMs that power enterprise AI security and governance.

Posted 18 days ago

Lead the design, benchmarking, fine-tuning, and production deployment of specialized agents for enterprise customers using state-of-the-art RL and post-training techniques.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake AI is seeking a Senior AI Research Engineer to architect and scale large post-training and evaluation systems for LLMs and lead engineering efforts that translate research into production-grade benchmarks and pipelines.

Itron Hybrid United States of America, Texas, Austin
Posted 18 days ago

Itron is hiring an AI Data Science Analyst to support building and deploying AI-agent driven HR solutions that translate HR data into actionable insights.

Posted 18 days ago

Welo Data is hiring a Prompt Engineer & Data Analyst to design and evaluate prompts, curate datasets, and perform rigorous model analysis to enhance LLM capabilities and safety.

Posted 18 days ago

Contribute remotely to AI system improvement by creating, evaluating, and refining high-quality content and prompts that enhance model performance and user experience.

Lead the architecture and roadmap of Jiffy's core AI platform, driving model development, inference optimization, APIs, and agentic systems to power novel consumer and developer experiences.

Posted 20 days ago

Promise is seeking an AI Researcher to develop and operationalize ML and LLM solutions that streamline access to government benefits and improve service delivery.

USAA Full-Time PHOENIX, Arizona
Sponsored
Posted 21 days ago

Lead frontier LLM experimentation and productionize interpretable AI agent workflows as an early research scientist at DimRed.

Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Customer-Centric
Fast-Paced
Growth & Learning
Medical Insurance
Dental Insurance
401K Matching
Paid Time-Off
Maternity Leave
Paternity Leave
Mental Health Resources
Flex-Friendly

Lead a cross-disciplinary data science and ML team to deliver LLM-driven solutions, scalable pipelines, and enterprise analytics for Netflix's Content organization.

Posted 22 days ago

Lead high-impact, product-aligned experiments on foundation models using PyTorch and distributed training to improve real-world customer outcomes at Liquid AI.

Compa Hybrid No location specified
Posted 23 days ago

Lead and grow Compa’s inaugural Applied AI team, driving production ML systems and MLOps practices to power enterprise compensation intelligence.

Lead large-scale LLM training and synthetic data pipelines at Periodic Labs to build scientifically knowledgeable models and scale training across supercomputing infrastructure.

Posted 24 days ago

Lead the AI product strategy for an enterprise cloud data protection platform, turning real-world customer needs into high-impact, AI-enabled product features and commercial launches.

Mistral AI Hybrid No location specified
Posted 25 days ago

Join Mistral AI as a Model Behavior Architect to shape LLM behavior through prompt design, evaluation pipelines, and policy work informed by humanities expertise.

Posted 25 days ago

AirOps is hiring a Senior Product Manager to lead the Agents product — designing agent orchestration, evaluation frameworks, and workflows that turn AI insights into publish-ready content at scale.

How much do model evaluation jobs pay?

Below 50k*: 3 jobs (7%)
50k-100k*: 1 job (2%)
Over 100k*: 40 jobs (91%)
*average yearly salary (USD)