AI Evaluation Jobs

Browse 46 exciting AI evaluation jobs hiring now. Companies hiring include Weekday AI, Compa, and BAE Systems, in cities such as Akron, Milwaukee, and St. Paul.

Posted 7 hours ago

Experienced wet-lab biology PhDs are needed to assess and annotate experimental failure modes and recommend mitigations for an AI research benchmark.

Compa Hybrid No location specified
Posted 15 hours ago

Lead and grow Compa’s inaugural Applied AI team, driving production ML systems and MLOps practices to power enterprise compensation intelligence.

Posted 2 days ago

Lead the engineering and applied-LLM work to improve agent reliability, autonomy, and evaluation pipelines for a fast-moving startup building autonomous business agents.

Posted 2 days ago

Lead the AI product strategy for an enterprise cloud data protection platform, turning real-world customer needs into high-impact, AI-enabled product features and commercial launches.

Posted 3 days ago

Lead the design and delivery of agent-based AI and orchestration frameworks at Heidi to safely automate clinician workflows and scale clinical impact.

Posted 3 days ago

AirOps is hiring a Senior Product Manager to lead the Agents product — designing agent orchestration, evaluation frameworks, and workflows that turn AI insights into publish-ready content at scale.

Adtalem is seeking a Senior Analyst, Market Intelligence & Insights to lead always-on research and translate AI and edtech competitive intelligence into actionable insights and executive briefings for enterprise AI strategy.

Posted 8 days ago
Customer-Centric
Inclusive & Diverse
Transparent & Candid

Siena is hiring a Product Engineer to build full-stack AI-driven agent capabilities, shape evaluation systems, and deliver integrations that redefine customer experience and e-commerce.

Inclusive & Diverse
Customer-Centric
Mission Driven
Fast-Paced
Growth & Learning
Transparent & Candid
Diversity of Opinions
Work/Life Harmony
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Learning & Development
Fitness Stipend
401K Matching
Equity
Life insurance
Disability Insurance
WFH Reimbursements
Flex-Friendly
Paid Time-Off
Maternity Leave
Paternity Leave
Paid Holidays
Paid Volunteer Time
Sabbatical

Zillow's Agentic AI team is hiring a Machine Learning Engineer to design, train, evaluate, and ship agentic LLM solutions that improve user understanding and decision-making across the home search experience.

RWS is seeking part-time remote AI Data Specialists in Florida to perform data annotation, evaluation, and tagging tasks that improve AI content quality and safety.

Join Ataraxis AI as a Research Engineer (Data Science) to advance AI-driven precision oncology through rigorous data pipelines, reproducible research, and publication-grade scientific contributions.

Posted 12 days ago

Lead the development of agentic LLM systems and domain-specific fine-tuning at Argon to build the next-generation AI OS for pharma from our NYC office.

Amigo is seeking an Applied Scientist to develop evaluation and safety frameworks that ensure AI systems are reliable and safe for healthcare deployment.

Beyondsoft Consulting Hybrid United States (Remote)
Posted 15 days ago

Beyondsoft is hiring a Data Analyst to prepare training data, anonymize documents, and validate LLM/model outputs for AI projects in a remote US-based role.

TekSynap Hybrid National Capital Region
Posted 15 days ago

Lead validation and automated assurance for agentic AI systems supporting NGA missions, focusing on benchmark design, regression testing, and CI/CD-integrated verification.

Braintrust Hybrid No location specified
Posted 16 days ago

Design and deliver developer-focused curriculum and hands-on programs that teach evals and agentic AI at Braintrust, working closely with engineers and product teams.

Posted 18 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Customer-Centric
Fast-Paced
Growth & Learning
Medical Insurance
Dental Insurance
401K Matching
Paid Time-Off
Maternity Leave
Paternity Leave
Mental Health Resources
Flex-Friendly

Lead the architecture and delivery of generative AI and multimodal systems that enable creative and contextual advertising capabilities across Netflix Ads.

Posted 19 days ago

WeRide seeks an AI Simulation Engineer to design AI-based simulation scenarios and agent behaviors that validate and accelerate autonomous vehicle algorithms.

Posted 19 days ago
Inclusive & Diverse
Diversity of Opinions
Passion for Exploration
Dare to be Different
Empathetic
Growth & Learning
Paid Holidays
Medical Insurance
Equity
401K Matching
Learning & Development
Social Gatherings
Flex-Friendly
Maternity Leave
Paternity Leave
Sabbatical

Canva is hiring a Senior Research Engineer to engineer agentic, multimodal evaluation systems that automatically assess and improve the quality and human alignment of generative design models.

MIRI, a nonprofit focused on reducing existential AI risk, is hiring a Technical Governance Team Manager to lead stakeholder engagement, run projects and people processes, and help produce rigorous technical governance research.

Posted 20 days ago

Eigenplane is hiring a Founding AI Research Scientist to drive LLM and agent research into scalable, interpretable production systems at an early-stage AI startup.

Posted 20 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony

Lead the technical direction and hands-on engineering for Zapier Agents, building production-grade LLM-driven agent capabilities, integrations, and evaluation systems that scale across thousands of apps and real customers.

Posted 20 days ago

Decagon seeks an experienced QA Lead in San Francisco to build and run QA for AI-powered customer service agents, moving from hands-on evaluation to scalable QA processes and team leadership.

Apply your SEC-filings and financial-analysis expertise remotely at Welo Data to evaluate AI-generated outputs from 10-K filings in a short-term contract role with potential extension.

Posted 21 days ago

Ironclad is looking for a Staff Software Engineer - Applied AI to build and productionize LLMs, RAG systems, and document-understanding services that deliver actionable contract insights.

LILT is hiring native Mandarin/Simplified Chinese linguists to perform remote prompt evaluation, multimedia content understanding, and text review for an AI-driven translation project.

Posted 21 days ago

DepthFirst AI is hiring a Research Engineer to develop and evaluate AI agents and training pipelines that discover and exploit software vulnerabilities at scale.

Unstructured seeks an experienced AI/ML Engineer to design, evaluate, and deploy secure ML solutions for Department of Defense and national security customers on government networks.

MLabs Hybrid No location specified
Posted 21 days ago

Work from the SoHo NYC office as an Applied AI Engineer building production LLMs and ML systems that accelerate bringing new therapies to market.

Tessera Labs seeks a Machine Learning Engineer Intern (Fall 2025, Hybrid in San Jose) to build and fine-tune LLM-driven multi-agent pipelines and enterprise tool integrations.

Posted 22 days ago

Lead the design and evaluation of long-term memory systems for LLMs at an early-stage AI startup focused on building self-improving agents.

Weekday AI Hybrid No location specified
Posted 22 days ago

Work with a top AI research lab to evaluate and improve LLM performance on advanced economics tasks by providing expert, written feedback.

Weekday AI Hybrid No location specified
Posted 22 days ago

Help shape next-generation AI by evaluating advanced physics solutions and guiding research teams to improve model performance as a contract Physics AI Trainer.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake AI is hiring a Technical Program Manager, AI Operations to run high-impact AI data programs, ensuring scalable processes, data quality, and excellent customer outcomes.

Posted 23 days ago

Lead the design and deployment of cutting-edge 3D computer vision and generative ML models at Dandy to automate and improve dental manufacturing workflows.

Oura Hybrid No location specified
Posted 24 days ago

Oura is hiring a Senior AI Engineer to design evaluation systems and build custom LLM and agentic models that power next-generation, actionable health recommendations.

Posted 24 days ago

A 12-month AI Fellowship at the Gates Foundation to design, prototype, and deploy responsible AI solutions for global health and development while building capacity across program teams.

Work with Khan Academy to design and deploy generative AI features that improve literacy learning in a 24-month fixed-term Senior AI Engineer role.

Posted 26 days ago

Technical leader needed to architect and deliver complex, production C++ systems integrating sensors, hardware, and software for high-stakes intelligence programs in Reston, VA.

Posted 26 days ago

Lead product and context engineering efforts to improve LLM-driven AI agent performance and user experience for advice-focused client intents within Vanguard's Discretionary Advice Platform.

Decagon seeks an Agent Software Engineer intern to build and evaluate production-ready conversational AI agents that improve customer support, working onsite in San Francisco during Summer 2026.

Profound Hybrid New York City
Posted 28 days ago

Profound, an NYC AI startup backed by Sequoia, is hiring an AI/ML Engineer to build production-scale NLP and LLM systems for content classification, generation, and measurement.

Posted 28 days ago

Work at the intersection of research and deployment to turn Twelve Labs’ video understanding models into scalable, production solutions for customers.

Build and ship mission-critical conversational AI agents at Decagon, working directly with enterprise customers to create scalable, high-impact solutions.

Posted 29 days ago

Dandy is hiring a Senior Machine Learning Engineer to advance 3D computer vision and generative ML models that automate and scale dental appliance manufacturing.

NBCUniversal Hybrid 30 Rockefeller Plaza, New York, NEW YORK
Posted 29 days ago

NBCUniversal is hiring an Analyst, AI Strategy & Innovation to perform market and vendor analysis, build financial/business cases, and support cross-functional pilots and innovation programs across its media businesses.

How much do AI evaluation jobs pay?

Below 50k*: 0 (0%)
50k-100k*: 0 (0%)
Over 100k*: 1 (100%)
*average yearly salary (USD)

Top companies hiring for AI evaluation jobs

Best cities to find AI evaluation jobs