LLM Evaluation Jobs

Browse 26 jobs hiring in LLM evaluation now. Check out companies such as Welocalize, WHOOP, and Flock Safety, hiring in Charlotte, Laredo, and Madison.

Welo Data is seeking native English annotators in the U.S. to produce high-quality ground truth and evaluate model outputs for personalized music, podcast, and audiobook experiences.

Posted 2 days ago

WHOOP is hiring a Senior AI/ML Engineer to design, build, and operate production AI systems and LLM tooling that power personalized, member-facing experiences like WHOOP Coach and AI Support.

Flock Safety Hybrid No location specified
Posted 5 days ago
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Learning & Development
Equity
Paid Holidays
Paid Time-Off
WFH Reimbursements
Child Care stipend
Maternity Leave
Paternity Leave

Lead the design and productionization of agentic AI systems and an evaluation platform to power Night Shift, Flock Safety’s investigator-facing LLM agent product.

Senior Machine Learning Engineer needed to build and deploy scalable, production ML systems that improve healthcare outcomes and operational efficiency.

Anduril Industries Hybrid Costa Mesa, California, United States
Posted 7 days ago

Anduril's Thunderforge team is hiring a Prompt Engineering Intern to develop prompts, agent graph architectures, and test/evaluation tooling for AI-enabled wargaming.

Posted 9 days ago
Inclusive & Diverse
Mission Driven
Work/Life Harmony
Diversity of Opinions
Friends Outside of Work
Empathetic
Collaboration over Competition
Fast-Paced
Transparent & Candid
Medical Insurance
Dental Insurance
Vision Insurance
Disability Insurance
Learning & Development
401K Matching
Paid Time-Off
WFH Reimbursements
Paid Holidays
Equity
Flex-Friendly

Lead experimentation, trace analysis, and metric design to measure and improve Replit's AI agent, converting agent traces into product-changing insights for engineering and leadership.

YouGov Hybrid New York, United States of America
Posted 9 days ago

YouGov seeks a hands-on Data Scientist/AI Engineer to build and deploy LLM-based applications and advanced analytics for market research using survey, census, and behavioral datasets.

Posted 9 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony

Lead the design and delivery of Zapier’s unified AI platform as a Staff Applied AI Engineer, shaping runtime, orchestration, and evaluation systems that power the company’s AI products.

Posted 9 days ago

Jump seeks a US-based QA Engineer to own AI evaluation, labeling campaigns, and QA processes that improve generative AI outputs for its meeting assistant product.

Lead product strategy and execution for context, memory, and retrieval systems that power MagicSchool’s AI agents to deliver reliable, educator-focused assistance at scale.

Mursion, Inc Hybrid No location specified
Posted 10 days ago

Mursion is hiring a Prompt Engineer to craft production-grade LLM prompts, manage RAG/JSON workflows, and translate learning objectives into reliable AI-driven simulation behavior.

Posted 11 days ago

Sony AI's Research Ethics team is hiring a remote Engineering Intern (AI Ethics) to help build agentic AI infrastructure, run LLM evaluations, and develop tools for responsible AI in a research-driven environment.

Posted 11 days ago

Lead development of Arcade’s conversational AI product creation agent as the company’s first dedicated Product Manager for AI, reporting directly to the CEO.

Posted 18 days ago
Mission Driven
Inclusive & Diverse
Growth & Learning
Transparent & Candid
Medical Insurance
Dental Insurance
Vision Insurance
401K Matching
Flex-Friendly
Equity

Vetcove seeks an AI-focused BAML Engineer to design, implement, and maintain BAML-driven LLM workflows and evaluation tooling for its veterinary software platform.

Posted 18 days ago

Help developers adopt Judgment Labs' SDK and evaluation tools by building docs, demos, and sample agent setups as a Developer Relations Engineer in San Francisco.

Posted 18 days ago

Be part of a San Francisco-based venture-backed team as a Technical Writer crafting deep technical content on agent evaluation, monitoring, and reward modeling for a technical audience.

Posted 19 days ago

Oumi seeks a Research Scientist to advance open-source LLM and VLM research by developing models, datasets, benchmarks, and publishing results with the community.

Posted 19 days ago

Lead the strategy and delivery of distributed inference, LLM integrations, and on-device ML features at webAI to enable privacy-first, enterprise-grade AI on the edge.

Posted 22 days ago
Inclusive & Diverse
Growth & Learning
Customer-Centric
Collaboration over Competition
Medical Insurance
Maternity Leave
Flex-Friendly
401K Matching

Lead design and implementation of scalable AI infrastructure and developer tooling to accelerate Vanta’s AI-powered product initiatives.

Posted 22 days ago
Inclusive & Diverse
Growth & Learning
Customer-Centric
Collaboration over Competition
Medical Insurance
Maternity Leave
Flex-Friendly
401K Matching

Lead applied AI product work at Vanta by designing, shipping, and scaling LLM-powered features that accelerate customer compliance and trust.

An AI engineering role focused on building and improving voice-first and omnichannel credit-servicing agents using Python and integrated language models at an early-stage fintech startup.

Join Commure's Ambient Scribe team as a Senior Backend Engineer to build and scale eval and AI infrastructure that powers next-generation clinical AI products.

Posted 27 days ago

Bond Studio AI is hiring a Staff AI Engineer to design and implement production AI systems and multi-agent LLM architectures that power agentic 3D design experiences for real-world spaces.

Kilo Code Hybrid No location specified
Posted 28 days ago

Kilo Code seeks a hands-on Solutions Engineer to run high-leverage demos and POCs, bridge sales and engineering, and help shape the company’s pre- and post-sales technical motions.

MLabs Hybrid No location specified
Posted 29 days ago

MLabs seeks an Applied AI Engineer to build and ship LLM-powered production systems that transform healthcare and life-sciences workflows.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Vision Insurance
Family Medical Leave
Paid Holidays

Lead the design and execution of evaluation, reliability, and production-scale testing for Anomali’s agentic AI features that automate SOC workflows and improve analyst productivity.

How much do LLM evaluation jobs pay?

Salary range*    Count    Share
Below 50k        0        0%
50k-100k         1        100%
Over 100k        0        0%
*average yearly salary (USD)

Top companies hiring for LLM evaluation jobs

Best cities to find LLM evaluation jobs