Browse 21 Model Evaluation jobs hiring now. Companies hiring include Cohere, Prime Time Consulting, and Shae Group, with roles in Charlotte, New York, and Fort Worth.
Cohere is hiring a Member of Technical Staff for Pretraining Evals to design, implement, and improve robust evaluation and statistical pipelines that measure base model capabilities across scales.
Experienced Data Scientist needed to develop and evaluate automated tokenization and POS annotation solutions for speech and text in support of government-focused NLP projects.
Provide fractional CTO-level AI architecture and safety advisory support to Shae Group, guiding model reliability, vendor choices, and design decisions across high-impact AI products.
Lead the design, evaluation, and productionization of machine learning models and signal measurement frameworks to power PayPal's risk solutions and signal marketplace.
Senior Machine Learning Engineer needed to build and deploy scalable, production ML systems that improve healthcare outcomes and operational efficiency.
YouGov seeks a hands-on Data Scientist/AI Engineer to build and deploy LLM-based applications and advanced analytics for market research using survey, census, and behavioral datasets.
Lead development of Arcade’s conversational AI product creation agent as the company’s first dedicated Product Manager for AI, reporting directly to the CEO.
Elsevier is hiring a Senior Data Analyst to lead analytics and evaluation frameworks for generative AI models used in healthcare, ensuring accuracy, safety, and clinical relevance.
Vetcove seeks an AI-focused BAML Engineer to design, implement, and maintain BAML-driven LLM workflows and evaluation tooling for its veterinary software platform.
Oumi seeks a Research Scientist to advance open-source LLM and VLM research by developing models, datasets, benchmarks, and publishing results with the community.
Lead the strategy and delivery of distributed inference, LLM integrations, and on-device ML features at webAI to enable privacy-first, enterprise-grade AI on the edge.
Experienced ML/AI engineer needed to lead development and productionization of LLM- and embedding-based features for Watershed's enterprise sustainability platform.
Handshake AI is hiring a contract Red Teaming Domain Expert to craft adversarial prompts and stress-test LLMs for safety and robustness across real-world edge cases.
Figma is hiring a seasoned Technical Program Manager to drive AI platform programs that scale annotation, evaluation, and model delivery across engineering, research, and product teams.
Join Commure's Ambient Scribe team as a Senior Backend Engineer to build and scale eval and AI infrastructure that powers next-generation clinical AI products.
Lead the architecture and delivery of large-scale, regulated AI systems—driving multi-agent, multi-modal solutions and engineering standards across cross-functional teams.
Lead the next generation of AI-driven ranking and recommendation systems for LinkedIn's Feed to improve relevance, personalization, and member engagement at massive scale.
Lead a small engineering team to build and scale LinkedIn’s HALO model and agent evaluation platform, combining hands-on technical delivery with people and cross-functional leadership.
Lead the design, research, and deployment of novel AI systems at Campus to personalize and measurably improve the student learning experience.
Bond Studio AI is hiring a Staff AI Engineer to design and implement production AI systems and multi-agent LLM architectures that power agentic 3D design experiences for real-world spaces.
Kilo Code seeks a hands-on Solutions Engineer to run high-leverage demos and POCs, bridge sales and engineering, and help shape the company’s pre- and post-sales technical motions.
Below 50k*: 0
50k-100k*: 0
Over 100k*: 20