Browse 9 exciting Agent Evaluation jobs hiring now. Check out companies such as Replit, Arcade, and Decagon, hiring in San Diego, New Orleans, and Atlanta.
Lead experimentation, trace analysis, and metric design to measure and improve Replit's AI agent, converting agent traces into product-changing insights for engineering and leadership.
Lead development of Arcade’s conversational AI product creation agent as the company’s first dedicated Product Manager for AI, reporting directly to the CEO.
Lead the architecture and long-term evolution of Decagon’s agent orchestration engine to enable reliable, high-performance AI agent behavior at scale.
Lead applied AI product work at Vanta by designing, shipping, and scaling LLM-powered features that accelerate customer compliance and trust.
Lead the architecture and delivery of large-scale, regulated AI systems—driving multi-agent, multi-modal solutions and engineering standards across cross-functional teams.
Bond Studio AI is hiring a Staff AI Engineer to design and implement production AI systems and multi-agent LLM architectures that power agentic 3D design experiences for real-world spaces.
Lead the design and execution of evaluation, reliability, and production-scale testing for Anomali’s agentic AI features that automate SOC workflows and improve analyst productivity.
Help build and ship production AI agents at Sierra as a Software Engineer intern, contributing to the design, implementation, and real-world evaluation of agent features.
Lead the architecture and hands-on implementation of a critical ML subsystem at Basis, shaping how our AI agents think, learn, and operate in production.
Jobs by salary range: Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 1