AI Engineer, Agents & Evaluation

We’re looking for our first AI Engineer focused on agents and evaluation—a foundational hire who will shape how we build, measure, and scale intelligent systems.

The Opportunity: Design the Playbook for High-Performance AI Agents

We’re tackling one of the hardest—and most important—problems in software engineering: helping developers understand, evolve, and operate complex systems using autonomous and event-driven AI.

In this role, you’ll build the evaluation frameworks, task harnesses, and orchestration strategies that make our agents reliable, testable, and genuinely useful. Your work will directly improve our agents and will also create reusable benchmarks and artifacts that can inspire new approaches and advance the broader foundation model ecosystem.

If you love designing experiments, building systems, and iterating tightly between theory and code—and you’re excited by a very 0→1, research-engineering style role—this is for you.

What You Will Do

  • Create Task Evaluations That Matter: Design and implement task-specific evaluations that measure and improve agent quality. Each eval should both drive concrete iteration on our agents and spark broader innovation around the task itself.

  • Define Tasks, Datasets, and Harnesses: Clearly specify tasks, collect and curate balanced datasets, and build robust evaluation harnesses that can be used across agents and modeling approaches. There is ample room for architectural design and systems thinking here.

  • Build and Use a Reusable Evaluation Framework: Develop frameworks and tools for running evaluations at scale. Use these frameworks to tune existing agents and to guide the development of new ones in our environment.

  • Explore Agent Orchestration Strategies: Investigate and implement orchestration patterns (tooling, routing, decomposition, multi-agent setups, etc.) that allow agents to tackle increasingly complex, multi-step, and long-horizon tasks.

  • Apply Post-Training Techniques: Experiment with post-training approaches (e.g., fine-tuning, preference optimization, reward shaping, distillation) to produce high-performance models tailored to specific tasks and workflows.

  • Run Experiments End-to-End: Design, run, and analyze experiments with rigor. Turn experimental results into clear recommendations and concrete changes to model configurations, prompts, and system design.

  • Collaborate Deeply Across the Stack: Work closely with founders, product, and infrastructure engineers to ensure evaluations, agents, and platform primitives all reinforce each other.

What You Will Bring

  • MS or Ph.D. in a relevant field (e.g., Computer Science, Machine Learning, NLP) or equivalent practical experience

  • Strong background in machine learning and large language models, ideally including both research and hands-on implementation

  • 2–5 years working with LLM technology, with familiarity across:

    • Prompting and interaction patterns

    • Agent and tool orchestration strategies

    • Evaluation strategies for complex, open-ended tasks

  • Proficiency writing production-quality code, especially in Python; comfort working with TypeScript or modern web/backend stacks

  • Experience designing and running experiments, and interpreting results in messy, real-world settings

  • Self-motivated, comfortable operating in an unstructured, high-ambiguity environment

  • Strong communication skills and the ability to translate vague goals into concrete, testable setups

Bonus Points

  • Experience building agentic systems (tool-using agents, workflows, or multi-agent systems) in real products

  • Prior work on model evaluation frameworks, benchmarking, or reliability/robustness testing

  • Familiarity with modern ML tooling (training/inference stacks, experiment tracking, data pipelines)

  • Contributions to open-source LLM, tooling, or evaluation projects

  • Experience at an early-stage startup or research lab where you owned projects end-to-end

Benefits & Perks

  • Significant equity in an early-stage, venture-backed startup

  • Comprehensive Health Benefits (Medical, Dental, Vision)

  • Flexible PTO to ensure you have the time you need to recharge

Average salary estimate

$180,000 / year (est.)
Min: $150,000
Max: $210,000

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

Employment type: Full-time, hybrid
Date posted: December 16, 2025