Job details

Member of Technical Staff (All Levels) - Applied ML

About Basis

Basis equips accountants with a team of AI agents to take on real workflows.

We have hit product-market fit, have more demand than we can meet, and just raised $34M to scale at a speed that meets this moment.

Built in New York City.

About the Team

We build the ML systems that power Basis's AI Accountant. Our systems read documents, reason over context, and complete real accounting workflows safely and accurately.

We focus on the whole system, not just the model. We optimize everything around it: tools, memory, retrieval, orchestration, and evaluation. We push model providers to their limits when needed (custom runtimes, unusual packages, unconventional loops) and run experiments to learn quickly.

We work in small, focused pods alongside Platform, Product, and Accounting experts. We think in systems, debate trade-offs, and write code that's observable, understandable, and built for continuous learning in production.

About the Role

As an ML Engineer at Basis, you'll own end-to-end projects that bring intelligence into production. You'll be the Responsible Party (RP) for systems that help our agents reason, plan, and evaluate themselves. That means you'll scope, build, and deliver from first principles.

You'll have full autonomy: plan your projects, define success, run experiments, and decide when your system is ready to ship.

You'll move fast, instrument everything, and design for clarity. You'll build the scaffolding that lets models act safely and improve continuously.

This is a role for engineers who want to be both researchers and builders: reasoning through problems, experimenting with solutions, and shipping systems that get smarter over time.

What you’ll be doing:

Build and evolve our agent systems

Design and iterate multi-agent architectures that automate real accounting workflows.
Build in autonomy boundaries, tool usage, and fallback behaviors that make agents safe and reliable.
Manage context and memory for coherence across steps. Plan and execute agent loops with measurable success criteria.
Route, evaluate, and optimize models under real-world constraints (latency, cost, accuracy).

Design evaluation and experimentation frameworks

Build scalable evaluation pipelines (offline and online) that run hundreds of experiments automatically.
Define golden tasks, labeling strategies, and metrics that make performance measurable and comparable.
Instrument the stack to detect regressions, track error patterns, and drive continuous improvement.
Use data and experiments to drive product and architectural decisions, not just intuition.

Engineer for context and retrieval

Build prompt stacks and instruction hierarchies that structure model reasoning.
Create retrieval and indexing pipelines that surface relevant context efficiently.
Parse messy documents into structured representations that agents can understand.
Design guardrails and validation layers to keep behavior safe and predictable.

Operate as an RP: plan, build, deliver

Scope your projects clearly. Write concise specs and architecture docs that eliminate ambiguity.
Build, test, and instrument your systems end-to-end.
Communicate progress clearly: what's built, what's learned, what's next.
Work closely with your pod, teaching, unblocking, and sharing learnings as you go.

📍 Location: NYC, Flatiron office. In-person team.

What success looks like in this role

You scope, execute, and deliver your systems from concept to production.
You instrument everything, measure outcomes, and learn from data.
You design clean abstractions for complex ML systems that others can build on.
Your work makes the whole team faster and better through clear interfaces and insights.
You move fast, stay curious, and build with conviction and care.

Average salary estimate

$200,000 per year (est.); range $140,000 to $260,000
