Basis equips accountants with a team of AI agents to take on real workflows.
We have hit product-market fit, have more demand than we can meet, and just raised $34M to scale at the speed this moment demands.
Built in New York City.

We build the ML systems that power Basis's AI Accountant. Our systems read documents, reason over context, and complete real accounting workflows safely and accurately.
We focus on the whole system, not just the model. We optimize everything around it: tools, memory, retrieval, orchestration, and evaluation. We push model providers to their limits when needed (custom runtimes, unusual packages, unconventional loops) and run experiments to learn quickly.
We work in small, focused pods alongside Platform, Product, and Accounting experts. We think in systems, debate trade-offs, and write code that's observable, understandable, and built for continuous learning in production.
As an ML Engineer at Basis, you'll own end-to-end projects that bring intelligence into production. You'll be the Responsible Party (RP) for systems that help our agents reason, plan, and evaluate themselves. That means you'll scope, build, and deliver from first principles.
You'll have full autonomy: plan your projects, define success, run experiments, and decide when your system is ready to ship.
You'll move fast, instrument everything, and design for clarity. You'll build the scaffolding that lets models act safely and improve continuously.
This is a role for engineers who want to be both researchers and builders: reasoning through problems, experimenting with solutions, and shipping systems that get smarter over time.
Build and evolve our agent systems
Design and iterate on multi-agent architectures that automate real accounting workflows.
Build in autonomy boundaries, tool usage, and fallback behaviors that make agents safe and reliable.
Manage context and memory for coherence across steps. Plan and execute agent loops with measurable success criteria.
Route, evaluate, and optimize models under real-world constraints (latency, cost, accuracy).
Design evaluation and experimentation frameworks
Build scalable evaluation pipelines (offline and online) that run hundreds of experiments automatically.
Define golden tasks, labeling strategies, and metrics that make performance measurable and comparable.
Instrument the stack to detect regressions, track error patterns, and drive continuous improvement.
Use data and experiments, rather than intuition alone, to drive product and architectural decisions.
Engineer for context and retrieval
Build prompt stacks and instruction hierarchies that structure model reasoning.
Create retrieval and indexing pipelines that surface relevant context efficiently.
Parse messy documents into structured representations that agents can understand.
Design guardrails and validation layers to keep behavior safe and predictable.
Operate as an RP: plan, build, deliver
Scope your projects clearly. Write concise specs and architecture docs that eliminate ambiguity.
Build, test, and instrument your systems end-to-end.
Communicate progress clearly: what's built, what's learned, what's next.
Work closely with your pod, teaching, unblocking, and sharing learnings as you go.
📍 Location: NYC, Flatiron office. In-person team.
You scope, execute, and deliver your systems from concept to production.
You instrument everything, measure outcomes, and learn from data.
You design clean abstractions for complex ML systems that others can build on.
Your work makes the whole team faster and better through clear interfaces and insights.
You move fast, stay curious, and build with conviction and care.