We're building the first visual, transparent, and interactive deep-research platform.
Our platform is industry-agnostic, but we're starting with finance. The financial services industry manages $26.5 trillion in assets, yet professionals spend 60-70% of their time manually gathering data from scattered sources. Keru.ai automatically ingests dozens of sources (SEC filings, expert calls, alternative data) and deploys AI agents that conduct deep research while creating interactive decision trees. The agents surface insights and connections that would take analysts hours to find. Analysts explore and modify findings in real time, then export deliverables like presentations and reports, all with full audit trails showing exactly how conclusions were reached.
Keru.ai was founded by two Palantir veterans with 20 years of combined experience building core parts of Palantir's Gotham and Foundry platforms. Our founders created Palantir Quiver, the analytics engine behind $100M+ enterprise deals with BP and Airbus, architected core compute and data systems, led major Department of Defense projects, and served as Head of Business Engineering at Citadel.
We're backed by founders of OpenAI, Facebook AI, MotherDuck, DBT, and Outerbounds.
As a Data Engineer at Keru.ai, you'll be the architect of the data infrastructure that powers our AI-native research platform. You'll own the pipelines that ingest, transform, and deliver critical financial data, from SEC filings to proprietary vendor feeds, ensuring our platform has the reliable, high-quality data foundation that sophisticated financial research demands.
This role embodies our belief that exceptional AI requires exceptional data. Your pipelines will feed the research workflows of portfolio managers at firms managing billions in assets. Your data quality decisions directly impact million-dollar investment outcomes.
Within your first 90 days, you will:
Own and optimize our SEC data ingestion pipelines end-to-end
Build and maintain integrations with key data vendors
Develop deep expertise in financial data formats, taxonomies, and quality standards
Ship improvements that measurably increase data freshness and reliability
This is the right role if you want to build the data backbone of the future of financial research, with guidance from engineers who've scaled enterprise data platforms from zero to global adoption.
Own critical data pipelines: Design, build, and maintain the pipelines that ingest SEC filings (EDGAR), vendor data feeds, and alternative data sources into our platform (see the illustrative sketch after this list).
Ensure data quality at scale: Implement validation, monitoring, and alerting systems that guarantee the accuracy and freshness our clients' research depends on.
Architect for reliability: Build fault-tolerant, self-healing pipelines that handle the unpredictable nature of external data sources and vendor APIs.
Optimize performance: Solve complex challenges around data freshness, processing latency, and storage efficiency for large-scale financial datasets.
Drive data infrastructure innovation: Identify opportunities to expand our data coverage, improve pipeline efficiency, and enhance data accessibility for our AI platform.
Collaborate across teams: Work closely with product and AI engineers to ensure our data infrastructure meets the evolving needs of the platform and our clients.
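To give a flavor of the pipeline work, here is a deliberately minimal sketch of an EDGAR ingestion step with a basic freshness check. It is illustrative only, not our production code: the data.sec.gov submissions endpoint and field names reflect the SEC's public API but should be verified against current SEC documentation, and the CIK and contact address are placeholders.

```python
"""Illustrative sketch: pull one company's recent filings from SEC EDGAR and
apply a simple freshness check. Endpoint and field names assume the public
data.sec.gov submissions API; verify against current SEC documentation."""
import datetime

import requests

SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik:010d}.json"
# The SEC asks for a descriptive User-Agent with contact details on every request.
HEADERS = {"User-Agent": "example-pipeline (contact@example.com)"}  # placeholder contact


def fetch_recent_filings(cik: int) -> list[dict]:
    """Fetch the recent-filings index for one company and normalize it into rows."""
    resp = requests.get(SUBMISSIONS_URL.format(cik=cik), headers=HEADERS, timeout=30)
    resp.raise_for_status()
    recent = resp.json().get("filings", {}).get("recent", {})
    # The API returns parallel arrays; zip them into one record per filing.
    return [
        {"form": form, "filed": filed, "accession": accession}
        for form, filed, accession in zip(
            recent.get("form", []),
            recent.get("filingDate", []),
            recent.get("accessionNumber", []),
        )
    ]


def check_freshness(rows: list[dict], max_age_days: int = 7) -> None:
    """Fail loudly if the newest filing is older than the freshness target."""
    if not rows:
        raise ValueError("no filings returned")
    newest = max(datetime.date.fromisoformat(row["filed"]) for row in rows)
    age_days = (datetime.date.today() - newest).days
    if age_days > max_age_days:
        raise ValueError(f"stale data: newest filing is {age_days} days old")


if __name__ == "__main__":
    filings = fetch_recent_filings(cik=320193)  # Apple's CIK, used purely as an example
    check_freshness(filings)
    print(f"ingested {len(filings)} filings")
```

In production, a step like this runs under an orchestrator such as Airflow or Temporal, wrapped in the validation, monitoring, and alerting described above.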
3–5 years of data engineering experience with a track record of building and maintaining production data pipelines.
ETL/ELT expertise: Deep experience designing and operating data ingestion, transformation, and orchestration systems.
Strong Python skills with experience in data processing frameworks and pipeline orchestration tools (Airflow, Temporal, or similar).
SQL proficiency: Advanced SQL skills and experience with analytical databases.
Data quality mindset: Experience implementing data validation, monitoring, and observability for critical pipelines.
API integration experience: Comfort working with external APIs, handling rate limits, authentication, and unreliable endpoints (a short sketch of this pattern follows this section).
Financial data experience: Familiarity with SEC EDGAR, XBRL, or financial data vendors (Bloomberg, FactSet, S&P, etc.).
Streaming experience: Background with real-time data processing (Kafka, Flink, or similar).
Rust or Node.js experience.
Startup experience where you owned data infrastructure end-to-end.
Cloud infrastructure experience: Hands-on with AWS data services, Kubernetes, or infrastructure-as-code.
Don't check every box? Apply anyway. We prioritize speed of learning, problem-solving skills, attention to detail, and drive to build world-class data infrastructure.
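To illustrate the API integration bullet above, here is a rough sketch of the kind of pattern we care about: retrying an unreliable vendor endpoint with exponential backoff while honoring rate limits. It is a simplified example, the vendor URL is hypothetical, and it uses only requests and the standard library.

```python
"""Illustrative sketch: call an unreliable external API with exponential backoff,
honoring HTTP 429 rate-limit responses. The endpoint below is hypothetical."""
import time

import requests


def fetch_with_retries(url: str, max_attempts: int = 5, base_delay: float = 1.0) -> dict:
    """GET a JSON payload, backing off on rate limits and transient server errors."""
    for attempt in range(1, max_attempts + 1):
        resp = requests.get(url, timeout=30)
        retryable = resp.status_code == 429 or resp.status_code >= 500
        if not retryable or attempt == max_attempts:
            resp.raise_for_status()  # raises on any remaining error status
            return resp.json()       # success
        # Respect a numeric Retry-After header when the vendor sends one,
        # otherwise back off exponentially.
        retry_after = resp.headers.get("Retry-After", "")
        delay = float(retry_after) if retry_after.isdigit() else base_delay * 2 ** attempt
        time.sleep(delay)
    raise RuntimeError("unreachable")  # the loop always returns or raises


if __name__ == "__main__":
    payload = fetch_with_retries("https://vendor.example.com/v1/prices")  # hypothetical endpoint
    print(f"received {len(payload)} records")
```

Real pipelines layer this kind of handling inside the orchestration and observability stack listed below.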
You'll be directly mentored by engineers who built Palantir's core data systems. Expect:
Weekly 1:1s with senior engineers who've architected enterprise-scale data platforms
Deep architectural reviews and guidance on pipeline design
Clear growth path toward technical leadership and data platform ownership
Learn by building: production systems that power real financial research
At Keru.ai, mentorship accelerates strong data engineers into exceptional technical leaders.
Backend: Python, Node.js, Rust, PostgreSQL, Redis
Data Infrastructure: Apache Airflow, Kafka, Temporal, dbt
AI/ML: OpenAI/Anthropic/OpenRouter, Vector Databases
Infrastructure: AWS, Docker, Kubernetes
Monitoring: Datadog
Tools: Git, GitHub Actions, Pulumi
Comprehensive medical, dental, and vision insurance for employees and dependents, plus a 401(k)
Automatic coverage for basic life, AD&D, and disability insurance
Daily lunch in office
Development environment budget (latest MacBook Pro, multiple monitors, ergonomic setup, and any development tools you need)
Unlimited PTO policy
"Build anything" budget - dedicated funding for whatever tools, libraries, datasets, or infrastructure you need to solve technical challenges, no questions asked
Learning budget - attend any conference, course, or program that makes you better at what we're building
Forward-Deployed with Product DNA: We own customer outcomes while building a product company. We don't win if our customers don't win. That means embedding, iterating, and deploying where our customers are.
Extreme Ownership: We have a big vision and everyone owns it. If you notice a problem, you own it - diagnose, coordinate, and close the loop. Authority comes from initiative, not job titles, and once you step up, you're accountable for the outcome.
Production-First Engineering: We design for customers' most critical workloads from day one. The platform runs on durable execution paths, blue/green deploys, automated rollbacks, and a continuous-delivery pipeline with end-to-end observability, so every change lands safely and stays resilient under real-world load.
Trust as the Default: We operate on the simple premise that people do their best work when confidence is mutual and earned in the open. That means we show our work, keep our promises, and flag risks before they bite. Automated tests, uptime dashboards, and clear communication back our competence; predictable delivery proves our consistency; candid retros and honest trade-offs reveal our character. Put together, trust isn't an aspiration, it's the baseline everyone can count on.
Keep Raising the Bar: We block time for training, code-health sprints, and deep-dive tech talks, because a sharper team and a cleaner stack pay compounding dividends. Continuous learning isn't a perk, it's part of the job.
Keru.ai is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind. We are committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment.