JOB SUMMARY
We’re seeking a Data Engineer to build and optimize data pipelines for AI model training. You’ll work with large datasets, enhance data storage, and collaborate closely with ML engineers to improve the performance of Python-based data workflows. Ideal candidates have experience with ETL systems, orchestration tools, and multi-terabyte data processing. Familiarity with AWS, Kubernetes, and data lake technologies is a plus. This role is remote, with preference for candidates on the East Coast.
KEY RESPONSIBILITIES
Design and improve data pipelines that process large, multi-modal datasets from a variety of internal and external sources into training datasets for AI models.
Evolve our data storage layer to support analytics, schema evolution, reproducibility, and efficient data access.
Collaborate with ML engineers to improve the performance and reliability of Python-based data processing workflows.
QUALIFICATIONS
Minimum of 8 years of related experience with a Bachelor’s degree; or 6 years with a Master’s degree; or 3 years with a PhD; or equivalent experience.
Proven ability to design flexible, maintainable ETL systems.
Experience with data pipeline orchestration tools such as Prefect, Airflow, Argo, Databricks, or Spark.
Understanding of the ML model lifecycle; prior work with scientific or ML workflows is a plus.
Hands-on experience with multi-terabyte scale data processing.
Familiarity with AWS; Kubernetes experience is a bonus.
Knowledge of data lake technologies such as Parquet, Iceberg, and AWS Glue.
Strong Python software engineering skills.
Pragmatic mindset: able to evaluate tradeoffs and find solutions that empower ML researchers to move quickly.
Background in bioinformatics or chemistry is a plus.
ABOUT IAMBIC THERAPEUTICS
Iambic is a clinical-stage life-science and technology company developing novel medicines using its AI-driven discovery and development platform. Based in San Diego and founded in 2020, Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters. The Iambic platform has demonstrated delivery of new drug candidates to human clinical trials with unprecedented speed and across multiple target classes and mechanisms of action. Iambic is advancing a pipeline of potential best-in-class and first-in-class clinical assets, both internally and in partnership, to address urgent unmet patient need. Learn more about the Iambic team, platform, pipeline, and partnerships at iambic.ai.
MISSION & CORE VALUES
Our mission is to deliver better medicines through innovations in AI-based discovery technologies. The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.
PAY AND BENEFITS
We offer industry-leading competitive pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401(k) matching, and uncapped vacation to our team. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.