Browse 29 Inference Engineer jobs hiring now. Companies hiring include Ataraxis AI, Awesome Motive, and FM, with roles in Little Rock, Columbus, and Raleigh.
Contribute to cutting-edge AI research in oncology by building, optimizing, and deploying scalable machine learning models and evaluation frameworks at Ataraxis AI.
Tamarind Bio is hiring an AI/LLM Engineer in San Francisco to build scalable, production-grade workflows and enhance an ML copilot for computational biology.
Help architect and scale a production ML inference platform at Tamarind Bio to serve hundreds of biological models and support rapid customer growth.
Lead high-impact, product-aligned experiments on foundation models using PyTorch and distributed training to improve real-world customer outcomes at Liquid AI.
Tinder is hiring a Software Engineer III, Machine Learning (Engagement & Growth) to develop and scale personalization and recommendation models that optimize notifications, CRM, and user retention.
Sable is hiring an AI Engineer in San Francisco to develop and productionize multimodal deep-learning systems that power digital-human enterprise workers.
Gimlet Labs is hiring a Software Engineer (AI Performance) to drive model and GPU-level performance improvements for production-scale inference in San Francisco.
BentoML seeks an Inference Optimization Engineer to accelerate LLM inference across GPUs and distributed serving stacks, reducing latency and GPU costs while contributing to open-source tooling.
Drive the design and production deployment of secure, agent-based AI systems at 3E to deliver high-impact, customer-facing intelligence using LLMs and modern orchestration frameworks.
Intern with Arcade's backend and AI engineering team to build scalable model orchestration, inference, and production backend systems for generative product creation.
Bjak seeks an MLOps Engineer to run and scale open-source LLMs into production, optimizing for cost, latency, and reliability while working in a flexible hybrid model.
NVIDIA is seeking an experienced Embedded Field Applications Engineer to support customers building AI-enabled embedded systems on the Jetson platform across the NALA region.
Woven by Toyota is hiring a Perception Lead Engineer to lead ML model development and deployment for production autonomous vehicle perception systems.
Experienced SRE with distributed systems and LLM experience needed to design and operate scalable, reliable managed AI services for a mission-driven, sustainability-focused AI infrastructure company.
Contribute to Adobe Firefly’s GenAI Services by building optimized inference pipelines, integrating generative models into flagship products, and developing scalable ML systems for production.
Tessera Labs seeks an AI Agent Engineer to design and deploy scalable, secure multi-step agentic pipelines using LLMs and tool-calling integrations for enterprise applications.
Plaid is hiring a Senior Software Engineer to architect, build, and operate scalable ML infrastructure—feature stores, pipelines, deployment and inference tooling—to accelerate trustworthy AI across the company.
Build and scale mission-critical ML systems at TwelveLabs to power state-of-the-art multimodal video understanding models.
Lead the design and production of large-scale, transformer-based and multi-task ML systems on Spotify's Ads R&D team to improve ad targeting, user experience, and business metrics.
Ironclad is looking for a Staff Software Engineer - Applied AI to build and productionize LLMs, RAG systems, and document-understanding services that deliver actionable contract insights.
An experienced systems and ML-inference engineer is needed to lead development of low-latency, high-throughput inference pipelines spanning on-device and cluster deployments.
Join Additive as a Member of Technical Staff — Backend to design and build core backend systems that integrate ML inference, monitoring, and business logic for tax-focused applications.
Deepgram is hiring a Backend Engineer to build and optimize high-performance inference services for speech processing, orchestration, and low-latency audio pipelines.
Work in-person with a tight-knit, mission-driven team in San Francisco to build the next-generation AI hiring platform that runs real-time interview conversations and scales to thousands of users.
Lead the personalization strategy and architecture at Launch Potato as Principal ML Engineer, building production-grade recommender systems and ML platforms that drive measurable business impact.
Launch Potato seeks a Principal Machine Learning Engineer to establish the technical vision for personalization and build ML platforms that drive billions of predictions across its digital media brands.
Lead Launch Potato’s personalization vision as a Principal ML Engineer, architecting and shipping large-scale recommender systems that serve billions of predictions and directly impact business KPIs.
Lead Launch Potato's personalization strategy and architect large-scale recommender systems to deliver billions of predictions and measurable business impact.
Help build the next generation of adaptive robotics software as a mid-level generalist engineer working on simulation and robot control at an early-stage, VC-backed startup.
Below 50k*: 0 | 50k-100k*: 1 | Over 100k*: 0