Browse 22 exciting ML Inference jobs hiring now. Check out companies hiring, such as Weekday AI, NVIDIA, and Alignerr, in Denver, Yonkers, and Indianapolis.
Lead the design and delivery of edge-native AI infrastructure, building performant, verifiable, and developer-friendly systems for deployment across devices, vehicles, and satellites.
NVIDIA is hiring a Senior System Software Engineer to design and implement high-performance, open-source GPU inference software on the Dynamo team.
Alignerr is hiring a Senior C++ Full-Stack Engineer to build and optimize high-performance C++ systems and full-stack tooling for AI data pipelines and evaluation workflows on a remote contract basis.
Genies is hiring an ML Infra and Model Optimization Engineer to build and optimize scalable inference systems and production ML infrastructure for image and 3D generative models in a hybrid LA/SF role.
Tavern Research is hiring a pragmatic, senior modeler to lead applied modeling and experiment design work that uncovers how narratives and influence spread online.
Work remotely as a contract Senior Python Full-Stack Engineer to build scalable evaluation and data infrastructure powering model training, benchmarking, and quality assurance at Alignerr.
Parafin is hiring a Senior Software Engineer to lead development of its ML platform, enabling data scientists to ship production-quality models for underwriting and other ML products.
NomadicML is hiring a backend/infrastructure engineer to build cloud ingestion, multi-GPU inference pipelines, SDKs, and observability for its video intelligence platform.
WHOOP is hiring a Staff MLOps Platform Engineer to design, build, and operate scalable ML infrastructure that increases reliability, observability, and developer velocity across the company.
Pax Historia seeks a founding ML systems engineer in San Francisco to build production-grade infrastructure, evaluations, and model tuning that make their AI-driven game both higher-quality and more affordable.
Experienced ML engineering leader needed to drive PayPal's global personalization platform, leading teams that build real-time recommendation, ranking, and next-best-action systems at scale.
Work on core ML infrastructure: design and scale distributed training, inference, and cloud-native systems for an early-stage AI backend team based in San Francisco.
Lead development of production-grade ML models for road and lane perception using multi-modal sensors and BEV representations to improve autonomous vehicle navigation and safety.
Alignerr is hiring a Senior C++ Full-Stack Engineer to develop and optimize high-performance C++ systems and end-to-end tooling for AI data pipelines and evaluation workflows on a remote, part-time contract.
Lead WHOOP’s Sensor Intelligence engineering efforts to build and ship optimized embedded ML algorithms that run reliably on wearable devices.
Lead research and implementation of scalable, high-performance LLM inference algorithms and systems at NVIDIA to accelerate agentic AI workloads in datacenter environments.
Lead the development and deployment of road and lane detection ML models for autonomous vehicles, working with multi-modal sensors and production pipelines to improve navigation and safety.
Red Hat is hiring a PhD-level Machine Learning Systems Research Intern to advance model optimization and efficient inference techniques for open-source LLMs and vLLM.
Lead Etched's developer experience by defining documentation, SDKs, and workflows that help customers go from unboxing to production on our hardware-software inference platform.
Palo Alto Networks is hiring a Senior Staff AI Engineer to lead design and delivery of enterprise-grade AI/ML solutions and platform capabilities across the organization.
A hands-on ML/AI Engineer role to architect and productionize hybrid ML and LLM-driven systems that extract structured workflow understanding from noisy enterprise data at scale.
Gridware seeks a Senior Applied ML Scientist (ML + DSP) to develop and optimize resource-constrained, edge-deployable models for multimodal grid sensor time-series data.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 1