Browse 21 exciting Quantization jobs open now. Companies hiring include NVIDIA, Ambient.ai, and Knowtex, with roles in locations such as St. Petersburg, Lexington-Fayette, and Cincinnati.
NVIDIA is looking for a Senior Software Engineer (Deep Learning) to build and optimize real-time video AI models and inference stacks for the Maxine and Broadcast platforms.
Ambient.ai is looking for an Applied Research Scientist to design, train, and optimize multimodal foundation models for computer vision and real-time security applications.
Drive next-generation medical speech recognition and clinical NLP at Knowtex as a Staff ML Engineer focused on production-grade models, low-latency inference, and clinical validation.
Samsara is hiring a Senior Machine Learning Engineer to develop and productionize optimized edge ML models that deliver real-time in-vehicle safety and driver experience features across a global fleet.
Lead the zero-to-one design and implementation of a high-throughput, low-latency LLM inference stack as an early engineering hire at an SF-based AI startup.
Senior technical lead for designing and shipping agentic LLM systems that combine advanced context grounding, policy optimization, and low-latency serving at scale for Gopuff’s Personal Superintelligence Lab.
Drive cutting-edge research and production implementation of agentic LLM systems at Gopuff, focusing on context grounding, alignment, and scalable low-latency inference for personalized shopping intelligence.
Experienced deep learning engineer needed to develop, optimize, and deploy perception DNNs for autonomous vehicle platforms, improving inference speed, accuracy, and efficiency.
Field AI is hiring an Agentic AI/ML Engineer to develop and deploy multimodal, vision-language, and agentic models that power real-world autonomous robots.
Drive extreme-performance LLM inference and industry benchmarking at NVIDIA by optimizing vLLM and MLPerf workloads on cutting-edge NVIDIA GPUs.
Lead technical developer advocacy for NVIDIA’s Physical AI and generative AI platforms, helping partners integrate world foundation models and acceleration technologies into production solutions.
ServiceNow is hiring a Staff Machine Learning Engineer to build and optimize low-latency, high-throughput AI inference systems using Python and Go to deliver LLM-powered features at enterprise scale.
Join Corvus Robotics as a Senior Computer Vision / Machine Learning Engineer to build and optimize 2D/3D perception models for production autonomous inventory-tracking drones.
Lead on-device inference optimization at NVIDIA to deliver high-performance, production-ready AI models for autonomous vehicle and real-time systems.
Lead the architecture and roadmap of Jiffy's core AI platform, driving model development, inference optimization, APIs, and agentic systems to power novel consumer and developer experiences.
NVIDIA is seeking a Senior Deep Learning Software Engineer to develop and optimize perception DNNs for autonomous vehicles and deploy them on NVIDIA GPUs and accelerators.
Specter is hiring an ML Infrastructure Engineer to design and scale training pipelines, optimized model serving, and continuous production workflows for real-time edge perception systems.
Lead high-impact, product-aligned experiments on foundation models using PyTorch and distributed training to improve real-world customer outcomes at Liquid AI.
Lead development and deployment of advanced computer vision and ML solutions for state-of-the-art imaging sensors at a mission-focused defense technology company.
Gimlet Labs is hiring a Software Engineer (AI Performance) to drive model and GPU-level performance improvements for production-scale inference in San Francisco.
BentoML seeks an Inference Optimization Engineer to accelerate LLM inference across GPUs and distributed serving stacks, reducing latency and GPU costs while contributing to open-source tooling.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 18