Browse 68 exciting Inference jobs hiring now. Companies hiring include Altarum, NVIDIA, and Fuku, with openings in Oakland, Richmond, and Laredo.
Lead the design and delivery of secure, scalable, cloud-agnostic ML platform infrastructure and pipelines to enable reliable, explainable AI and analytics for public health at Altarum.
NVIDIA is hiring a Software Engineer, ML to optimize state-of-the-art ML training and inference across GPU hardware and software stacks.
High-impact ML Engineer role building search, ranking, and personalization systems for a fast-growing consumer shopping platform with strong user retention and competitive equity.
Visa is hiring a Senior Director to lead global brand strength measurement, unifying perceptual and behavioral signals and building AI-driven insight systems that inform enterprise decisions.
Lead the architecture and engineering of Amigo's backend platform to scale real-time inference, multi-LLM orchestration, and secure EHR integrations across millions of conversations.
Lead the product vision and roadmap for Haus’s Attribution solutions, building the measurement engine marketers open each morning to drive decisions.
Experienced data scientist sought to lead product growth measurement, experimentation, and analytics at Khan Academy to improve student engagement and outcomes on a remote, 24-month fixed-term basis.
Lead platform-sourced incrementality experiments and partner with advertisers and ad platforms to turn measurement insights into scalable business outcomes for Haus.
Serve as the technical bridge between PEPR AI’s autonomous decision engine and client systems, owning integrations, experiments, and senior stakeholder relationships to drive measurable business outcomes.
NVIDIA is hiring a Senior Solutions Architect to design and deliver AI-accelerated CDN and telco solutions that integrate GPUs, edge inference, Kubernetes, and CDN platforms for low-latency, scalable deployments.
Lead the architecture and hands-on development of Sciforium’s high-performance model serving platform, spanning GPU kernels, runtimes, distributed scheduling, and Python APIs to deliver low-latency multimodal inference.
Lead experimentation, trace analysis, and metric design to measure and improve Replit's AI agent, converting agent traces into product-changing insights for engineering and leadership.
Airwallex is hiring a Senior Data Scientist to lead Marketing analytics—building predictive, causal and attribution models to shape go-to-market strategy as part of the SF-based data science team.
Senior Data Scientist to own end-to-end finance and GTM analytics—turning revenue, pipeline, and customer lifecycle data into decision-ready signals and executive-grade insights.
Lead the ML stack as a founding Machine Learning Engineer at a stealth, self-funded AI group, defining models, training pipelines, and scalable inference for a global consumer product.
A founding AI/ML research engineer role to design and build core model, data, and inference systems for a stealth, high-impact consumer AI product backed by a profitable US$2B group.
Lead the technical design and implementation of A1’s foundational LLM systems—training pipelines, inference stacks, and deployment architecture—for a global consumer AI product.
Virtue AI seeks a Research Scientist Intern in San Francisco to develop and integrate cutting-edge agent and LLM security techniques, including red-teaming, guardrail models, and efficient inference methods.
Lead a high-impact engineering team at Verkada responsible for embedded Linux camera software, cloud integration, video streaming, and production AI features while driving roadmap, quality, and hiring.
Lead the architecture and delivery of Faire's machine-learning platform, building scalable feature stores, model serving, and inference infrastructure to power production ML across the marketplace.
NVIDIA seeks an entry-level Deep Learning Software Engineer to help optimize and ship GPU-accelerated inference software for LLMs and generative AI.
Build and productionize multi-step AI agents and the backend infrastructure that powers PermitFlow’s pre-construction platform in a fast-moving, hybrid NYC startup.
Lead the applied LLM systems effort at Plaud to design reasoning pipelines, productionize RAG and memory features, and optimize model inference for reliable, user-centered AI experiences.
Lead the design, training, and production deployment of ASR, TTS, and Speech LLM systems at OutcomesAI to power HIPAA-compliant voice agents in clinical settings.
Senior Data Scientist needed to lead ML/AI projects, productionize models, and mentor teams at a remote-first US company represented by Jobgether.
Lead design and implementation of cloud-native, high-performance backend services and AI model-serving infrastructure for Palo Alto Networks' ATP Cloud team.
Snap Inc. is hiring a Data Scientist to turn large-scale product data into rigorous insights that drive product and business decisions.
Work at the intersection of research and engineering to build scalable synthetic data pipelines that directly improve the quality and efficiency of Cohere's language models.
Gcore is hiring a seasoned Pre-Sales Engineer (Cloud & AI) to lead technical engagements, solution design, and customer success for GPU and cloud infrastructure across the Americas.
Senior Machine Learning Platform Engineer to design and optimize feature pipelines, distributed training, and low-latency inference systems for a remote US team building production ML infrastructure.
Lead the design and deployment of enterprise optimization and ML solutions for Gap Inc.'s Product-to-Market operations, driving measurable impact across pricing, inventory, and assortment.
Lead a high-impact consumer data science organization that shapes product, marketing, ads, and business strategy through advanced modeling, experimentation, and analytics for a major US consumer platform.
Khan Academy is hiring a Senior Data Scientist to apply causal inference, experimentation, and advanced analytics to improve learner engagement and outcomes across their platform.
Lead the strategy and delivery of distributed inference, LLM integrations, and on-device ML features at webAI to enable privacy-first, enterprise-grade AI on the edge.
Lead the product direction for large-scale ML inference infrastructure, driving roadmap, customer-facing technical decisions, and delivery of reliable, high-throughput model serving solutions for a U.S.-remote team.
Lead development of high-performance, distributed LLM inference systems at Modular to enable fast, scalable, production-grade AI deployments.
Help design and operate scalable, multi-cloud LLM inference infrastructure at Modular as a Backend Engineer focused on distributed systems and ML inference.
Lead technical product strategy and execution for webAI’s distributed inference and on-device LLM platform, partnering closely with engineering and research to deliver enterprise-grade AI solutions.
Understood is hiring an Associate Director, Creative Operations Lead to drive data-informed creative and growth analytics, using modeling and experimentation to increase engagement and retention.
Lead the design and production deployment of generative and multimodal computer vision systems at Nexxa.AI, translating ambiguous customer needs into robust, scalable AI solutions.
Samsara is hiring a Senior Machine Learning Engineer to build scalable ML infrastructure and end-to-end ML applications that power real-world IoT products and improve operational safety and efficiency.
Lead and develop the Enterprise Measurement Strategy team to drive incrementality measurement, experimentation roadmaps, and actionable insights for Haus’s enterprise customers.
A PhD research intern opportunity to design and execute generative-AI-driven human-in-the-loop experiments that inform Toyota Research Institute's behavior change and carbon-neutrality programs.
Early-career ML Operations / Full Stack engineer to help design, deploy, and optimize scalable model serving and training infrastructure for Abridge’s AI-driven healthcare platform.
Senior Software Developer to drive low-level, high-performance AI networking and inference infrastructure using C/C++/Rust, GPU kernels and RDMA at NVIDIA.
Gridware is hiring a Data Analyst to lead fleet change management and regression-based impact analysis to keep thousands of IoT devices healthy, performant, and scalable.
Build secure, scalable infrastructure and governance systems for enterprise AI agents as a Software Engineer on Rubrik's Agent Cloud team.
Lead the next generation of AI-driven ranking and recommendation systems for LinkedIn's Feed to improve relevance, personalization, and member engagement at massive scale.
Lead the development and deployment of high-performance, real-time computer vision and multi-sensor AI for smart home devices at TP-Link Systems Inc.
d-Matrix is hiring a Senior Staff ML Researcher to develop and implement algorithmic and numerical techniques that optimize LLM inference on next-generation DNN accelerators at its Santa Clara hybrid headquarters.