Browse 28 Inference Engineer jobs hiring now. Companies hiring include NVIDIA, Rackspace, and Palo Alto Networks, with roles in New Orleans, Montgomery, and Dallas.
Work on NeMo Retriever to optimize and containerize LLM/MLLM models and build MLOps pipelines that deliver low-latency, production-grade inference for retrieval-augmented AI systems.
Work with Rackspace customers to deploy, optimize, and operationalize LLM/ML model-serving platforms in private and hybrid cloud environments to meet latency, throughput, security, and cost SLAs.
Palo Alto Networks is hiring a Senior Staff AI Engineer to lead design and delivery of enterprise-grade AI/ML solutions and platform capabilities across the organization.
A hands-on ML/AI Engineer role to architect and productionize hybrid ML and LLM-driven systems that extract structured workflow understanding from noisy enterprise data at scale.
ClarityPay is looking for a Senior Machine Learning Engineer to build and deploy Reinforcement Learning, bandit, and Bayesian optimization solutions that drive operational improvements in collections and offer optimization.
Experienced ML engineer needed to lead deployment, optimization, and scaling of computer vision models across cloud and edge environments for a fast-growing computer-vision platform.
High-impact backend engineer role building production ML/agent infrastructure and distributed systems to power AI-driven compliance for banks and fintechs.
Lead the design and delivery of secure, scalable, cloud-agnostic ML platform infrastructure and pipelines to enable reliable, explainable AI and analytics for public health at Altarum.
NVIDIA is hiring a Software Engineer, ML to optimize state-of-the-art ML training and inference across GPU hardware and software stacks.
High-impact ML Engineer role building search, ranking, and personalization systems for a fast-growing consumer shopping platform with strong user retention and competitive equity.
Lead the architecture and engineering of Amigo's backend platform to scale real-time inference, multi-LLM orchestration, and secure EHR integrations across millions of conversations.
Lead the ML stack as a founding Machine Learning Engineer at a stealth, self-funded AI group, defining models, training pipelines, and scalable inference for a global consumer product.
A founding AI/ML research engineer role to design and build core model, data, and inference systems for a stealth, high-impact consumer AI product backed by a profitable US$2B group.
Lead the technical design and implementation of A1’s foundational LLM systems—training pipelines, inference stacks, and deployment architecture—for a global consumer AI product.
Virtue AI seeks a Research Scientist Intern in San Francisco to develop and integrate cutting-edge agent and LLM security techniques, including red-teaming, guardrail models, and efficient inference methods.
Lead the architecture and delivery of Faire's machine-learning platform, building scalable feature stores, model serving, and inference infrastructure to power production ML across the marketplace.
Lead design and implementation of cloud-native, high-performance backend services and AI model-serving infrastructure for Palo Alto Networks' ATP Cloud team.
Work at the intersection of research and engineering to build scalable synthetic data pipelines that directly improve the quality and efficiency of Cohere's language models.
Gcore is hiring a seasoned Pre-Sales Engineer (Cloud & AI) to lead technical engagements, solution design, and customer success for GPU and cloud infrastructure across the Americas.
Senior Machine Learning Platform Engineer to design and optimize feature pipelines, distributed training, and low-latency inference systems for a remote US team building production ML infrastructure.
Lead development of high-performance, distributed LLM inference systems at Modular to enable fast, scalable, production-grade AI deployments.
Help design and operate scalable, multi-cloud LLM inference infrastructure at Modular as a Backend Engineer focused on distributed systems and ML inference.
Understood is hiring an Associate Director, Creative Operations Lead to drive data-informed creative and growth analytics, using modeling and experimentation to increase engagement and retention.
Samsara is hiring a Senior Machine Learning Engineer to build scalable ML infrastructure and end-to-end ML applications that power real-world IoT products and improve operational safety and efficiency.
Early-career ML Operations / Full Stack engineer to help design, deploy, and optimize scalable model serving and training infrastructure for Abridge’s AI-driven healthcare platform.
Build secure, scalable infrastructure and governance systems for enterprise AI agents as a Software Engineer on Rubrik's Agent Cloud team.
Coinbase is hiring a Machine Learning Platform Engineer to design and operate low‑latency inference, streaming pipelines, and distributed training infrastructure that powers fraud detection, personalization, and blockchain analysis.
Phare (part of R1) is hiring ML Engineers to build the internal training, benchmarking, and deployment infrastructure that turns research models into production-ready systems for healthcare revenue operations.