Browse 18 Model Serving jobs hiring now. Check out companies hiring, such as Fluency, POSH, and Sciforium, in Brownsville, Tulsa, and Houston.
A hands-on ML/AI Engineer role to architect and productionize hybrid ML and LLM-driven systems that extract structured workflow understanding from noisy enterprise data at scale.
Posh is hiring a Senior Software Engineer, Personalization to productionize and scale recommendation and classification systems that power discovery on its SoHo-based live-events platform.
Lead the architecture and hands-on development of Sciforium’s high-performance model serving platform, spanning GPU kernels, runtimes, distributed scheduling, and Python APIs to deliver low-latency multimodal inference.
Abridge is hiring a Head of AI Platform to lead the team building scalable, secure ML infrastructure and model-serving systems that power its generative-AI healthcare products.
Lead the design and deployment of highly available backend services and MLOps infrastructure to productionize ML models at Credit Genie.
Lead the architecture and delivery of Faire's machine-learning platform, building scalable feature stores, model serving, and inference infrastructure to power production ML across the marketplace.
Senior Backend Software Engineer to design and deploy scalable, highly available services and APIs that power Credit Genie's AI-driven products and model serving infrastructure.
Lead design and implementation of cloud-native, high-performance backend services and AI model-serving infrastructure for Palo Alto Networks' ATP Cloud team.
Build and scale the compute and infrastructure that powers Chai Discovery's next-generation AI drug design platform as a Software Engineer, Infrastructure.
Work on Prime Video's Catalog Platform to build scalable image/video processing pipelines and production ML model serving that power customer experiences across millions of viewers.
Lead the architecture and execution of a high-throughput, low-latency ML and simulations platform that enables large-scale model training, inference, and simulation-driven product development.
Lead the product direction for large-scale ML inference infrastructure, driving roadmap, customer-facing technical decisions, and delivery of reliable, high-throughput model serving solutions for a U.S.-remote team.
NVIDIA is hiring a Senior Full-Stack Software Engineer to build and operate a high-performance web platform for scenario configuration and large-scale synthetic data generation for autonomous driving.
Lead the design and production deployment of advanced AI/ML risk-detection systems as a Staff Machine Learning Engineer focused on protecting users and the business from fraud, scams, and account takeover.
Help architect and ship robust LLM integrations for Cohere’s North platform, collaborating closely with researchers and engineers to improve performance, latency, and reliability.
Lead the technical vision and hands-on implementation of a portable, production-grade Model-as-a-Service platform at NVIDIA, driving full-stack design, deployment, and quality for high-throughput model serving.
Build and scale Whatnot's ML infrastructure to productionize cutting-edge models, enable low-latency LLM serving, and support distributed training at consumer scale.
Senior Machine Learning Engineer to architect and deliver high-performance ML systems and GPU-level kernels for a fast-growing AI infrastructure company supported by AMD.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 16