Browse 25 exciting jobs hiring in AI Inference now. Check out companies that are hiring, such as Palo Alto Networks, Gcore, and webAI, in Fayetteville, Garland, and Madison.
Lead design and implementation of cloud-native, high-performance backend services and AI model-serving infrastructure for Palo Alto Networks' ATP Cloud team.
Gcore is hiring a seasoned Pre-Sales Engineer (Cloud & AI) to lead technical engagements, solution design, and customer success for GPU and cloud infrastructure across the Americas.
Lead the strategy and delivery of distributed inference, LLM integrations, and on-device ML features at webAI to enable privacy-first, enterprise-grade AI on the edge.
Lead technical product strategy and execution for webAI’s distributed inference and on-device LLM platform, partnering closely with engineering and research to deliver enterprise-grade AI solutions.
Lead the design and production deployment of generative and multimodal computer vision systems at Nexxa.AI, translating ambiguous customer needs into robust, scalable AI solutions.
A PhD research intern opportunity to design and execute generative-AI-driven human-in-the-loop experiments that inform Toyota Research Institute's behavior change and carbon-neutrality programs.
Build secure, scalable infrastructure and governance systems for enterprise AI agents as a Software Engineer on Rubrik's Agent Cloud team.
Lead the next generation of AI-driven ranking and recommendation systems for LinkedIn's Feed to improve relevance, personalization, and member engagement at massive scale.
Lead the development and deployment of high-performance, real-time computer vision and multi-sensor AI for smart home devices at TP-Link Systems Inc.
Contribute to in-vehicle intelligence by building and deploying high-performance ML/DL models and MLOps pipelines for a leading automotive software platform.
Senior backend engineer role at Sprig to own and evolve large-scale data processing and AI inference systems that power product insights for leading companies.
Lead Developer Relations on the West Coast to grow Featherless’s open-model community, create technical demos and content, and represent the platform at events and hackathons.
Kilo seeks a technically fluent Senior Partnerships Manager to build and scale strategic relationships with model providers, infra partners, and devtool platforms for its open-source AI coding agent.
Join METR as a senior researcher to design and run experiments, build metrics, and analyze agent and human-subject data to advance rigorous AI capability measurement and risk assessment.
Contribute to Sprig’s AI-powered platform as a fullstack engineer focused on large-scale backend systems, distributed data workflows, and frontend integrations in a hybrid San Francisco role.
Lead end-to-end development of large-scale AI and deep learning solutions at Thomson Reuters Labs, driving production-grade LLM, retrieval, and data-pipeline capabilities across legal and news products.
Lead the Dynamo engineering team at NVIDIA to design, build, and operationalize high-performance, fault-tolerant LLM inference and GenAI serving infrastructure.
Work on the Platform Engineering team to design, build, and operate the multi-cloud platform and core systems that run Modular's AI inference services at scale.
Anduril is hiring a Software Engineer, AI in Reston to build, optimize, and deploy real-world ML/LLM systems that power mission-critical defense and intelligence capabilities.
Adobe is seeking 2026 Software Engineer Interns to design, develop, test, and deploy scalable services and features for Creative Cloud, Document Cloud, Experience Cloud, and Firefly in a co-located hybrid internship.
Lead the GenAI Platform engineering team at Abridge to design, deliver, and operate LLM workflows, agentic systems, and retrieval/evaluation infrastructure for clinical AI products.
Create high-quality, technically rigorous content about LLMs and AI infrastructure for developer and enterprise audiences in a remote, part-time contract role.
An engineer-focused, customer-facing role to architect, implement, and deploy production AI inference solutions on Baseten’s platform with hands-on coding and cross-functional ownership.
Capital One is hiring a Senior Lead AI Engineer to design and productionize foundational LLM, inference, and agentic AI systems that are scalable, cost-efficient, and responsible.
Help shape GPU-accelerated inference and AI infrastructure as a Spring intern working on CUDA, models, and scalable training/inference systems in San Francisco.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 2