Job details

Senior On-Device Model Inference Optimization Engineer

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are seeking a highly skilled Senior On-Device Model Inference Optimization Engineer to join our team and lead efforts to improve the performance and efficiency of the AI models that enable the next generation of autonomous vehicle technology at NVIDIA!

What you'll be doing:

  • Develop and implement strategies to optimize AI model inference for on-device deployment.

  • Employ techniques like pruning, quantization, and knowledge distillation to minimize model size and computational demands.

  • Optimize performance-critical components using CUDA and C++.

  • Collaborate with multi-functional teams to align optimization efforts with hardware capabilities and deployment needs.

  • Benchmark inference performance, identify bottlenecks, and implement solutions.

  • Research and apply innovative methods for inference optimization.

  • Adapt models for diverse hardware platforms and operating systems with varying capabilities.

  • Create tools to validate the accuracy and latency of deployed models at scale with minimal friction.

  • Recommend and implement model architecture changes to improve the accuracy-latency balance.

What we need to see:

  • MSc or PhD in Computer Science, Engineering, or a related field, or equivalent experience.

  • Over 10 years of confirmed experience specializing in model inference and optimization.

  • Expertise in modern machine learning frameworks, particularly PyTorch, ONNX, and TensorRT.

  • Proven experience in optimizing inference for transformer and convolutional architectures.

  • Strong programming proficiency in CUDA, Python, and C++.

  • In-depth knowledge of optimization techniques, including quantization, pruning, distillation, and hardware-aware neural architecture search.

  • Skilled in building and deploying scalable, cloud-based inference systems.

  • Passionate about developing efficient, production-ready solutions with a strong focus on code quality and performance.

  • Meticulous attention to detail, ensuring precision and reliability in safety-critical systems.

  • Strong collaboration and communication skills for working effectively across multidisciplinary teams.

Ways to stand out from the crowd:

  • Publications or industry experience in optimizing and deploying model inference at scale.

  • Hands-on expertise in hardware-aware optimizations and accelerators such as GPUs, TPUs, or custom ASICs.

  • Active contributions to open-source projects focused on inference optimization or machine learning frameworks.

  • Experience in designing and deploying inference pipelines for real-time or autonomous systems.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until October 10, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.



NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.

BADGES
Changemaker
Diversity Champion
Family Friendly
Global Citizen
Work & Life Balance
CULTURE VALUES
Customer-Centric
Mission Driven
Inclusive & Diverse
Rise from Within
Diversity of Opinions
Work/Life Harmony
Growth & Learning
Transparent & Candid
BENEFITS & PERKS
Medical Insurance
Paid Time-Off
Maternity Leave
Mental Health Resources
Equity
Child Care stipend
Paternity Leave
WFH Reimbursements
Flex-Friendly
Dental Insurance
Vision Insurance
Life insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
401K Matching
Military leave
EMPLOYMENT TYPE
Full-time, onsite
DATE POSTED
October 7, 2025