Job details

Senior Software Engineer, AI Systems - vLLM and MLPerf

We are seeking highly skilled and motivated software engineers to join our vLLM & MLPerf team. You will define and build benchmarks for MLPerf Inference, the industry-leading benchmark suite for system-level inference performance. You will also contribute to vLLM and push its performance on those benchmarks as far as possible on NVIDIA's latest GPUs.

What you’ll be doing:

Design and implement highly efficient inference systems for large-scale deployments of generative AI models.
Define inference benchmarking methodologies and build tools that will be embraced across the industry.
Develop, profile, debug, and optimize low-level system components and algorithms to improve throughput and latency for the MLPerf Inference benchmarks on the newest NVIDIA GPUs.
Productionize inference systems with uncompromised software quality.
Collaborate with researchers and engineers to productionize trending model architectures, inference techniques and quantization methods.
Contribute to the design of APIs, abstractions, and UX that make it easier to scale model deployment while maintaining usability and flexibility.
Participate in design discussions, code reviews, and technical planning to ensure the product aligns with the business goals.
Stay up to date with the latest advancements in system-level inference optimization, develop novel research ideas, and translate them into practical, robust systems. Exploration and academic publication are encouraged.

What we need to see:

Bachelor’s, Master’s, or PhD degree in Computer Science/Engineering, Software Engineering, a related field, or equivalent experience.
5+ years of experience in software development, preferably with Python and C++.
Deep understanding of deep learning algorithms, distributed systems, parallel computing, and high-performance computing principles.
Hands-on experience with ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).
Experience optimizing compute, memory, and communication performance for large-model deployments.
Familiarity with GPU programming, CUDA, NCCL, and performance profiling tools.
Ability to work closely with both research and engineering teams, generating novel research ideas and translating pioneering research into concrete designs and robust code.
Excellent problem-solving skills, with the ability to debug sophisticated systems.
A passion for building high-impact software that pushes the boundaries of what’s possible with large-scale AI.

Ways to stand out from the crowd:

Background in building and optimizing LLM inference engines such as vLLM and SGLang.
Experience building ML compilers such as Triton and TorchDynamo/TorchInductor.
Experience working with cloud platforms (e.g., AWS, GCP, or Azure), containerization tools (e.g., Docker), and orchestration infrastructures (e.g., Kubernetes, Slurm).
Exposure to DevOps practices, CI/CD pipelines, and infrastructure as code.
Contributions to open-source projects (please provide a list of the GitHub PRs you submitted).

At NVIDIA, we believe artificial intelligence (AI) will fundamentally transform how people live and work. Our mission is to advance AI research and development to create groundbreaking technologies that enable anyone to harness the power of AI and benefit from its potential. Our team consists of experts in AI, systems and performance optimization. Our leadership includes world-renowned experts in AI systems who have received multiple academic and industry research awards.

If you've hacked on the inner workings of PyTorch, written many CUDA/HIP kernels, developed and optimized inference services or training workloads, built and maintained large-scale Kubernetes clusters, or simply enjoy solving hard problems, feel free to drop an application!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until October 12, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA

NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.
