We are now looking for a TensorRT-LLM Software Development Engineer!
NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning-powered AI, enabling breakthroughs in areas like LLMs, ChatGPT, and generative AI that have put DL at the “iPhone moment” for AI. Join the team building the inference software that is foundational to product lines within NVIDIA and across the industry! The ability to work on a fast-paced, delivery-focused team is required, and excellent interpersonal skills are a must.
What you'll be doing:
Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance
Perform benchmarking, profiling, and system-level programming for GPU applications.
Closely follow academic developments in the field of artificial intelligence and update TensorRT with new features
Provide code reviews, design docs, and tutorials to facilitate collaboration among the team.
Conduct unit tests and performance tests for different stages of the inference pipeline.
Collaborate across the company to guide the direction of machine learning inferencing, working with software, research and product teams
Write safe, scalable, modular, and high-quality (C++/Python) code for our core backend software for LLM inference.
Improve the usability of the TensorRT-LLM library and build systems (CMake)
What we need to see:
Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
4+ years of relevant software development experience.
Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design.
Strong curiosity about artificial intelligence and awareness of the latest developments in deep learning, such as LLMs and generative and recommender models
Experience working with deep learning frameworks like TensorFlow and PyTorch
Self-starter who consistently takes initiative to drive projects forward
Excellent written and oral communication skills in English
Ways to stand out from the crowd:
Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation
Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application
Architectural knowledge of CPU and GPU
GPU programming experience (CUDA or OpenCL)
NVIDIA is widely considered to be one of technology’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. Does the idea of contributing to and pushing the boundaries of state-of-the-art AI and Compute systems excite you? Interested in getting exposure to the entire DL SW stack? Come join us and help build the GPU-accelerated DL platform used worldwide.
#LI-Hybrid
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.