Browse 4 exciting TensorRT-LLM jobs hiring now. Check out companies hiring, such as NVIDIA, FM, and USAA, in Minneapolis, St. Paul, and Corpus Christi.
Work on NVIDIA's TensorRT team to design and optimize high-performance inference software in C++, Python, and CUDA that enables state-of-the-art LLMs and generative AI on NVIDIA GPUs.
Lead the Triton Inference Server engineering team at NVIDIA to deliver high-performance, scalable model-serving solutions for cloud and on-premises AI deployments.
Lead the end-to-end architecture and technical strategy for the NIM Factory to deliver enterprise-grade, GPU-accelerated inference services at scale.
Lead a high-impact team accelerating LLM inference performance at NVIDIA by combining deep systems expertise, GPU profiling, and cross-functional collaboration.
Salary breakdown: Below 50k*: 0 | 50k–100k*: 0 | Over 100k*: 1