NVIDIA’s AI and GPU software is at the forefront of computing, fueling breakthroughs across deep learning, LLMs, and intelligent applications. Our team is building solutions for rapid development and deployment of GPU kernels for AI systems. We take the latest AI models, rigorously analyze them, develop and deploy the high-performance GPU kernels that define model performance, and integrate the derived techniques and methodologies into the tools that automate this process.
This role is a unique opportunity to shape the next generation of AI performance and efficiency. You will work hands-on with emerging AI models, collaborating across compiler, AI inference, and model performance teams. The focus is on building programming solutions that can be applied to concrete AI inference use cases to deliver real-world performance and development efficiency wins.
What you will be doing:
Analyze state-of-the-art AI models, identifying key performance bottlenecks and opportunities at the kernel level.
Develop, optimize, and evaluate both hand-tuned and compiler-generated kernels for inference workloads, balancing speed and flexibility.
Design and build high-level DSLs and innovative compiler infrastructure to increase kernel developer productivity while achieving near-peak performance.
Collaborate with model, AI inference, and compiler teams to iterate on kernel fusion, autotuning, and sophisticated GPU programming techniques.
Benchmark performance across real workloads, diagnose root causes, and rapidly deploy optimizations that maximize hardware utilization on NVIDIA platforms.
What we need to see:
Bachelor’s, master’s or PhD degree in Computer Science, Computer Engineering or related field, or equivalent experience.
3+ years of experience, with strong C++ and/or Python programming skills for systems and performance engineering.
Understanding of GPU architecture and proficiency in CUDA programming.
Intellectual curiosity and an interest in solving exciting problems and delivering practical results in production environments.
Ways to stand out from the crowd:
Experience designing, developing and optimizing high-efficiency GPU kernels for modern AI workloads.
Experience building compilers, domain-specific languages, or automatic optimization systems.
Familiarity with popular compiler, GPU programming, and AI frameworks such as MLIR, LLVM, PyTorch, XLA, Triton, or CUTLASS.
Experience with AI/ML inference workloads and model performance analysis.
Strong communication skills and ability to collaborate in a cross-team environment.
You will also be eligible for equity and benefits.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.