NVIDIA is now looking for AI Software Engineers for our GenAI Frameworks (Megatron Core and NeMo Framework) team. Megatron Core and NeMo Framework are open-source, scalable and cloud-native frameworks built for researchers and developers working on Large Language Models (LLM) and Multimodal (MM) foundation model pretraining and post-training. Our GenAI Frameworks provide end-to-end model training, including pretraining, alignment, customization, evaluation, deployment and tooling to optimize performance and user experience.
In this critical role, you will expand the capabilities of Megatron Core and NeMo Framework, enabling users to develop, train, and optimize models. The work includes designing and implementing the latest distributed training algorithms, model parallel paradigms, and model optimizations; defining robust APIs; analyzing and tuning performance; and expanding our toolkits and libraries to be more comprehensive and coherent. You will collaborate with internal partners, users, and members of the open source community to analyze, design, and implement highly optimized solutions.
What you’ll be doing:
Design and develop the open source GenAI frameworks Megatron Core and NeMo Framework.
Solve large-scale, end-to-end AI training and inference challenges, spanning the full model lifecycle from initial orchestration, data pre-processing, and running of model training and tuning, to model deployment.
Work at the intersection of AI applications, libraries, frameworks, and the entire software stack.
Innovate and improve model architectures, distributed training algorithms, and model parallel paradigms.
Accelerate foundation model training and finetuning with mixed precision recipes and next-gen NVIDIA GPU architectures.
Tune and optimize the performance of deep learning frameworks and software components.
Research, prototype, and develop robust and scalable AI tools and pipelines.
What we need to see:
MS, PhD or equivalent experience in Computer Science, AI, Applied Math, or related fields and 5+ years of industry experience.
Experience with AI Frameworks (e.g. PyTorch, JAX), and/or inference and deployment environments (e.g. TRTLLM, vLLM, SGLang).
Proficient in Python programming, software design, debugging, performance analysis, test design and documentation.
Consistent record of working effectively across multiple engineering initiatives and improving AI libraries with new innovations.
Strong understanding of AI/Deep-Learning fundamentals and their practical applications.
Ways to stand out from the crowd:
Hands-on experience in large-scale AI training, with a deep understanding of core compute system concepts (such as latency/throughput bottlenecks, pipelining, and multiprocessing) and demonstrated excellence in related performance analysis and tuning.
Expertise in distributed computing, model parallelism, and mixed precision training.
Prior experience with Generative AI techniques applied to LLM and Multi-Modal learning (Text, Image, and Video).
Knowledge of GPU/CPU architecture and related numerical software.
Contributions to open source deep learning frameworks.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working with us. If you're creative and autonomous, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.