NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
We are seeking a dedicated Base Command Manager (BCM) Engineer to support product deployments/escalations and collaborate with Engineering and our Field Organization.
What you'll be doing
Play a key role in NVIDIA’s New Product Introduction (NPI) team, acting as the link between engineering and the NVIS field team for cluster deployment and management solutions.
Collaborate closely with engineering and product teams to review and influence design decisions for products centered around large-scale, BCM-managed clusters.
Evaluate changes in BCM and underlying OS/software stacks, communicating the impact to the field organization to maintain robust and scalable deployment workflows.
Define and relay detailed cluster management requirements to engineering, enabling the successful New Product Introduction (NPI) of next-generation GPU platforms.
Translate architectural and design changes into clear, actionable tasks for the field, including standardized deployment guides, standard configuration methodologies, and validation workflows.
Validate complex cluster configurations including Slurm and Kubernetes orchestrators for performance, scalability, and resilience, ensuring they meet real-world customer scenarios.
Support the NPI team by bridging knowledge gaps, tracking progress, and aligning collaborators throughout the product development lifecycle.
Support NVIDIA's mission by ensuring our breakthrough technologies are successfully deployed for global customers and OEM partners.
What we need to see
Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
10+ years of experience in at least two of the following: HPC/large-scale cluster administration, Linux systems engineering, infrastructure automation (e.g., Ansible, Salt), or data center operations.
5+ years of direct, hands-on experience provisioning, managing, and fixing clusters using NVIDIA Base Command Manager (BCM).
Deep, practical knowledge of how Slurm and Kubernetes are coordinated, deployed, and managed by BCM, including workload submission and resource management.
Proficiency in Python and Bash scripting for automation, cluster validation, and workflow optimization.
In-depth experience with cluster management and monitoring tools (e.g., Prometheus, Grafana, DCGM, and similar observability stacks).
Outstanding written and verbal communication skills, with the ability to explain complex technical concepts to both technical and non-technical collaborators.
A customer-first attitude, self-motivation, and a proactive approach to leadership in diverse environments.
Ways to stand out from the crowd:
Proficiency with cluster networking including InfiniBand and Spectrum-X.
Experience with NVIDIA Mission Control.
Familiarity with CI/CD workflows in an infrastructure context, including tools such as Git, GitLab, and Jenkins.
Background in Professional Services, customer-facing deployment, and solutions optimization.
Industry certifications such as CKA/CKAD (Certified Kubernetes Administrator/Developer), RHCE, or other advanced Linux/HPC credentials.
NVIDIA is widely considered one of the world's most desirable employers in technology. We have some of the world's most forward-thinking and passionate people working for us. If you're creative and autonomous, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 176,000 USD - 276,000 USD for Level 5, and 208,000 USD - 327,750 USD for Level 6. You will also be eligible for equity and benefits.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.