We are seeking a highly skilled Principal Network Engineer to join our team and build the next generation of IT AI clusters. You will help lead the team through a major technology transformation toward running AI on-prem, building infrastructure that integrates enterprise-ready platforms on a solid foundation of automation. We are looking for a passionate engineer who will solve networking problems for scalable AI clusters.
This is a hands-on network engineering position focused on the architecture, design, development, and deployment of ultra-high-speed, resilient, and scalable DC AI clusters and interconnects for GPU-accelerated data centers and compute clusters. Outstanding problem-solving abilities, a comprehensive understanding of network security protocols and standards, routing, switching, and automation, and a deep understanding of fundamental network theory are also critical to your success at NVIDIA.
What you will be doing:
Lead the architecture, design, and deployment of global-scale DC interconnects and fabrics for HPC, AI, and GPU computing clusters.
Develop high-performance data center fabric using InfiniBand, Ultra Ethernet and related technologies.
Optimize carrier interconnects, intra- and inter-DC routing, and dark fiber deployments to ensure low latency and high reliability.
Partner with system, OS, GPU, and HPC teams to deliver scalable, highly available networks for extreme-performance workloads.
Implement network monitoring, telemetry, troubleshooting, and continuous performance improvement processes.
Drive technology selection, vendor engagement, and lifecycle management for Data Center hardware and software.
Collaborate with internal product managers to develop NVIDIA-on-NVIDIA solutions.
What we need to see:
MS or PhD in Electrical Engineering, Computer Science, Computer Engineering, Artificial Intelligence, Data Science, Mathematics, Statistics, or equivalent experience.
12+ years of experience building, managing, and supporting large-scale hybrid networks and developing automation pipelines with Python, Ruby, Go, or other languages used in infrastructure automation.
Expert in networking technologies: InfiniBand, Ultra Ethernet, RoCEv2, DCQCN, TCP/UDP, IPv4/IPv6, BGP/MP-BGP, VPN, L2 switching, EVPN, VXLAN, Segment Routing, MPLS.
Experience automating network infrastructure.
Experience using an automated configuration management system (Python, Terraform, Chef, Puppet, Ansible, Salt, etc.).
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables outstanding creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for exceptional people like you to help us accelerate the next wave of artificial intelligence.
#LI-Hybrid
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 248,000 USD - 391,000 USD. You will also be eligible for equity and benefits.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.