
Network Engineer, AI/ML Infrastructure

About The Role


We're seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You'll work at the cutting edge of network technology, managing InfiniBand and ultra-high-speed Ethernet fabrics that connect NVIDIA H100 and A100 GPUs, over 20 PB of Ceph storage, and hundreds of servers.


You'll be hands-on with the full lifecycle of our network infrastructure: planning, building, testing, deploying, and keeping everything running at peak performance. That means troubleshooting issues as they arise, monitoring network performance and throughput, developing automation to streamline operations, and working closely with HPC and ML teams to ensure they have the bandwidth they need. You'll also help us plan for future capacity and evaluate emerging network technologies as we scale to meet increasingly demanding workloads.



Responsibilities
  • Configure and maintain InfiniBand and high-speed Ethernet fabrics
  • Optimize network performance for RDMA and GPU-to-GPU communication
  • Manage network switches (Mellanox, NVIDIA, Micas Networks)
  • Troubleshoot network bottlenecks and latency issues
  • Plan and execute network upgrades and expansions
  • Implement network security (firewalls, VLANs, ACLs)
  • Collaborate on storage network optimization
  • Monitor infrastructure


Minimum Qualifications
  • 4+ years of network engineering experience in production environments
  • Strong understanding of L2/L3 networking protocols (TCP/IP, BGP, OSPF, VLANs)
  • Hands-on experience with high-speed networking (100Gb+ Ethernet and InfiniBand)
  • Hands-on experience with network security (firewalls, ACLs, network segmentation)
  • Knowledge of HPC network topologies
  • Experience with InfiniBand fabrics and related technologies, including RDMA, RoCE, and IPoIB
  • Strong troubleshooting and problem-solving skills


Preferred Qualifications
  • Experience in data center environments or AI/ML infrastructure
  • Hands-on experience with high-performance Ethernet switches (e.g., Broadcom Tomahawk) and the latest InfiniBand switches (e.g., NVIDIA/Mellanox)
  • Experience optimizing networks for GPU-to-GPU communication
  • Experience with open-source firewall solutions (OPNsense, pfSense, or similar)
  • Experience with network automation tools
  • Understanding of distributed storage networking (Ceph cluster networks)
  • Familiarity with network monitoring and observability tools (Prometheus, Grafana)
  • Knowledge of multi-site network connectivity and WAN optimization
  • Familiarity with cloud networking in at least one platform (AWS, GCP, or Azure) including VPC design, site-to-site VPN configuration, Direct Connect/ExpressRoute/Cloud Interconnect, hybrid cloud connectivity, and cloud-to-datacenter network integration


$150,000 - $250,000 a year

If you're a natural problem-solver with a passion for continuous learning, we'd love to hear from you.


We are transforming how stories are told, knowledge is learned, and insights are gathered.

Employment type: Full-time, onsite
Date posted: January 7, 2026