NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s an outstanding legacy of innovation driven by extraordinary technology and amazing people. NVIDIA is looking for a highly motivated Site Reliability Engineer (SRE) to join the NVIDIA AIR team, which builds the Digital Twin for Data Center Simulation web application. NVIDIA AIR enables cloud-scale efficiency by creating identical replicas of real-world data center infrastructure deployments. To learn more, visit NVIDIA AIR.
What you'll be doing:
Design, deploy, and manage IaaS platforms with a focus on high availability and performance.
Automate infrastructure operations using tools like Terraform, Ansible, and Python.
Focus on efficiency by automating repetitive workflows.
Develop monitoring and observability tooling to detect and prevent outages using Prometheus, Grafana, ELK, etc.
Deploy and troubleshoot non-disruptive cloud operations with an emphasis on secure production infrastructure.
Manage deployments and upgrades for operating systems, Kubernetes (k8s) clusters, and other orchestration tools.
Provide day-to-day support for engineering activities with CI/CD tools like Git and Jenkins.
Implement and enforce best practices around infrastructure security, access control, and operational efficiency.
What we need to see:
BS degree in Computer Science, Software Engineering, or a related field (or equivalent experience).
5+ years of experience in a Site Reliability, DevOps, or Systems Engineering role.
Strong automation and scripting skills in Ansible, Python, and Shell Scripting.
Experience in IaaS environments, including deploying, configuring, and administering Linux-based bare metal servers.
Deep experience in infrastructure engineering, focused on managing and monitoring a highly available production infrastructure.
Skilled in observability practices, using Prometheus, Grafana, ELK/EFK, and integrated alerting systems.
Solid grasp of Linux internals and core networking concepts including NAT, DNS, DHCP, routing, and firewall configuration with iptables or nftables.
Experience with modern deployment architecture for non-disruptive cloud operations, including blue-green and canary rollouts.
Proficiency in Kubernetes, Docker, QEMU, and Libvirt.
Ways to stand out from the crowd:
Hands-on expertise with AWS, including deploying complex, load-balanced, and highly available workloads.
Proficiency in debugging network issues in both infrastructure and SDN.
Experience with performance tuning and benchmarking across storage, compute, or networking.
Experience implementing robust metrics collection and alerting infrastructure.
Familiarity with compliance standards such as FedRAMP, HIPAA, and SOC 2.
With competitive salaries and a generous benefits package (www.nvidiabenefits.com), we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our best-in-class engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4. You will also be eligible for equity and benefits.
NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.