Senior Site Reliability Engineer, AI Infrastructure

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for over 30 years. It’s a unique legacy of innovation that’s fueled by phenomenal technology and outstanding people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world.

Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAn, you’ll be immersed in a diverse, encouraging environment where everyone is inspired to do their best work. Come join the AI Infrastructure Production engineering team and see how you can make a lasting impact on the world.
 

What You Will Be Doing:

  • Develop and maintain large-scale systems supporting critical use cases for AI Infrastructure, driving reliability, operability, and scalability across global public and private clouds.

  • Implement SRE fundamentals, including incident management, monitoring, and performance optimization, while designing automation tools to reduce manual processes and operational overhead.

  • Build tools and frameworks to improve observability, define actionable reliability metrics, and enable fast issue resolution, driving continuous improvement in system performance.

  • Establish frameworks for operational maturity, lead sustainable incident response protocols, and conduct blameless postmortems to improve team efficiency and system resilience.

  • Work with engineering teams to deliver innovative solutions, mentor peers, uphold high standards for code and infrastructure, and contribute to hiring for a diverse, high-performing team.
     

What We Need to See:

  • Degree in Computer Science or related field, or equivalent experience with 8+ years in Software Development, SRE, or Production Engineering.

  • Proficiency in Python and at least one other language (C/C++, Go, Perl, Ruby).

  • Expertise in systems engineering within Linux or Windows environments and cloud platforms (AWS, OCI, Azure, GCP).

  • Strong understanding of SRE principles, including error budgets, SLOs, and SLAs, as well as Infrastructure as Code tools (e.g., Terraform CDK).

  • Hands-on experience with observability platforms (e.g., ELK, Prometheus, Loki) and CI/CD systems (e.g., GitLab).

  • Strong communication skills with the ability to convey technical concepts effectively to diverse audiences.

  • Commitment to fostering a culture of diversity, curiosity, and continuous improvement.
     

Ways to stand out from the crowd:

  • Experience in AI training, inferencing, and data infrastructure services.

  • Proficiency in deep learning frameworks like PyTorch, TensorFlow, JAX, and Ray.

  • A strong background in hardware health monitoring and system reliability.

  • Hands-on expertise in operating and scaling distributed systems with stringent SLAs, ensuring high availability and performance.

  • Proven experience in incident, change, and problem management processes, fostering continuous improvement in sophisticated environments.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until August 31, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Average salary estimate

$270,250 per year (est.)
Min: $184,000
Max: $356,500


NVIDIA is a publicly traded, multinational technology company headquartered in Santa Clara, California. NVIDIA's invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, and ignited the era of modern AI.

CULTURE VALUES
Customer-Centric
Mission Driven
Inclusive & Diverse
Rise from Within
Diversity of Opinions
Work/Life Harmony
Growth & Learning
Transparent & Candid
BENEFITS & PERKS
Medical Insurance
Paid Time-Off
Maternity Leave
Mental Health Resources
Equity
Child Care stipend
Paternity Leave
WFH Reimbursements
Flex-Friendly
Dental Insurance
Vision Insurance
Life insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
401K Matching
Military leave
EMPLOYMENT TYPE
Full-time, onsite
DATE POSTED
August 28, 2025