Senior Site Reliability Engineer

The Role

We are looking for an SRE to join our infrastructure team. This role will be responsible for ensuring the reliability of our back-end systems, working with engineers who develop them, and planning for our future growth. Our core infrastructure relies heavily on Kubernetes (K8s), Terraform, and GCP, but we care more about your ability to learn, adapt, and ship robust solutions than whether you've used these exact tools before.

You are a good fit if this describes you:

  • You possess a strong understanding of foundational cloud infrastructure (AWS/GCP/Azure) and Linux provisioning/management tools.

  • You know how to design for reliability and scale with minimal operational overhead.

  • You learn new technologies rapidly because you're excited by solving hard infrastructure challenges.

  • You've scaled infrastructure before and understand the tradeoffs that matter.

  • You think most infrastructure moves too slowly and could be way better automated and optimized.

  • You're comfortable diving into unfamiliar systems and making them work reliably.

  • You are a self-starter who executes quickly, takes ownership, and constantly seeks improvement.

What you'll do:

  • Develop and maintain our core Python platform for routing requests, orchestrating AI workloads, managing GPU server capacity, observability, and more.

  • Develop and maintain our infrastructure layer using Terraform and cloud provider APIs to manage our fleet of GPU workers across cloud and potentially bare metal environments.

  • Own and operate the technologies underpinning our platform, potentially including K8s, FluxCD, Nomad, Prometheus, Thanos, DataDog, Loki, distributed networking/storage, etc.

  • Architect and implement solutions that directly impact the performance and availability of services for millions of ComfyUI users.

  • Work closely with our core engineering team to design and build new infrastructure systems.

  • Help create the vision and lay the foundation for where our infrastructure should go in the next 1/2/5 years.

  • Help shape our technical direction and infrastructure best practices as we grow.

Requirements:

  • You have relevant experience as an SRE at a high-tech startup.

  • Experience participating in incident management processes.

  • Strong foundation and experience in managing cloud infrastructure (AWS, GCP, or Azure). Experience with bare metal is a plus.

  • Solid understanding of container orchestration (Kubernetes preferred) and CI/CD principles and tools.

  • Excellent communication skills.

  • Proven ability to learn fast and ship quality infrastructure code and configurations.

Nice to have:

  • You have excelled at a fast-paced, high-growth tech startup before or are extremely excited about being in one.

  • Experience specifically with GPU management, scheduling, and monitoring in a large-scale environment.

  • Experience with specific observability tools (DataDog).

What is ComfyUI?

ComfyUI, in its simplest form, is a node-based engine that enables users to generate visual content using AI. It is the most popular and powerful visual AI framework, with more than 60k GitHub stars, thousands of third-party extensions, and millions of users.

But this is not where it stops:

  • ComfyUI is and will continue to be a tool for artists of the future: a human equipped with AI who can be an order of magnitude more productive than before.

  • ComfyUI will continue to evolve to further democratize AI for creatives, hobbyists, storytellers, studios, developers, and enterprises.

  • It empowers those who were not trained with the power of the brush to also be painters, and those who were to be maestros.

  • It will become an Operating System for visual Gen AI.

  • It will unlock a world where no story is left untold.

  • Watch a short video on how the director Paul Trillo used ComfyUI to make an original animation - https://vimeo.com/1068496998?share=copy

About Us

We are a small, intense, and well-funded team in San Francisco that pushes ComfyUI and its ecosystem forward. Our team comes from Stability AI and Google, and many of us contributed to the ComfyUI ecosystem long before working here.

Our organization is flat and there is no hierarchy, only categories: dev, arts, prod, ops, etc. (And no, there is no one here with the title of Member of Technical Staff; it’s long and silly for a job title.)

The only thing that matters is the quality of your cultural fit and execution. We work hard and demand a lot of each other. But we have fun: everyone is here to make something meaningful that will end up being our life’s work. If this mission excites you and you view yourself as a top-tier talent, your future latent self is waiting for you at Comfy.

Check out our GitHub and blog for what we’ve been working on. Our investors include Pace Capital, Chemistry, Abstract Venture, and Guillermo Rauch.

Average salary estimate

$165,000 per year (est.), range $140,000 to $190,000

Employment type: Full-time
Date posted: July 30, 2025