Head of ML Cloud Platform

📍 San Francisco | Work Directly with CEO & founding team | Report to CEO | OpenAI for Physics | 🏢 5 Days Onsite

Location: Onsite in San Francisco
Compensation: Competitive Salary + Significant Equity

Who We Are

UniversalAGI is building OpenAI for Physics. We are an AI startup based in San Francisco, backed by Elad Gil (#1 Solo VC), Eric Schmidt (former Google CEO), Prith Banerjee (ANSYS CTO), Ion Stoica (Databricks Founder), Jared Kushner (former Senior Advisor to the President), David Patterson (Turing Award Winner), and Luis Videgaray (former Foreign and Finance Minister of Mexico). We're building foundation AI models for physics that enable end-to-end industrial automation, from initial design through optimization, validation, and production.

We're building a high-velocity team of relentless researchers and engineers that will define the next generation of AI for industrial engineering. If you're passionate about AI, physics, or the future of industrial innovation, we want to hear from you.

About the Role

As the Head of ML Cloud Platform, you'll be in the arena from day one, building and leading the team that creates the backbone for AI-powered physics simulation at scale. This is your chance to own the entire ML infrastructure vision—from training foundation models on petabytes of CFD data to deploying them into mission-critical automotive and maritime production environments.

You'll work directly with the CEO and founding team to build a world-class ML platform organization, recruiting exceptional engineers and researchers while remaining deeply technical yourself. You'll architect systems that train models faster, serve predictions with lower latency, and integrate seamlessly into customers' existing CAE workflows—all while managing a team that ships with the velocity of a startup and the rigor of enterprise infrastructure.

This isn't a pure management role. You're a technical leader who codes, debugs production incidents at 2 AM when needed, and earns respect through hands-on contribution while simultaneously building the team and culture that will scale our platform to serve the world's largest industrial companies.

What You'll Do

Technical Leadership & Architecture

  • Define the ML platform vision: Architect the end-to-end infrastructure strategy for training, fine-tuning, serving, and deploying foundation models for physics simulation across cloud and on-premise environments

  • Build for scale and reliability: Design systems that can handle petabyte-scale CFD datasets, multi-day distributed training runs, and real-time inference for customers making million-dollar engineering decisions

  • Stay hands-on: Write code, debug critical production issues, review pull requests, and make key architectural decisions yourself—you're a technical leader who leads by doing

  • Bridge research and production: Translate cutting-edge research from our deep learning team into production-grade infrastructure that customers can depend on

  • Integrate with CAE ecosystems: Ensure our platform works seamlessly with existing simulation tools (Ansys, OpenFOAM, STAR-CCM+), HPC clusters, PLM systems, and enterprise security requirements

Team Building & Management

  • Recruit world-class talent: Build a team of exceptional ML infrastructure engineers, cloud platform engineers, and MLOps specialists who can execute at the highest level

  • Develop and mentor: Coach engineers to grow technically and professionally, fostering a culture of deep work, technical excellence, and customer obsession

  • Scale the organization: Grow the team from founding engineers to a robust platform organization as we scale from early customers to enterprise deployments

  • Set technical standards: Establish engineering practices, code review processes, and quality bars that enable the team to ship fast without breaking things

  • Foster collaboration: Work closely with deep learning researchers, product engineers, CFD domain experts, and customer success to ensure platform capabilities align with company needs

Execution & Delivery

  • Ship relentlessly: Drive the team to deliver infrastructure from prototype to production in weeks, not quarters, iterating based on real customer feedback

  • Own reliability: Take responsibility for platform uptime, performance, and customer success—when things break, you're in the arena fixing them

  • Make strategic tradeoffs: Balance innovation with stability, speed with quality, and custom solutions with scalable platforms

  • Work with customers: Engage directly with automotive and maritime customers to understand their infrastructure requirements, security constraints, and deployment challenges

  • Build for enterprise: Implement security, compliance, monitoring, and operational practices that meet the standards of Fortune 500 companies

Qualifications

Required Experience

  • 8+ years in ML infrastructure or cloud platform engineering, with at least 3 years in technical leadership roles managing high-performing teams

  • Proven track record building and scaling ML platforms for training, serving, or deploying models in production environments, ideally at AI-first companies

  • Deep technical expertise in distributed training (PyTorch Distributed, DeepSpeed, Ray), cloud infrastructure (AWS/GCP/Azure), and container orchestration (Kubernetes, Docker)

  • Hands-on coding ability: Expert-level Python and infrastructure-as-code skills—you can still ship production code yourself and review your team's work deeply

  • Team building success: Track record of recruiting, developing, and retaining exceptional engineering talent, with experience building teams from 3-4 engineers to 15-20+

  • Strong product and customer intuition: Experience working closely with customers, understanding their workflows, and translating requirements into technical solutions

  • Outstanding execution velocity: Proven ability to ship infrastructure rapidly in fast-paced, high-growth environments while maintaining quality

Technical Requirements

  • ML infrastructure mastery: Deep understanding of training pipelines, model serving, distributed systems, GPU optimization, and the full ML lifecycle

  • Cloud platform expertise: Strong experience with cloud providers, infrastructure-as-code tools, and building hybrid cloud/on-premise solutions

  • System design excellence: Can architect complex, scalable systems and make smart tradeoff decisions under uncertainty

  • Performance optimization: Knowledge of GPU programming, model optimization techniques, and infrastructure cost management

  • Enterprise infrastructure: Experience with security, compliance, SSO, RBAC, and deploying into regulated or air-gapped environments

Leadership & Communication

  • Technical credibility: Earns respect through deep technical contribution, not just title or tenure

  • Clear communicator: Can explain complex technical decisions to customers, executives, researchers, and engineers at all levels

  • Strategic thinker: Balances short-term execution with long-term platform vision and architectural decisions

  • Player-coach mentality: Comfortable coding and debugging yourself while also managing, mentoring, and growing a team

  • High agency: Takes ownership of outcomes, doesn't wait for permission, and drives solutions to completion

Bonus Qualifications

  • Experience in industrial or scientific ML: Built infrastructure for physics simulation, computational chemistry, drug discovery, or other scientific computing domains

  • CAE/HPC background: Familiarity with simulation software, job schedulers (SLURM, PBS), parallel file systems, or high-performance computing environments

  • Founded or led platform teams at AI startups (Seed to Series B) through rapid growth and scaling challenges

  • Published or presented on ML infrastructure, distributed training, or MLOps topics at major conferences or venues

  • Experience with foundation models: Built infrastructure for training or serving large-scale pretrained models (LLMs, vision models, multimodal models)

  • Open-source contributions to major ML infrastructure projects (PyTorch, Ray, Kubernetes, MLflow, etc.)

  • PhD or MS in Computer Science, ML, or related field (or equivalent industry experience)

  • Enterprise B2B experience: Sold to or deployed infrastructure for Fortune 500 customers with complex security and compliance requirements

Cultural Fit

  • Technical Respect: Ability to earn respect through hands-on technical contribution, not just management authority

  • Intensity: Thrives in our unusually intense culture—willing to grind when needed and expects the same from your team

  • Customer Obsession: Passionate about solving real customer problems and building infrastructure that enables their success

  • Deep Work: Values long, uninterrupted periods of focused work and fosters this culture in your team

  • High Availability: Ready to be deeply involved whenever critical issues arise, whether that's at 2 AM or on weekends

  • Communication: Can translate complex technical concepts to diverse audiences and bridge engineering, research, and business

  • Growth Mindset: Embraces continuous learning and develops this mindset in your team

  • Startup Mindset: Comfortable with ambiguity, rapid change, and wearing multiple hats—you're a builder first, manager second

  • Work Ethic: Willing to put in the extra hours when needed to hit critical milestones and holds your team to high standards

  • Low Ego, High Accountability: Collaborative leadership style with focus on outcomes over personal credit

What We Offer

  • Build the foundation: Shape the ML platform strategy for a rapidly growing foundational AI company from the ground up

  • Real-world impact: See your infrastructure power physics simulations that optimize automotive aerodynamics, maritime vessel design, and other critical engineering applications

  • Direct CEO collaboration: Work closely with the founder & CEO, influence company strategy, and have your voice heard on major decisions

  • Exceptional team: Recruit and work with world-class deep learning researchers, CFD experts, and infrastructure engineers

  • Competitive compensation: Base salary + significant equity upside as a founding leadership hire

  • In-person culture: 5 days a week in office with a team that values face-to-face collaboration, deep technical discussions, and building together

  • World-class network: Access to our investors and advisors including Eric Schmidt, Elad Gil, Ion Stoica, David Patterson, and others

Benefits

  • Competitive compensation and equity

  • Competitive health, dental, and vision benefits paid by the company

  • 401(k) plan offering

  • Flexible vacation

  • Team Building & Fun Activities

  • Great scope, ownership and impact

  • AI tools stipend

  • Monthly commute stipend

  • Monthly wellness / fitness stipend

  • Daily office lunch & dinner covered by the company

  • Immigration support

How We're Different

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again... who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly." - Teddy Roosevelt

At our core, we believe in being "in the arena." We are builders, problem solvers, and risk-takers who show up every day ready to put in the work: to sweat, to struggle, and to push past our limits. We know that real progress comes with missteps, iteration, and resilience. We embrace that journey fully knowing that daring greatly is the only way to create something truly meaningful.

If you're ready to build the ML platform that will revolutionize physics simulation, lead a world-class team, and deliver transformative impact to industrial engineering, UniversalAGI is the place for you.

Average salary estimate: $275,000 per year (est.)
Range: $200,000 (min) to $350,000 (max)

If an employer mentions a salary or salary range on their job, we display it as an "Employer Estimate". If a job has no salary data, Rise displays an estimate if available.

Employment type: Full-time, onsite
Date posted: December 1, 2025