
Research Engineer, Infrastructure (RL & Numerics)

About the Role

As a Research Engineer at humans&, you'll design, build, and optimize the core systems that power large-scale reinforcement learning and model training. You'll work at the intersection of research and infrastructure, building the foundation that enables our researchers to push the boundaries of AI capabilities.

Your work will directly enable breakthroughs in AI by making experimentation and training fast, reliable, and scalable. This role is ideal for someone who blends deep systems expertise with curiosity for machine learning at scale—a builder who understands both the math of optimization and the realities of distributed compute.

We're hiring multiple engineers for this team.

What You'll Do

  • Design and build scalable infrastructure for reinforcement learning workloads

  • Optimize the numerical foundations of our distributed training stack, including precision formats, kernel optimizations, and communication frameworks

  • Improve the performance, stability, and reproducibility of training large models

  • Debug complex issues at the intersection of ML and systems—from diagnosing cluster failures to fixing regressions in data pipelines

  • Collaborate closely with researchers to accelerate experiments, develop new capabilities, and ensure every GPU cycle drives scientific progress

  • Build tools and abstractions that improve research velocity and enable our team to focus on science rather than system bottlenecks

What We're Looking For

  • Strong software engineering skills with the ability to write performant, maintainable code and debug complex codebases

  • Proficiency in Python and a deep understanding of deep learning frameworks (PyTorch, JAX) and their underlying system architectures

  • Experience with distributed systems and large-scale computing

  • A bias for action—comfortable working across different stacks and teams to make sure things ship

Highly valued:

  • Experience building production training systems on many GPUs

  • Experience with reinforcement learning infrastructure and training pipelines

  • Background in floating-point numerics, low-precision arithmetic, and numerical optimization

  • Familiarity with distributed training frameworks (DeepSpeed, Megatron-LM, XLA) and cluster orchestration (Kubernetes, SLURM, Ray)

  • Track record of improving research productivity through infrastructure design

  • Contributions to open-source ML infrastructure

Average salary estimate

$235,000 per year (est.)
Min: $170,000
Max: $300,000

Employment type: Full-time, hybrid
Date posted: December 11, 2025