Build low-latency inference pipelines for on-device deployment, enabling real-time next-token and diffusion-based control loops in robotics
Design and optimize distributed inference systems on GPU clusters, pushing throughput with large-batch serving and efficient resource utilization
Implement efficient low-level code (CUDA, Triton, custom kernels) and integrate it seamlessly into high-level frameworks
Optimize workloads for both throughput (batching, scheduling, quantization) and latency (caching, memory management, graph compilation)
Develop monitoring and debugging tools to guarantee reliability, determinism, and rapid diagnosis of regressions across both the on-device and cluster stacks
8+ years of deep experience in distributed systems, ML infrastructure, or high-performance serving
Production-grade expertise in Python, with a strong background in systems languages (C++/Rust/Go)
Low-level performance mastery: CUDA, Triton, kernel optimization, quantization, memory and compute scheduling
Proven track record scaling inference workloads in both throughput-oriented cluster environments and latency-critical on-device deployments
System-level mindset with a history of tuning hardware–software interactions for maximum efficiency, throughput, and responsiveness