Job details

Software Engineer, Ray Data

About Anyscale:

At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many more have Ray in their tech stacks to accelerate the progress of AI applications into the real world.

With Anyscale, we’re building the best place to run Ray, so that any developer or data scientist can scale an ML application from their laptop to the cluster without needing to be a distributed systems expert.

We are proud to be backed by Andreessen Horowitz, NEA, and Addition, with $250+ million raised to date.

About the role:

Ray aims to provide a universal API for building distributed applications (e.g. a machine learning pipeline of feature engineering, model training, and evaluation). Data is usually a core element connecting these different stages, and therefore plays a critical role in Ray’s usability, performance, and stability. We are looking for strong engineers to build, optimize, and scale Ray’s Datasets library and data processing capabilities in general.

About the Ray Data team:

The Ray Data team currently develops and maintains the Ray Datasets library, which is already powering critical production use cases (e.g. large-scale data compaction at Amazon and ML pipelines at Alibaba). Ray Datasets is a Python library built on top of Apache Arrow and Ray Core (Ray’s C++ backend), and the Ray Data team interacts closely with Ray Core components, including the scheduler and the memory and I/O subsystems. The Ray Data team also works closely with Ray’s ML libraries, including Train, RLlib, and Serve.

A snapshot of projects you will work on:

- Performance of Ray Datasets at large scale (leveraging Arrow primitives, optimizing the Ray object manager, etc.)

- Integration with ML training and data sources

- Stability and stress testing infrastructure

- Leading future work on integrating streaming workloads into Ray, such as Beam on Ray

- Differentiating Data operations in the Anyscale-hosted Ray service

As part of this role, you will:

- Develop high-quality open-source software to simplify distributed programming (Ray)
- Identify, implement, and evaluate architectural improvements to Ray Core and Datasets
- Improve the testing process for Ray to make releases as smooth as possible
- Communicate your work to a broader audience through talks, tutorials, and blog posts

We'd love to hear from you if you have:

- At least 5 years of relevant work experience
- Solid background in algorithms, data structures, and system design
- Experience in building scalable and fault-tolerant distributed systems
- Experience with data processing and database internals, including Spark or Dask (streaming is a plus)

Anyscale Inc. is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law.

Anyscale Inc. is an E-Verify company, and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.


Average salary estimate

$195,000 / year (est.)

Min: $160,000
Max: $230,000

