About Anyscale:
At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many more have Ray in their tech stacks to accelerate the progress of AI applications into the real world.
With Anyscale, we’re building the best place to run Ray, so that any developer or data scientist can scale an ML application from their laptop to the cluster without needing to be a distributed systems expert.
We are proud to be backed by Andreessen Horowitz, NEA, and Addition, with $250+ million raised to date.
About the role:
Ray aims to provide a universal API for building distributed applications (e.g. a machine learning pipeline of feature engineering, model training, and evaluation). Data is usually a core element connecting these different stages, and therefore plays a critical role in Ray’s usability, performance, and stability. We are looking for strong engineers to build, optimize, and scale Ray’s Datasets library and data processing capabilities in general.
About the Ray Data team:
The Ray Data team currently develops and maintains the Ray Datasets library, which already powers critical production use cases (e.g. large-scale data compaction at Amazon and ML pipelines at Alibaba). Ray Datasets is a Python library built on top of Apache Arrow and Ray Core (Ray’s C++ backend), and the Ray Data team interacts closely with Ray Core components, including the scheduler and the memory and I/O subsystems. The Ray Data team also works closely with Ray’s ML libraries, including Train, RLlib, and Serve.
A snapshot of projects you will work on:
- Performance of Ray Datasets at large scale (leveraging Arrow primitives, optimizing the Ray object manager, etc.)
- Integration with ML training and data sources
- Stability and stress testing infrastructure
- Leading future work on integrating streaming workloads, such as Beam on Ray, into Ray
- Differentiating data operations in the Anyscale-hosted Ray service
As part of this role, you will:
- Develop high-quality open-source software to simplify distributed programming (Ray)
- Identify, implement, and evaluate architectural improvements to Ray Core and Datasets
- Improve the testing process for Ray to make releases as smooth as possible
- Communicate your work to a broader audience through talks, tutorials, and blog posts
We'd love to hear from you if you have:
- At least 5 years of relevant work experience
- A solid background in algorithms, data structures, and system design
- Experience building scalable and fault-tolerant distributed systems
- Experience with data processing and database internals, including Spark or Dask (streaming is a plus)
Anyscale Inc. is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law.
Anyscale Inc. is an E-Verify company, and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.