About Anyscale:
At Anyscale, we're on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. We’re commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many more have Ray in their tech stacks to accelerate the progress of AI applications out into the real world.
With Anyscale, we’re building the best place to run Ray, so that any developer or data scientist can scale an ML application from their laptop to the cluster without needing to be a distributed systems expert.
We are proud to be backed by Andreessen Horowitz, NEA, and Addition, with $250+ million raised to date.
About the role:
We are looking for a Test Automation Software Engineer to join our Platform Infrastructure team. You’ll play a key role in ensuring the quality, reliability, and scalability of Anyscale’s distributed systems and developer platform.
In this role, you will design and build automated testing frameworks, integration systems, and test pipelines that validate the performance and resilience of our platform — from control plane orchestration to the execution of distributed AI workloads on Ray.
You will collaborate closely with infrastructure, cloud platform, and Ray open-source teams to ensure our systems meet the high reliability standards required for production-grade AI workloads.
Responsibilities:
Design and develop automated test frameworks for large-scale, distributed systems
Build end-to-end, performance, and integration tests for control and data plane components
Implement test pipelines to improve release velocity and reliability
Develop tools for cluster-level simulation, fault injection, and stress testing of Ray workloads
Collaborate with product and infrastructure engineers to ensure new features meet quality and scalability goals
Drive improvements in test coverage, observability, and reliability metrics
Support and optimize automated testing in Kubernetes, AWS, GCP, and Azure environments
Contribute to open-source Ray testing infrastructure and internal infinite laptop product quality
Participate in design and architecture discussions, code reviews, and system debugging
Qualifications:
Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
3+ years of experience in software test automation or infrastructure engineering
Strong coding skills in Python (Go experience a plus)
Experience building test frameworks or automation for distributed systems, microservices, or cloud-native applications
Familiarity with Kubernetes, Docker, and CI/CD systems (GitHub Actions, Jenkins, Buildkite, or similar)
Proficiency with cloud platforms (AWS, GCP, or Azure)
Knowledge of observability tools such as Prometheus, Grafana, or OpenTelemetry
Understanding of networking, security, and system reliability concepts
A passion for improving developer experience, product stability, and system performance
Why join us:
Collaborate with top engineers in distributed systems and AI infrastructure
Work on technology used by leading AI companies around the world
Make a meaningful impact on both open-source and enterprise-grade systems
Competitive compensation, equity, and benefits package
Anyscale Inc. is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law.
Anyscale Inc. is an E-Verify company. You may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.