At Docker, we make app development easier so developers can focus on what matters. Our remote-first team spans the globe, united by a passion for innovation and great developer experiences. With over 20 million monthly users and 20 billion image pulls, Docker is the #1 tool for building, sharing, and running apps—trusted by startups and Fortune 100s alike. We’re growing fast and just getting started. Come join us for a whale of a ride!
We're seeking an experienced Director of Platform Infrastructure to lead critical aspects of Docker's infrastructure platform. In this role, you'll oversee the reliability, efficiency, and developer experience of foundational systems that power Docker's products and services. You'll lead teams responsible for site reliability engineering, cloud cost optimization, internal developer tooling, and core infrastructure components including our Envoy-based service mesh and networking layer.
This is a high-impact leadership role where you'll shape the technical strategy for platform infrastructure while building and mentoring world-class engineering teams. You'll work closely with engineering leaders across the organization to ensure Docker's infrastructure is reliable, cost-effective, and enables rapid product innovation.
Define and execute the technical vision and roadmap for platform infrastructure, balancing reliability, performance, cost, and developer velocity
Build, mentor, and retain high-performing teams across reliability engineering, FinOps, developer tooling, and foundational infrastructure
Partner with VP of Engineering and peer directors to align infrastructure investments with company priorities and product roadmaps
Establish engineering standards, best practices, and operational excellence across all platform infrastructure domains
Drive an organizational culture of ownership, continuous improvement, and technical excellence
Own end-to-end reliability strategy for Docker's production infrastructure, including SLO/SLI frameworks, incident management, and on-call processes
Lead efforts to improve system observability, monitoring, and alerting across the platform
Drive post-incident review processes and ensure learnings translate into systemic improvements
Partner with product engineering teams to build reliability into application architecture
Balance investment between reliability improvements and feature delivery
Develop and execute FinOps strategy to optimize Docker's cloud infrastructure spend across AWS, Azure, and GCP
Implement cost visibility, attribution, and chargeback mechanisms across engineering teams
Identify and execute on cost optimization opportunities including resource rightsizing, commitment management, and architectural efficiency
Build a culture of cost awareness across the engineering organization while maintaining performance and reliability standards
Partner with finance and business teams on cloud budget planning and forecasting
Own the strategy and execution for internal developer platforms and tooling that accelerate engineering productivity
Lead development and maintenance of CI/CD pipelines, build systems, and deployment automation
Champion developer experience improvements across the engineering organization
Build self-service capabilities that enable product teams to operate independently
Measure and improve developer productivity metrics and engineering velocity
Oversee critical infrastructure components including Envoy-based service mesh, API gateways, and networking infrastructure
Drive adoption of modern infrastructure patterns including service mesh, zero-trust networking, and platform engineering
Ensure foundational components are scalable, secure, and well-documented
Partner with security and compliance teams to ensure infrastructure meets regulatory and security requirements
Balance technical debt reduction with new capability development
Work closely with product engineering, security, data platform, and business operations teams
Serve as infrastructure subject matter expert in architecture reviews and technical planning
Communicate infrastructure strategy, roadmaps, and constraints to technical and non-technical stakeholders
Build strong relationships with vendor partners and open source communities
10+ years of experience in infrastructure engineering, SRE, or platform engineering roles
5+ years of engineering management experience, including leading managers and directors
Deep technical expertise in cloud infrastructure (AWS, Azure, or GCP), container orchestration (Kubernetes), and distributed systems
Proven track record building and scaling reliable, high-performance infrastructure at significant scale
Strong understanding of service mesh technologies (Envoy, Istio, Linkerd) and modern networking concepts
Experience implementing SRE practices, SLO/SLI frameworks, and incident management processes
Demonstrated success in cloud cost optimization and FinOps practices
Experience building developer platforms and internal tooling that improve engineering productivity
Strong technical communication skills with ability to influence across all levels of the organization
Track record of building diverse, inclusive, high-performing teams
Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
Experience in container and cloud-native technologies
Background in both startup and large-scale enterprise environments
Experience with infrastructure as code (Terraform, Pulumi) and GitOps practices
Knowledge of observability tools and practices (Prometheus, Grafana, Datadog, etc.)
Contributions to open source infrastructure projects
Experience with multi-cloud and hybrid cloud architectures
Understanding of security, compliance, and regulatory requirements for infrastructure
Track record of technical writing and public speaking
At Docker, you'll have the opportunity to work on infrastructure that directly impacts millions of developers globally. Our engineering culture values technical excellence, ownership, and continuous learning. We believe in empowering our teams to make decisions, take calculated risks, and learn from failures. You'll work alongside talented engineers who are passionate about building robust, scalable systems that enable the future of software development.
We use Covey as part of our hiring and/or promotional process for jobs in NYC, and certain features may qualify it as an AEDT (automated employment decision tool). As part of the evaluation process, we provide Covey with job requirements and candidate-submitted applications. We began using Covey Scout for Inbound on April 13, 2024.
Please see the independent bias audit report covering our use of Covey here.
Perks
Freedom & flexibility; fit your work around your life
Designated quarterly Whaleness Days
Home office setup; we want you comfortable while you work
16 weeks of paid parental leave
Technology stipend equivalent to $100 net/month
PTO plan that encourages you to take time to do the things you enjoy
Quarterly, company-wide hackathons
Training stipend for conferences, courses and classes
Equity; we are a growing start-up and want all employees to have a share in the success of the company
Docker Swag
Medical benefits, retirement and holidays vary by country
Docker embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better our company will be.
Due to the remote nature of this role, we are unable to provide visa sponsorship.