Poshmark is a leading fashion resale marketplace powered by a vibrant, highly engaged community of buyers and sellers and real-time social experiences. Designed to make online selling fun, more social and easier than ever, Poshmark empowers its sellers to turn their closet into a thriving business and share their style with the world. Since its founding in 2011, Poshmark has grown its community to over 130 million users and generated over $10 billion in GMV, helping sellers realize billions in earnings, delighting buyers with deals and one-of-a-kind items, and building a more sustainable future for fashion. For more information, please visit www.poshmark.com, and for company news, visit newsroom.poshmark.com.
We’re looking for an experienced Site Reliability Engineer to fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will also believe that automation is a key component to operating large-scale systems.
6-Month Accomplishments
Familiarize yourself with the Poshmark tech stack and functional requirements.
Get comfortable with the automation tools and frameworks used within the CloudOps organization and the associated deployment processes.
Gain in-depth knowledge of the relevant product functionality and the infrastructure it requires.
Start contributing by working on small- to medium-scale projects.
Shadow the on-call rotation as a secondary to become familiar with the on-call process.
12+ Month Accomplishments
Execute projects independently with little guidance from the lead.
Create meaningful alerts and dashboards for the various sub-systems involved in the targeted infrastructure.
Identify gaps in infrastructure and suggest or implement improvements.
Get involved in on-call rotation.
Responsibilities
Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our Internet-facing services.
Gain deep knowledge of our complex applications.
Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment.
Work closely with development teams to ensure that platforms are designed with operability in mind.
Function well in a fast-paced, rapidly-changing environment.
Participate in a 12x7 on-call rotation.
Desired Skills
3+ years of experience in a Systems Engineering/Site Reliability Operations role is required, ideally in a startup or fast-growing company.
3+ years in a UNIX-based large-scale web operations role.
3+ years of experience providing 12/7 support for large-scale production environments.
Battle-proven, real-life experience in running a large-scale production operation.
Experience working on cloud-based infrastructure (e.g., AWS, GCP, Azure).
Hands-on experience with continuous integration tools such as Jenkins, configuration management with Ansible, and systems monitoring and alerting with tools such as Nagios, New Relic, and Graphite.
Experience scripting/coding
Ability to use a wide variety of open source technologies and tools.
Technologies we use:
Ruby, JavaScript, Node.js, Tomcat, Nginx, HAProxy
MongoDB, RabbitMQ, Redis, Elasticsearch
Amazon Web Services (EC2, RDS, CloudFront, S3, etc.)
Terraform, Packer, Jenkins, Datadog, Kubernetes, Docker, Ansible and other DevOps tools.
Please note that Poshmark will not be able to sponsor work-related visas for this position.