This is a hybrid role based in our Chicago office and will require you to be in person Tuesdays and Thursdays.
About the Team:
The Site Reliability Engineering (SRE) team at Grindr is responsible for ensuring our systems are stable, performant, and scalable as we continue to grow globally. This role reports directly to the Director of Technical Operations and plays a critical part in keeping our infrastructure running reliably while supporting both backend and operations teams. By driving improvements in automation, incident response, and performance optimization, this position ensures Grindr can deliver a safe, reliable, and seamless experience to millions of users worldwide. The team’s work directly impacts uptime, efficiency, and overall system resilience, supporting Grindr’s broader roadmap of building a secure and high-performing platform for the LGBTQ+ community.
About the Job:
Monitoring and Alerting: Set up and maintain monitoring systems to track the health and performance of applications and infrastructure. Create and manage alerting mechanisms to detect and respond to issues quickly.
Incident Response: Handle incidents and outages, working to resolve them swiftly and minimize downtime. Perform root cause analysis to prevent future occurrences and improve system resilience.
Automation: Develop tools and scripts to automate repetitive tasks, such as deployments, monitoring, and scaling, to increase efficiency and reduce human error.
Performance Optimization: Analyze system performance and identify bottlenecks or areas for improvement. Work with development teams to optimize code and infrastructure for better performance and resource utilization.
Capacity Planning: Plan for future growth by analyzing current usage trends and forecasting resource needs. Ensure that systems can handle increased load without compromising performance or reliability.
Service Level Objectives (SLOs) and Service Level Agreements (SLAs): Define and measure SLOs and SLAs to set expectations for system reliability and performance. Track these metrics and work to maintain or exceed the defined standards.
Incident Management and Postmortems: After incidents, conduct postmortems to document what went wrong, what was done to fix it, and how to prevent similar incidents in the future. This process supports continuous improvement and learning from failures.
Collaboration with Development Teams: Work closely with software developers to integrate reliability and performance into the development process. Provide guidance on best practices and assist with designing resilient systems.
Security and Compliance: Ensure that systems are secure and compliant with relevant regulations and standards. Implement security measures, monitor for vulnerabilities, and respond to security incidents.
Continuous Improvement: Continuously look for ways to improve system reliability, performance, and efficiency. Stay current with industry trends and advancements in order to adopt best practices and technologies.
Participate in an on-call rotation
Role Requirements:
5+ years of experience in site reliability, including incident response, incident management, automation, and performance optimization
5+ years of experience in cloud platforms (AWS preferred)
4+ years of experience working with DevOps technologies such as Docker, Kubernetes, Helm, and Terraform
4+ years developing and maintaining CI/CD pipelines
4+ years of experience using a scripting language such as Python or Bash
Experience coding in Kotlin or another JVM language is a plus
You May Thrive in this Role if You:
Technical Expertise:
Proficient in at least one programming language (e.g., Python, Go, Java).
Strong knowledge of Linux/Unix systems.
Experience with cloud platforms (e.g., AWS, GCP, Azure).
Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
Understanding of networking concepts and protocols.
Reliability Engineering:
Experience with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK stack).
Ability to implement and manage CI/CD pipelines.
Knowledge of infrastructure as code (e.g., Terraform, Ansible).
Proficiency in automated testing and deployment practices.
Understanding of SRE principles and practices, including SLAs, SLOs, and SLIs.
Security:
Knowledge of security best practices and compliance standards.
Experience with vulnerability assessment and mitigation.
Operational Excellence:
Proven track record of maintaining high availability and performance in production environments.
Experience with incident management and post-mortem analysis.
Ability to optimize system performance and resource utilization.
Benefits and Perks:
Mission and Impact: Grindr is building the global gayborhood in your pocket. Your role will impact the lives of millions of LGBTQ+ people around the world. Through our success, we are making a world where the lives of our community are free, equal, and just.
Family Insurance: Insurance premium coverage for health, dental, and vision for you and partial coverage for your dependents.
Retirement Savings: Generous 401(k) plan with 6% match and immediate vesting in the U.S.
Compensation: Industry-competitive compensation and eligibility for company bonus and equity programs.
Queer-Inclusive Benefits: Industry-leading gender-affirming offerings with up to 90% cost coverage, access to Included Health, monthly stipends for HRT, and more.
Additional Benefits: Flexible vacation policy, monthly stipends for cell phone, internet, wellness, food, and commuting, breakfast/lunch provided onsite, and yearly travel & leisure stipend.
About Grindr:
Grindr is building the global gayborhood in your pocket. With more than 14.5 million monthly active users, Grindr has become a fundamental part of the LGBTQ+ community and is charting a path to make the world more free, equal, and just. In 2015 we introduced Grindr for Equality, our in-house non-profit which has advanced safety, health, and human rights for millions of Grindr users and the global LGBTQ+ community in partnership with more than 100 community organizations in every region of the world.
Our next evolution is underway as a public company that continues to grow and build meaningful experiences for our users. From social issues to product innovations, we’re setting audacious goals for our community and the business, and leveraging the latest tech stacks and a culture of engineering excellence to make it happen. At the heart of our work in this new chapter is a shared set of operating principles centered around cultivating curiosity, thinking big, exploring the depths of AI, setting and expediting our ambitious goals, and growing through iteration; all while keeping our users #1.
Grindr is headquartered in West Hollywood, California, with offices in the Bay Area, Chicago, and New York. With a track record of strong financial performance and plans for continued headcount growth, we aim to build a workforce of talented, passionate, and open-minded individuals from different backgrounds, with different abilities, identities, and mindsets. Come be a part of this exciting journey to disrupt the consumer technology space, innovate products, and advance LGBTQ+ culture!
Grindr is an equal-opportunity employer.
To learn more about how we handle the personal data of applicants, visit our Employee and Candidate Privacy Policy.