Twenty is seeking a Senior Data Engineer for an in-office position in its Arlington, VA office to architect and lead the development of data infrastructure that powers our cyber operations applications and capabilities. We're looking for someone with 8+ years of experience in data engineering and architecture, with mastery-level expertise in ETL pipeline development, data lake architecture, and schema design for complex datasets, plus proven leadership experience mentoring engineers and driving technical initiatives.

In this role, you'll architect scalable data lakes that aggregate cyber operation data from diverse sources, lead the design of sophisticated graph database schemas that capture relationships across cyber networks, personas, physical world entities, and electromagnetic spectrum data, establish best practices for AI/ML data preparation workflows, and mentor junior data engineers while driving technical excellence across the data platform. You'll join a world-class product and engineering team that delivers mission-critical solutions for U.S. national security, working in both cloud and on-premises environments to build data infrastructure that operates at machine speed.

If you're passionate about solving complex data architecture challenges while leading technical initiatives and making a direct impact on national security, we want to talk to you.
At Twenty, we're taking on one of the most critical challenges of our time: defending democracies in the digital age. We develop revolutionary technologies that operate at the intersection of cyber and electromagnetic domains, where the speed of operations exceeds human sensing and complexity transcends conventional boundaries. Our team doesn't just solve problems – we deliver game-changing outcomes that directly impact national security. We're pragmatic optimists who understand that while our mission of protecting America and its allies is challenging, success is possible.
Lead the design and architecture of enterprise-scale data platforms that support mission-critical cyber operations
Define technical vision and roadmap for data infrastructure, balancing operational needs with scalability and performance
Evaluate and recommend engineering courses of action for data platform enhancements and technology adoption
Drive technical decision-making for complex data architecture challenges across multiple systems and teams
Establish data engineering standards, best practices, and design patterns across the organization
Lead architecture review sessions and provide technical guidance on data-intensive projects
Architect and implement highly scalable, multi-petabyte data lake solutions on AWS to support our applications and cyber operations workflows
Design sophisticated data ingestion frameworks that collect and consolidate data from diverse sources including network traffic, threat intelligence feeds, sensor data, operational logs, and electromagnetic spectrum monitoring
Implement advanced data partitioning, compression, and storage optimization strategies to enable sub-second querying across massive datasets
Establish comprehensive data governance frameworks including data lineage, cataloging, metadata management, and quality monitoring
Design data mesh architectures that enable domain-oriented data ownership while maintaining central governance
Lead capacity planning and cost optimization efforts for petabyte-scale data infrastructure
Architect robust, fault-tolerant ETL pipelines that transform raw cyber operation data into structured formats for analysis at scale
Design complex data validation and quality assurance frameworks that ensure data integrity throughout multi-stage pipelines
Implement hybrid streaming and batch processing architectures to handle both real-time operational data and historical analysis
Lead development of sophisticated data enrichment processes that augment datasets with threat intelligence, geolocation data, temporal context, and multi-domain correlations
Design self-healing pipeline architectures with comprehensive error handling, retry logic, and monitoring systems
Drive performance optimization initiatives achieving significant improvements in throughput and latency
Lead the design and implementation of enterprise-grade graph database schemas that model complex relationships across multiple operational domains:
Cyber network infrastructure, communication patterns, and exploitation chains
Cyber personas, attribution chains, and threat actor relationships
Physical world entities, geospatial relationships, and cross-domain connections
Electromagnetic spectrum data including WiFi, IoT protocols, and RF signal patterns
Architect sophisticated data models that balance query performance, analytical flexibility, and storage efficiency
Design schema evolution strategies and implement zero-downtime migration frameworks
Develop comprehensive ontologies and taxonomies that enable consistent data representation across diverse intelligence datasets
Lead graph query optimization efforts and establish indexing strategies for complex traversal patterns
Mentor engineers on graph modeling best practices and advanced Cypher query techniques
Partner closely with backend engineers, data scientists, cyber operations experts, and forward-deployed analysts to understand evolving data requirements
Work with SRE teams to ensure data infrastructure reliability and operational excellence in secure government environments
Engage with government stakeholders to translate operational needs into technical data solutions
Provide technical leadership in customer engagements and capability demonstrations
Contribute to long-term technical strategy and roadmap planning for data platforms
8+ years of professional experience in data engineering, data architecture, or related roles with increasing technical leadership
Expert-level proficiency with AWS data services including S3, Athena, Kinesis, Lake Formation, and Redshift
Proven experience leading technical projects and mentoring junior data engineers
Advanced programming skills in Python and/or Golang for building production-grade data pipelines and frameworks
Extensive experience designing and implementing complex ETL pipelines using Apache Airflow or similar orchestration frameworks at scale
Deep expertise in data modeling techniques for both relational and NoSQL databases, including dimensional modeling and data vault methodologies
Mastery-level experience with graph databases (Neo4j, AWS Neptune, or similar) including advanced schema design, query optimization, and performance tuning using Cypher or Gremlin
Expert knowledge of big data processing frameworks such as Apache Spark, Flink, or similar technologies for petabyte-scale processing
Advanced SQL skills and proven experience with query optimization, indexing strategies, and performance tuning for massive datasets
Extensive experience with data lake architectures, modern data warehouse solutions (Snowflake, Redshift, Databricks), and lakehouse patterns
Deep understanding of data serialization formats (Parquet, Avro, ORC, JSON, Protocol Buffers) and optimization techniques
Expert knowledge of streaming data architectures, event-driven processing, and CDC (Change Data Capture) patterns
Advanced experience with containerization (Docker, Kubernetes) and infrastructure as code (Terraform, CloudFormation)
Understanding of distributed systems principles, consensus algorithms, and fault tolerance patterns
Deep understanding of data security best practices including encryption at rest and in transit, key management, and secure data sharing
Extensive experience implementing role-based access controls, data classification schemes, and audit logging
Knowledge of data privacy principles, compliance requirements for government systems, and secure data handling in classified environments
Understanding of zero-trust architecture principles and secure data pipeline design
Demonstrated experience mentoring data engineers and leading technical teams
Proven ability to organize development workflows, manage project delivery, and coordinate cross-functional initiatives
Strong communication skills with ability to explain complex technical concepts to diverse audiences including executives and government stakeholders
Experience conducting thorough code reviews and establishing data engineering standards
Track record of driving technical decision-making and architectural improvements
Bachelor's degree in Computer Science, Data Science, Information Systems, or related field; Master's degree preferred
Equivalent practical experience in lieu of formal education may be considered for exceptional candidates
Must be eligible to obtain a U.S. Government security clearance
Ability to work on-site in Arlington, VA with occasional travel to Fort Meade, MD
Previous experience as a technical lead or senior data engineer in government, defense, or intelligence applications
Track record of building data infrastructure for mission-critical systems with high availability requirements
Background in cybersecurity data analysis, threat intelligence platforms, or SIEM systems
Expert knowledge of graph analytics and network analysis for cyber operations and threat hunting
Deep understanding of cyber domain data including network flows, DNS logs, PCAP analysis, threat feeds, and vulnerability data
Expertise in geospatial data processing, analysis, and visualization
Experience with data mesh architectures, federated data systems, or multi-tenant data platforms
AWS certifications (Data Analytics, Solutions Architect, or similar) or other relevant data engineering certifications
Previous experience working with data scientists and ML engineers building production AI systems
Publications, conference talks, or recognized contributions to the data engineering community
Expert knowledge of DataOps practices and CI/CD for data pipelines
Advanced understanding of cost optimization strategies for cloud data infrastructure at scale
Experience with data visualization platforms and building executive-level analytics dashboards
Deep knowledge of message queue systems (NATS, Kafka, RabbitMQ, Amazon SQS/SNS)
Experience designing APIs for data access and integration (REST, GraphQL, gRPC)
Understanding of multi-cloud or hybrid cloud data architectures
Expertise with data observability platforms and lineage tracking tools (Monte Carlo, Datadog, Great Expectations)
Experience with column-oriented databases and analytical query engines
Knowledge of data compression algorithms and storage optimization techniques
Experience with real-time analytics and complex event processing systems