Browse 44 exciting LLM Testing jobs hiring now. Check out companies hiring, such as Constructor, Jobgether, and HackerOne, in Kansas City, Mobile, and Orlando.
Work remotely as a backend engineer building and maintaining partner integrations for Constructor’s scalable AI-powered e-commerce search platform (GMT-3 to GMT+3).
Velocity Black is hiring a Data Science Manager to lead development and production of scalable personalization and recommendation models that improve member experiences.
Lead design for HackerOne’s Pentest as a Service and AI Red Teaming products, owning research, interaction design, and the implementation of AI-driven workflows across a distributed product team.
Lead and scale a growth-focused data science and AI organization to build experimentation, personalization, and ML-driven products that unlock step-change user and revenue growth at Airwallex.
Lead high-impact, product-aligned experiments on foundation models using PyTorch and distributed training to improve real-world customer outcomes at Liquid AI.
GC AI is hiring a Growth Engineer to architect and ship growth systems that accelerate adoption of its legal AI platform through experimentation, analytics, and automation.
As a Data Scientist on OpenAI’s Platform team, you will define and operationalize metrics, run experiments, and deliver insights that drive developer adoption and enterprise value for the API and platform.
Horizon3.ai seeks an Applied AI Engineer to deploy and operationalize AI-driven offensive security capabilities that surface vulnerabilities across AI and traditional systems.
Help define and accelerate product-market fit for OpenAI’s Codex by measuring developer productivity, designing experiments, and informing model and product improvements.
Lead the engineering and applied-LLM work to improve agent reliability, autonomy, and evaluation pipelines for a fast-moving startup building autonomous business agents.
Work directly with the founders at an early-stage AI social startup to build agentic matchmaking, conversational agents, and the tooling and evals that make them reliable.
Lead and scale Socure’s growth engineering function by building AI-enabled GTM systems, growth experiments, and automation that drive measurable acquisition and revenue.
Humana seeks a Senior Penetration Tester to lead advanced application and cloud security assessments, drive custom exploit development, and translate technical findings into actionable business risk recommendations on a remote offensive security team.
Join Mistral AI as a Model Behavior Architect to shape LLM behavior through prompt design, evaluation pipelines, and policy work informed by humanities expertise.
AirOps is looking for a Product Engineer to design, prototype, and ship AI-powered product experiences in partnership with product, design, and engineering teams.
Amigo is hiring an Agent Engineer to build and verify production AI agents for regulated healthcare customers, combining strong Python engineering with domain-driven system design.
Lead product experimentation and LLM-powered evaluation to drive measurable product improvements across cross-functional teams.
ServiceNow is hiring a Machine Learning Engineer to build LLM-powered, production-ready AI features that simplify enterprise workflows and scale to thousands of customers.
Zillow's Agentic AI team is hiring a Machine Learning Engineer to design, train, evaluate, and ship agentic LLM solutions that improve user understanding and decision-making across the home search experience.
Lead customer deployments and governance of LLM-powered AI agents at Canvas Medical, translating technical capabilities into measurable healthcare automation outcomes.
Lead multiple R&D teams at Taboola to design and deliver scalable recommendation and LLM-powered systems that improve platform performance, user experience, and business outcomes.
At Delphi, work on cutting-edge prompt engineering to shape production LLM behavior and advance interactive digital minds used by thousands.
At Haystack News, a Senior Data Scientist will apply advanced statistical methods, causal inference, and machine learning to improve recommendations and measure product impact across our news streaming experience.
Help scale Yuma's AI-powered customer support platform as a Full‑stack Rails engineer with production LLM experience and a strong drive for rapid iteration.
Spotify is hiring a Machine Learning Engineer II to design and deploy large-scale ML and AI solutions for ad and podcast content categorization and brand-safety automation.
Lead and grow product design teams at Niche to deliver cohesive, AI-enabled user experiences that drive acquisition and revenue across multi-channel products.
Spotify is hiring a Staff Machine Learning Engineer to lead ML strategy and build large-scale generative and retrieval-based recommender systems for Search and personalization.
Lead the roadmap and delivery of user-facing AI capabilities at Cloaked to create transparent, controllable privacy-first experiences that drive activation, retention, and revenue.
Lead the development of production-ready machine learning and causal analytics to power personalization, experimentation, and optimization across NBCUniversal’s streaming products and ad experiences.
Lead the design and delivery of Connectly's enterprise-grade conversational AI platform, building agentic GenAI experiences, AI serving infrastructure on cloud, and integrations for large retailers.
Lead high-impact mixed-method research at Niche to shape AI-first product experiences that make school search more accessible, transparent, and equitable for millions of users.
Lead the strategy and product development of HackerOne's Pentest as a Service line, applying GenAI and platform integrations to scale enterprise offensive security.
Tarro is hiring a pragmatic Data Scientist to design, ship and maintain production ML pipelines and analytics that help small restaurants operate and grow.
Edia is hiring a data-driven Demand Generation Marketer to build and scale multi-channel demand programs that drive qualified pipeline for a fast-scaling AI ed-tech startup.
Ramp is hiring a Summer 2026 Applied Scientist Intern to develop and ship ML and LLM-based solutions that power underwriting, fraud detection, and smarter spend management.
15Five is hiring a remote Software Engineer to help design and build AI-powered, data-driven features for HR leaders while collaborating closely with product, design, and data science teams.
Lead design and production deployment of advanced ML and LLM models at Jerry.ai to scale customer-facing AI experiences and drive strategic growth from millions to tens of millions of users.
Lead the design and production deployment of complex ML solutions at Jerry.ai to drive large-scale user-facing products and strategic business initiatives.
Lead Instapage’s digital acquisition strategy and execution across multi-channel paid and organic channels to drive predictable, profitable growth for a market-leading PLG product.
Work on Constructor’s Searchandising team to design and deliver production-grade frontend features and tooling that help merchandisers optimize product discovery for global e‑commerce brands.
EdReports is hiring a Senior-level Software Engineer to deliver accessible, high-performance user-facing features and collaborate closely with product, design, and backend teams to advance our web platforms.
A 12-month AI Fellowship at the Gates Foundation to design, prototype, and deploy responsible AI solutions for global health and development while building capacity across program teams.
Launch Potato is hiring a Senior Machine Learning Engineer to build and scale recommendation systems that deliver real-time personalization across millions of users.
Lead AI-augmented QA efforts at Everlywell to build and maintain automated test suites, integrate AI tooling, and raise quality standards across web and backend systems.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 1