Browse 18 Production Monitoring jobs hiring now. Companies hiring include Jobgether, Brillio, and Marqeta, with roles in Mobile, Chandler, and Oxnard.
Lead the development and production deployment of scalable, high-impact ML systems as a Senior Staff Machine Learning Engineer at Flex (remote, US).
Brillio is hiring an L1 Production Support Engineer to provide first-line production monitoring and troubleshooting for iOS and Android mobile applications using CloudWatch, Dynatrace, S3 and related tools.
Marqeta is hiring a Staff Machine Learning Engineer to lead the design and delivery of scalable, compliant ML infrastructure that supports model development, deployment, and monitoring across the company.
Commure + Athelas is hiring an NYC-based Backend Software Engineer to help build and run production-grade revenue cycle management services using Python and cloud-native tooling.
LangSmith is seeking a Backend Engineer to build and optimize the backend services that power observability, tracing, and evaluation at scale for LangChain-powered applications.
Toyota Financial Services is hiring a Senior Production Engineer to own production reliability, incident response, and performance tuning for MuleSoft and other integration platforms in Plano, TX.
Lead the design and deployment of enterprise-grade MLOps, feature stores, and LLM-driven chatbot solutions at a fast-growing data product firm serving Fortune 500 clients.
Lead production support and risk management for RBC's US regional trading, client, and digital platforms, driving uptime, automation and strong stakeholder communication.
Greenlight is hiring a Senior Engineer, Production Operations, to design and maintain its cloud infrastructure, SRE practices, and automation for high-availability family fintech services.
Cotiviti is hiring a Senior Support Analyst to troubleshoot production issues, coordinate cross-functional resolution, and improve support processes in a remote capacity.
RBC is seeking a Senior Applications Support Analyst to lead incident resolution, application monitoring, and vendor coordination for critical production systems.
Experienced platform engineer needed to design, automate, and operate secure, production-grade cloud infrastructure and developer tooling for a large-scale cybersecurity environment.
Vetcove is hiring a Staff Machine Learning Engineer to own end-to-end production ML systems that drive forecasting, charge capture, and clinical insights for veterinary practices.
ServiceNow is hiring a Production Service Engineer to stabilize, automate, and optimize platform reliability across systems and services.
Senior-level MLOps engineer to lead development of scalable ML platforms and production ML systems across Azure-native cloud environments for a distributed, collaborative engineering team.
Robinhood seeks a Senior Software Engineer, Reliability in Menlo Park to design and operate large-scale, highly reliable systems and build centralized tooling to improve platform resilience and efficiency.
Catalant is hiring a Senior AI Engineer to design, build, and productionize generative AI systems (RAG, agents, evals, fine-tuning) that deliver scalable, high-quality capabilities across our platform and internal processes.
Lead reliability and observability for Saviynt's Federal platform, designing monitoring, alerting, and resiliency improvements across cloud environments to meet stringent SLA and customer needs.
Below 50k*: 0 | 50k-100k*: 1 | Over 100k*: 0