Software Engineering – Inference Engineer

About Virtue AI

Virtue AI sets the standard for advanced AI security platforms. Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated red-teaming, real-time multimodal guardrails, and systematic governance for enterprise apps and agents. Deploy in minutes—across any environment—to keep your AI protected and compliant. We are a well-funded, early-stage startup founded by industry veterans, and we're looking for passionate builders to join our core team.

What You’ll Do

As an Inference Engineer, you will own how models are served in production. Your job is to make inference fast, stable, observable, and cost-efficient, even under unpredictable workloads.

You will:

  • Serve and optimize inference for LLMs, embedding models, and other ML models across multiple model families

  • Design and operate inference APIs with clear contracts, versioning, and backward compatibility

  • Build routing and load-balancing logic for inference traffic

    • Multi-model routing

    • Fallback and degradation strategies

    • vLLM or SGLang

  • Package inference services into production-ready Docker images

  • Implement logging and metrics for inference systems

    • Latency, throughput, token counts, GPU utilization

    • Prometheus-based metrics

  • Analyze server uptime and failure modes

    • GPU OOMs, hangs, slowdowns, fragmentation

    • Recovery and restart strategies

  • Design GPU and model placement strategies

    • Model sharding, replication, and batching

    • Tradeoffs between latency, cost, and availability

  • Work closely with backend, platform (Cloud, DevOps), and ML teams to align inference behavior with product requirements

What Makes You a Great Fit

You understand that inference is a systems problem, not just a model problem. You think in QPS, p99 latency, GPU memory, and failure domains.

Required Qualifications

  • Bachelor’s degree or higher in CS, CE, or a related field

  • Strong experience serving LLMs and embedding models in production

  • Hands-on experience designing:

    • Inference APIs

    • Load balancing and routing logic

  • Experience with SGLang, vLLM, TensorRT, or similar inference frameworks

  • Strong understanding of GPU behavior

    • Memory limits, batching, fragmentation, utilization

  • Experience with:

    • Docker

    • Prometheus metrics

    • Structured logging

  • Ability to debug and fix real inference failures in production

  • Experience with autoscaling inference services

  • Familiarity with Kubernetes GPU scheduling

  • Experience supporting production systems with real SLAs

  • Comfortable operating in a fast-paced startup environment with high ownership

Preferred Qualifications

  • Experience with GPU-level optimization

    • Memory planning and reuse

    • Kernel launch efficiency

    • Reducing fragmentation and allocator overhead

  • Experience with kernel- or runtime-level optimization

    • CUDA kernels, Triton kernels, or custom ops

  • Experience with model-level inference optimization

    • Quantization (FP8 / INT8 / BF16)

    • KV-cache optimization

    • Speculative decoding or batching strategies

  • Experience pushing inference efficiency boundaries (latency, throughput, or cost)

Why Join Virtue AI

  • Competitive base salary and equity, commensurate with skills and experience.

  • Impact at scale – Help define the category of AI security and partner with Fortune 500 enterprises on their most strategic AI initiatives.

  • Work on the frontier – Engage with bleeding-edge AI/ML and deploy AI security solutions for use cases that don't yet exist anywhere else.

  • Collaborative culture – Join a team of builders, problem-solvers, and innovators who are mission-driven and collaborative.

  • Opportunity for growth – Shape not only our customer engagements, but also the processes and culture of an early, lean team with plans for scale.

Equal Opportunity Employment

Virtue AI is an Equal Opportunity Employer. We welcome and celebrate diversity and are committed to creating an inclusive workplace for all employees. Employment decisions are made without regard to race, color, religion, sex, gender identity or expression, sexual orientation, marital status, national origin, ancestry, age, disability, medical condition, veteran status, or any other status protected by law.

We also provide reasonable accommodations for applicants and employees with disabilities or sincerely held religious beliefs, consistent with legal requirements.

Average salary estimate

$210,000 / year (est.)
Range: $170,000 to $250,000

Employment type: Full-time, onsite
Date posted: December 22, 2025