Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, efficiency, and model quality. Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads from frameworks like PyTorch and LangChain at production scale in seconds.
Gimlet was spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. The founding team has deep experience across AI, distributed systems, and hardware, along with previous successful exits.
Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will research and implement techniques that drive performance and quality optimizations across the latest AI models, including quantization, KV caching, and FlashAttention, to improve inference efficiency. You will design parallelism strategies to distribute data and workloads across compute nodes at production scale. You will dive deep into GPU code and kernel optimizations to accelerate AI workloads.
Responsibilities:
Evaluating and implementing cutting-edge AI research for model performance and efficiency
Architecting infrastructure for distributed AI workloads across both the software stack and GPU kernel layers
Profiling, benchmarking, and analyzing system performance, identifying bottlenecks and optimization opportunities in execution runtimes targeting various hardware systems
Qualifications:
Bachelor’s degree in computer science, engineering, applied mathematics, or a comparable area of study
Experience with performance optimization
Preferred Qualifications:
Graduate degree in computer science, engineering, applied mathematics, or a comparable area of study
Familiarity with compilers and compiler frameworks such as MLIR
Experience with PyTorch, TensorFlow, vLLM, ONNX, and other AI frameworks
Software development experience with Python, C++, and CUDA