Paxos Health is looking for a part-time AI Prompt Engineer, likely 10-15 hours per week. This role is remote-friendly, and the hours are flexible. We are open to expanding this to a full-time role in the future, depending on your preferences and our business needs as we grow.
We are a fast-growing, venture-backed company that uses LLMs and AI agents to help medical patients fight their health insurance companies to get covered for life-saving and life-changing treatments. Our AI reads all of a patient’s medical documents to come up with the best possible argument for why an upcoming treatment is needed, and then it generates all the paperwork and follows up with the health insurer. We partner with medical device companies so our service is totally free to patients! See our website here.
Design high-precision, schema-guided prompts for structured extraction and complex decision logic over long documents, with careful management of versioned JSON schemas and downstream compatibility.
Build modular, multi-step LLM workflows by decomposing complex problems into clearly defined stages, managing intermediate state, and ensuring reliable, production-ready execution.
Debug and improve outputs by analyzing failure modes and implementing guardrails such as schema validation and retries, constrained decoding, deterministic tools or retrieval, and rule-based post-processing.
Maintain a versioned prompt library and contribute prompt patterns and best practices, including standardizing approaches that have proven effective via evaluation sets, A/B testing, and quality metrics.
Contribute to AI strategy: suggest changes to how we orchestrate and use AI to advance our business goals, particularly as we grow. We’re always open to new frameworks and techniques!
Required:
Hands-on experience writing and testing complex prompts for LLMs.
Understanding of prompt orchestration, including templating and variable injection; role design (system/user/tool); few-shot exemplar selection; structured JSON output enforcement (schemas, validators, retries); and tool-calling for retrieval and citation (e.g., linking outputs to specific document sections or PDF coordinates).
Ability to meet deadlines of a couple of days when necessary. The hours are flexible, but sometimes we need a relatively quick turnaround for customers.
Ability to work independently. We will collaborate closely with you, but you should be able to execute on tasks and small projects on your own.
Relentlessly detail-oriented, with zero tolerance for silent hallucinations.
Experience with prompt/eval tooling (e.g., Langfuse, Langflow, Vellum, PromptLayer, Helicone), with comfort evaluating per-criterion accuracy, traceability, and error tradeoffs.
Ability to overlap at least 3 hours with U.S. Eastern Time standard business hours.
Nice to have:
Experience using RAG technology for processing long documents.
Light scripting experience: Able to write small Python or TypeScript scripts to wire prompts into pipelines, run evals end-to-end, do basic post-processing, etc.
Healthcare industry experience.
Paxos Health is a company founded out of Stanford and backed by leading venture capital firms. The team has engineering and product management backgrounds in the medtech industry (Medtronic) and top technology companies (Meta and Microsoft), as well as founding experience across multiple startups and healthcare nonprofits. Haley and Alex both hold MBAs from Stanford University. Ben, our founding engineer, holds an M.Eng. from Cornell, previously built his own ML startup, and has many years of experience building innovative software products.
We care deeply about your career growth. We will help you build any skills you need and connect you with our networks of startups and tech companies, many of which also want to hire for similar roles.