Dear applicants, please note that applications without salary expectations and an active LinkedIn profile will not be considered. Thank you for your understanding.
Location: San Francisco, CA (On-site)
Employment Type: Full-Time
Benefits: 100% medical, dental, vision; MacBook Pro + peripherals
We are hiring an AI Engineer to own the intelligence layer. This is not a demo or prototype role. This is production AI engineering.
About the Company
Client builds production AI agents that replace manual coordination work inside billion-dollar enterprises. Our agents operate at scale — processing thousands of transactions, making classification decisions, routing exceptions, and learning from human feedback.
We deploy intelligent agents directly into enterprise systems such as:
Salesforce
NetSuite
ServiceNow
Workday
You will:
Design and ship agentic systems used in real enterprise workflows
Build evaluation and reliability systems
Handle hallucinations, edge cases, and cost constraints
Optimize multi-agent orchestration in production
You should already have built LLM-powered systems that operate beyond the playground stage.
What You’ll Own
Agent architecture design
Retrieval systems (RAG, context management)
Tool calling and multi-step reasoning
Multi-agent orchestration
Prompt engineering and reliability optimization
Evaluation and quality infrastructure
Cost-performance tradeoff optimization
Exception routing and human-in-the-loop feedback loops
Must-Have Requirements
3+ years of software engineering experience
2+ years building production LLM or AI systems
Hands-on experience with agentic workflows
Experience with tool calling, retrieval, and multi-step reasoning
Strong prompt and context engineering skills
Experience building evaluation frameworks for AI outputs
Strong Python and backend fundamentals
Experience handling hallucinations, edge cases, and cost control
Based in San Francisco
Nice to Have
Experience integrating AI into enterprise SaaS systems
Experience with vector databases
Experience designing human-in-the-loop (HITL) systems
Experience scaling AI workloads