Lead Member of Technical Staff, Inference Infrastructure
Responsibilities
- Lead the design and architecture of high-performance, scalable, and reliable machine learning systems for Cohere's AI platform.
- Drive the strategy for deploying optimized NLP models to production in low-latency, high-throughput, high-availability environments.
- Serve as a key point of contact for customers, leading the design of customized deployments to meet specific needs.
- Mentor engineers to raise the technical bar across the Model Serving team.
- Own compute, storage, and network resource and cost management at the organizational level, including optimization strategies.
Benefits
- Remote Work
Culture
- Open and Inclusive Culture
- Work-Life Balance
- Fast-Paced
- Cross-Functional Teams
- Mentorship Program
Requirements
Regions: US
About Cohere
Industry: AI/ML
Size: medium
Cohere trains and deploys frontier models for developers and enterprises to build AI systems powering experiences like content generation, semantic search, RAG, and agents. They focus on scaling intelligence to serve humanity and believe their work is instrumental to AI adoption.