Tech Stack
Responsibilities
- Work across the inference stack to improve core performance metrics by diving deep into model execution.
- Identify bottlenecks and develop innovative optimizations.
- Collaborate closely with modeling and systems teams to experiment, measure, and ship improvements that meaningfully accelerate inference.
- Build expertise in advanced performance techniques, including GPU/CUDA optimizations, kernel-level improvements, and model execution strategies for MoE and large-scale architectures.
Benefits
- Remote Work
Culture
Cross-Functional TeamsMission-DrivenFast-PacedInclusive Hiring
Requirements
Regions: Us
Get jobs like this in your inbox
Weekly Python, Rust, TypeScript hiring trends and salary data — free.
Join 6 engineers getting weekly insights
Get market intelligence in your inbox
Free weekly insights on tech hiring trends, salaries, and in-demand stacks.
Already a subscriber? Sign in
About cohere
Industry: ai/ml
Size: medium
Cohere trains and deploys frontier models for developers and enterprises to build AI systems powering experiences like content generation, semantic search, RAG, and agents. They focus on scaling intelligence to serve humanity and believe their work is instrumental to AI adoption.
View company profile →