Member of Technical Staff, Model Efficiency

Posted Nov 7, 2025

cohereNew Yorkfull-timestaff

Apply Now

Tech Stack

GitPythonRustTypeScript

Responsibilities

Work across the inference stack to improve core performance metrics by diving deep into model execution.
Identify bottlenecks and develop innovative optimizations.
Collaborate closely with modeling and systems teams to experiment, measure, and ship improvements that meaningfully accelerate inference.
Build expertise in advanced performance techniques, including GPU/CUDA optimizations, kernel-level improvements, and model execution strategies for MoE and large-scale architectures.

Benefits

401k
Gym Membership
Health Insurance
Learning Budget
Parental Leave
Remote Stipend
Remote Work

Culture

Cross-Functional TeamsMission-DrivenFast-PacedInclusive Hiring

Requirements

Regions: Us

About cohere

Industry: saas

Size: medium

Cohere is the leading security-first enterprise AI company, building cutting-edge foundation AI models and end-to-end products designed to solve real-world business problems.

View company profile →

cohere · San Francisco

Senior Member of Technical Staff, Multimodal AI

cohere · San Francisco

Member of Technical Staff, Search

cohere · United States

Member of Technical Staff, MLE

cohere · San Francisco

Member of Technical Staff — Model Optimization and Inference (Experienced)

Nuance Communications · Seattle, Washington

$250k