Responsibilities
- Design, build, and maintain automated model evaluation pipelines for candidate models, implementing objective and subjective quality metrics across STT, TTS, and STS products.
- Embed model quality checkpoints into CI/CD and release pipelines, defining pass/fail criteria and owning the go/no-go signal for production promotions.
- Stand up and operate evaluation tooling for end-to-end voice agent testing, covering accuracy, latency, turn-taking, conversational quality, and custom metrics.
- Partner with the Active Learning team to validate data ingestion infrastructure, annotation pipelines, and retraining automation.
- Automate execution and reporting of industry-standard benchmarks and maintain reproducible benchmark environments across multiple model versions.
Benefits
- 401(k)
- Flexible Hours
- Gym Membership
- Health Insurance
- Learning Budget
- Parental Leave
- Remote Stipend
- Unlimited PTO
Culture
- AI-First
- Fast-Paced
- Experimentation
- Adaptability
- Continuous Learning
Requirements
Regions: US
About Deepgram
Industry: AI
Size: small
Deepgram is the leading Voice AI platform providing real-time APIs for speech-to-text, text-to-speech, and building production-grade voice agents, trusted by over 200,000 developers and 1,300+ organizations. The company's voice-native foundation models offer unmatched accuracy, low latency, and cost efficiency, having processed over 50,000 years of audio.