Member of Technical Staff — RL Research (Experienced)
$300,000 USD
Responsibilities
- Build Nuance’s RL/post-training stack from 0→1, including rollout generation, policy optimization, reward/reference model serving, data feedback loops, evaluation, checkpointing, observability, and debugging.
- Develop and scale post-training methods such as PPO, GRPO, DPO, rejection sampling, RLHF/RLAIF, online RL, and model-based data improvement.
- Design the systems abstractions that connect research ideas to production-scale RL runs: trainers, rollout workers, reward models, evaluators, data queues, experience buffers, and checkpoint promotion.
- Build evaluation and feedback loops for omni behavior: turn-taking, interruption, timing, emotional response, audiovisual coherence, instruction following, and real-time interaction quality.
- Optimize the end-to-end post-training loop across rollout throughput, serving latency, GPU utilization, policy update efficiency, queueing, checkpoint overhead, and research iteration speed.
Benefits
- 401(k)
- Equity
Culture
- Deep Work Focus
- Cross-Functional Teams
- Mission-Driven
- Collaborative Space
- Startup Energy
Requirements
Regions: US
About Nuance Labs
Industry: AI
Size: startup
Nuance Labs is a Series A company building photorealistic, real-time AI avatars with emotional intelligence that can listen, speak, react, interrupt, and respond like a real person. The company develops foundation models for full-duplex audiovisual AI, aiming for natural conversation with sub-500ms response times.
Compensation
Base salary: $300,000 USD
Equity: Meaningful equity structured for long-term ownership