Member of Technical Staff — RL Research (New PhD Grad)
$250,000 – $350,000 USD
Responsibilities
- Build Nuance's RL/post-training stack from 0→1, covering rollout generation, policy optimization, reward/reference model serving, data feedback loops, evaluation, checkpointing, observability, and debugging.
- Develop and scale post-training methods such as PPO, GRPO, DPO, rejection sampling, RLHF/RLAIF, online RL, and model-based data improvement.
- Design systems abstractions to connect research ideas to production-scale RL runs, including trainers, rollout workers, reward models, evaluators, data queues, experience buffers, and checkpoint promotion.
- Build evaluation and feedback loops for omni behavior, focusing on turn-taking, interruption, timing, emotional response, audiovisual coherence, instruction following, and real-time interaction quality.
- Optimize the end-to-end post-training loop for rollout throughput, serving latency, GPU utilization, policy update efficiency, and research iteration speed.
Soft Skills
Software Engineering Fundamentals
Benefits
- Equity
- HSA/FSA
- Commuter Benefits
- 401k
- Visa Sponsorship
Culture
- Maker Schedule
- Cross-Functional Teams
- Deep Work Focus
- Mission-Driven
- Startup Energy
Requirements
Required: PhD in ML, RL, or a related field (completed or final stretch)
Regions: US
About Nuance Labs
Industry: AI
Size: startup
Nuance Labs is a Series A company building photorealistic, real-time AI avatars with emotional intelligence that can listen, speak, react, interrupt, and respond like a real person. They focus on developing foundation models for full-duplex audiovisual AI to achieve natural conversation with sub-500ms response times.
Compensation
Base salary: $250,000 – $350,000 USD
Equity: meaningful equity; long-term ownership