Member of Technical Staff (Data Scientist, Evals)

Perplexity | San Francisco | Full-time | Staff

Responsibilities

  • Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness.
  • Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality.
  • Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices.
  • Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements.
  • Operate within a small, high-impact team where evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve Answer Quality.

Soft Skills: Research

Culture: High Growth

Requirements

Required: PhD or MS in a technical field or equivalent experience
Regions: US
