Member of Technical Staff (Data Scientist, Evals)

Posted Jun 29, 2026

Perplexity · San Francisco · Full-time · Staff


Tech Stack

Python, SQL, AWS, Databricks, Machine Learning, LLMs


Responsibilities

Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness.
Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality.
Develop solutions based on vision-language models (VLMs) to programmatically evaluate how final answers render visually across different platforms and devices.
Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements.
Operate within a small, high-impact team where evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve Answer Quality.

Soft Skills

Research

Culture

High Growth

Requirements

Required: PhD or MS in a technical field or equivalent experience

Regions: US

About Perplexity

Industry: AI

Size: startup

Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and specialized data sources.

