Member of Technical Staff (Data Scientist, Evals)
Responsibilities
- Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness.
- Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality.
- Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices.
- Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements.
- Operate within a small, high-impact team where evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve Answer Quality.
Soft Skills: Research
Culture: High Growth
Requirements
Required: PhD or MS in a technical field or equivalent experience
Regions: US
About Perplexity
Industry: AI
Size: startup
Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and specialized data sources.