Machine Learning Scientist

Posted Dec 18, 2025

arena · Bay Area · full-time

Tech Stack

Express, Go, Python, Rust, TypeScript

Responsibilities

Design and conduct experiments to evaluate AI model behavior across reasoning, style, robustness, and user preference dimensions.
Develop new metrics, methodologies, and evaluation protocols that go beyond traditional benchmarks.
Analyze large-scale human voting and interaction data to uncover insights into model performance and user preferences.
Collaborate with engineers to implement and scale research findings into production systems.
Author internal reports and external publications that contribute to the broader ML research community.

Benefits

Equity
Gym Membership
Health Insurance

Culture

Cross-Functional Teams, Mission-Driven, Transparent Leadership

Requirements

Required: PhD or equivalent research experience in Machine Learning, Natural Language Processing, Statistics, or a related field

Regions: US

About arena

Industry: AI

Size: startup

Arena is a platform for evaluating how AI models perform in the real world, founded by researchers from UC Berkeley's SkyLab, with a mission to measure and advance the frontier of AI for real-world use. Tens of millions of people use Arena monthly to evaluate frontier systems.

Compensation

Equity: aligned to the market
