Tech Stack
Machine Learning, Software Engineering, Statistics, Large Language Models, Reinforcement Learning, RLHF, RLAIF, Data Pipelines, Coding Agents, Tool-Using Agents, Production ML Systems
Responsibilities
- Design and run experiments to improve agentic model behavior for complex software and plugins.
- Own end-to-end improvements to the post-training stack, including RL, data pipelines, graders, reward signals, evals, and diagnostics.
- Build evals and environments to expose model failures, then turn those failures into training data, product fixes, or new research directions.
- Partner with product teams to translate user needs into model improvements.
- Improve the machinery for large-scale training and launch, focusing on experiment velocity, reliability, and cost.
Soft Skills
Research Taste, Engineering Execution, Cross-Functional Collaboration
Culture
Cross-Functional Teams, Customer-Obsessed, Mission-Driven
Requirements
Regions: US
About OpenAI
Industry: AI
Size: large
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity by developing and safely deploying AI systems. They are committed to pushing the boundaries of AI capabilities while ensuring safety and human needs are at the core of their work.
Similar Jobs
Agent Post-Training, Computer Use Research
OpenAI · San Francisco
Agent Post-Training, Artifacts Research
OpenAI · San Francisco
Agent Post-Training, Context Research
OpenAI · San Francisco
Agent Post-Training, API & Power Users
OpenAI · San Francisco
Agent Post-Training, Frontier Evals and Environments Research
OpenAI · San Francisco