Research Engineer / Research Scientist - Post-training
San Francisco, CA, USA
About XOR
XOR is a platform that helps world-class companies pushing the frontier of AI hire exceptional ML, RL, and AI engineering talent.
About Our Client
Our client is a well-funded AI startup working on next-generation training systems for large language models. The team is small, technical, and moving fast, with a strong focus on hands-on engineering over process.
About the Role
This team is investigating how far self-directed learning can push model capability. It is looking for Research Engineers or Research Scientists to advance the frontier of post-training on large language models. The role blends research and engineering: you will implement novel approaches and shape research directions.
What You'll Do
- Train and evaluate models on proprietary training tasks to validate data quality, surface gaps in task coverage, and close the feedback loop between task design and model capability
- Architect and optimize training infrastructure, from training abstractions to distributed experiment management, using frameworks like Verl, OpenRLHF, or similar, and help scale these systems to handle increasingly complex research workflows
- Design, implement, and test training environments, evaluations, and methodologies for RL agents
- Profile and optimize training runs end-to-end, from data loading through reward computation, to maximize experiment throughput and shorten the research iteration cycle
What We're Looking For
- Experience running end-to-end LLM post-training pipelines
- Proficiency in Python and PyTorch or JAX
- Experience with at least one modern RL training framework
- Experience building and operating ML infrastructure at scale
Nice to Have
- Experience evaluating model outputs and building reward or evaluation signals
- A habit of staying current on post-training research and translating papers into running code
- Strong opinions (loosely held) about how to structure training code for reproducibility and fast iteration
- Ability to balance research exploration with engineering rigor
- Strong systems design and communication skills
