Research Engineer / Research Scientist - Post-training

XOR.ai

San Francisco, CA, USA

Posted on Jul 5, 2026

About XOR

XOR is a platform that helps world-class companies pushing the frontier of AI hire exceptional ML, RL, and AI engineering talent.

About Our Client

Our client is a well-funded AI startup working on next-generation training systems for large language models. The team is small, technical, and moving fast, with a strong focus on hands-on engineering over process.

About the Role

This team is investigating how far self-directed learning can push model capability, and is looking for Research Engineers or Research Scientists to advance the frontier of post-training on large language models. The role blends research and engineering: you will implement novel approaches and shape research directions.

What You'll Do

  • Train and evaluate models on proprietary training tasks to validate data quality, surface gaps in task coverage, and close the feedback loop between task design and model capability
  • Architect and optimize training infrastructure, from training abstractions to distributed experiment management, using frameworks such as Verl or OpenRLHF, and help scale systems to handle increasingly complex research workflows
  • Design, implement, and test training environments, evaluations, and methodologies for RL agents
  • Profile and optimize training runs end-to-end, from data loading through reward computation, to maximize experiment throughput and shorten the research iteration cycle

What We're Looking For

  • Experience running end-to-end LLM post-training pipelines
  • Proficiency in Python and PyTorch or JAX
  • Experience with at least one modern RL training framework
  • Experience building and operating ML infrastructure at scale

Nice to Have

  • Experience evaluating model outputs and building reward or evaluation signals
  • A habit of staying current on post-training research and translating papers into running code
  • Strong opinions (loosely held) about how to structure training code for reproducibility and fast iteration
  • Ability to balance research exploration with engineering rigor
  • Strong systems design and communication skills