Machine Learning Engineer, New Graduate
Software Engineering
San Francisco, CA, USA
About XOR
XOR is a platform that helps world-class companies pushing the frontier of AI hire exceptional ML, RL, and AI engineering talent.
About Our Client
Our client is a well-funded AI startup working on next-generation training systems for large language models. The team is small, technical, and fast-moving, with a strong focus on hands-on engineering over process.
About the Role
This team designs and builds training tasks that safely advance model capabilities in machine learning research and engineering: specifically, teaching frontier models to do the work of an ML engineer or researcher. The role blends research and engineering: you stay current with the latest research, develop novel approaches, and realize them in code, with full ownership and autonomy over what you build. This track is for new graduate ML engineers.
What You'll Do
- Design and build training tasks and scoring schemes that produce clean, learnable signals for frontier models on ML research and engineering tasks
- Build deep expertise across the frontier of ML research, training, and inference infrastructure
- Collaborate with teammates to brainstorm ideas and build tools that improve the task-building process
What We're Looking For
- Strong ML fundamentals and broad research interests - you read many papers or tutorials, understand topics deeply, and have the creativity to translate them into rigorous, verifiable problems
- Proficiency in Python and systems programming; ideally PyTorch or JAX
- Strong problem-solving skills, with a habit of taking ownership and driving solutions end-to-end
- Passion for staying current with the rapidly evolving ML infrastructure landscape
- Ability to meet throughput expectations and respond quickly to feedback
Nice to Have
- Expert knowledge in an active DL/ML research area, with publications or public code to show for it - research experience (PhD, MS) is a big plus
- Deep understanding of transformer internals
- Strong expertise in kernel development (CUDA, Triton, Pallas), including optimizing non-trivial neural modules for specific hardware
- Research projects, coursework, or personal work involving training environments (any framework, any scale)
- Open-source contributions to ML infrastructure or RL tooling
- Experience with any cloud platform (AWS, GCP, Azure) or infrastructure-as-code tools
