Software Engineer

XOR.ai

Software Engineering

San Francisco, CA, USA

Posted on Jul 5, 2026

About XOR

XOR is a platform that helps world-class companies pushing the frontier of AI hire exceptional ML, RL, and AI engineering talent.

About Our Client

Our client is a well-funded AI startup working on next-generation training systems for large language models. The team is small, technical, and moving fast, with a strong focus on hands-on engineering over process.

About the Role

We're looking for engineers who are keen to find where the best coding models still fail on real software work (large codebases with existing conventions and technical debt, ambiguous design decisions, and multi-step problems) and to build rigorous, gradeable test cases around those failures. You'll own each case end to end.

What You'll Do

  • Hunt for where coding models break across software, and build the hard, high-fidelity scenarios that expose those failures and push the ceiling of what the best models can do
  • Own the hardest problems on the roadmap end-to-end: multi-step workflows, realistic stakeholder interactions, large codebases with real conventions and technical debt, and challenging system design
  • Build verification robust enough that a model can't hack it, and tell genuine capability gaps apart from artifacts of your own setup
  • Direct coding agents heavily in day-to-day work, evaluate their output critically, and recognize when they are failing in subtle ways
  • Build the tooling your own work depends on
  • Mentor newer engineers on the team as it grows

What We're Looking For

  • Deep software engineering experience across multiple domains, with genuine expertise in at least one specialty: infrastructure, distributed systems, performance, security, compilers, databases, or similar
  • Proficiency in Python
  • Extensive hands-on experience with coding agents, including an intuition for where they cut corners and how to direct them well
  • Strong intuition for how models behave, even without prior ML or AI experience - you can anticipate where a model will take shortcuts and design around that
  • Comfort working independently on complex, ambiguous problems with minimal direction
  • Track record of owning work end-to-end in previous roles

Nice to Have

  • Senior or staff software engineer at a company known for engineering rigor (e.g., a frontier AI company, infrastructure startup, or systems-heavy team) wanting to apply that experience to model training
  • Deep specialty expertise in an area current models struggle with (distributed systems, low-level performance, security, compilers)
  • Excitement about regularly building new hard problems from scratch
  • Early engineer at a previous startup who shipped independently and wants to do it again in AI
  • Significant time spent building with coding agents, writing about their failure modes, or contributing to agent evaluation work