ML Engineer (ONSITE IN SF)

San Francisco, California
Work Type: Full Time
Dear applicants, please note that applications without salary expectations and an active LinkedIn profile will not be considered.
Thank you for your understanding.

Location: San Francisco, CA (In-person)
Employment Type: Full-Time
Equity: 0.5% – 1%
Visa: Not available
Experience: 1+ years (exceptional new grads welcome)

We are hiring ML Engineers to implement research ideas reliably and operate full training pipelines end-to-end. This is not a research-only role; it is research engineering at scale. The company is seed-stage and research-driven, focused on a mechanistic understanding of model architectures and optimizers.

The team studies:
  • Optimizer–architecture co-design
  • Orthogonalized optimizers and manifold-based training
  • Sparse attention mechanics
  • Data-efficient reasoning models
  • Learning dynamics in data-sparse regimes

The environment blends academic rigor with industrial compute and speed. The team is deliberately long-term oriented and avoids premature commercialization pressure.

You will:
  • Translate research papers into working PyTorch/JAX implementations
  • Run distributed transformer training
  • Debug divergence and instability
  • Optimize throughput
  • Build full pipelines (data → training → evaluation)
  • Reason about learning dynamics and architecture tradeoffs

The bar is growth slope and research intuition, not years of experience.

What You’ll Own
  • Reliable implementation of novel architectures
  • Distributed transformer training at scale
  • Training stability and performance debugging
  • Evaluation frameworks
  • Optimization reasoning alongside researchers

Must-Have Requirements
  • Strong PyTorch or JAX proficiency
  • Hands-on transformer training experience
  • Experience with distributed training setups
  • Experience debugging training divergence and instability
  • Ability to read and implement research papers
  • Research intuition around optimization and learning dynamics
  • High growth slope

Nice to Have
  • Megatron-LM, DeepSpeed, xformers
  • End-to-end pipeline ownership
  • Research-engineering team experience
  • Mathematical depth (optimization, information theory, etc.)
  • Competitive programming / theory-heavy background
