Member of Technical Staff - Efficient ML
Responsibilities
- Optimize training efficiency with techniques such as optimized dataloaders, operator fusion, and activation rematerialization (gradient checkpointing).
- Enhance GPU and kernel performance through Nsight profiling, Triton/CUDA kernels, fused operations, and FlashAttention-style speedups.
- Implement inference optimizations including low-latency serving, continuous batching, speculative decoding, and quantization.
- Ensure infrastructure and reliability by managing SLURM/Kubernetes multi-node jobs, checkpoint hygiene, and GPU failure handling.
- Contribute to building AI for creating world simulations.
Culture
- On-Call Rotation
- Deep Work Focus
- Collaborative Space
Requirements
Regions: US
About embedding-vc
Industry: AI
Size: startup
Moonlake is an AI company focused on creating world simulations, committed to an on-site, in-person team.