Machine Learning Engineer, LLM Inference Optimization

GMI Cloud · San Francisco Bay Area, United States

Location
San Francisco Bay Area
Job Type
Full-time
Posted
June 30, 2026

Job Description

About Us

GMI Cloud is a fast-growing AI infrastructure company backed by Headline VC and one of only seven cloud providers worldwide to earn NVIDIA's prestigious Reference Platform Cloud Partner designation. We operate eight of our own GPU clusters across the U.S. and Asia, delivering a full spectrum of services from GPU compute to AI model inference APIs. As an NVIDIA Reference Platform Cloud Partner, we run infrastructure that meets the highest standards for performance, security, and scalability in AI deployments. We give AI startups and enterprises everything they need to prototype, train, and deploy AI models quickly and reliably.

About this role

GMI Cloud is building the leading inference optimization solution and the most advanced token platform in the global token market — and we are hiring world-class Machine Learning Engineers to make GMI the new indu...
