Machine Learning Engineer, LLM Inference Optimization

GMI Cloud · San Francisco Bay Area, United States

Location

San Francisco Bay Area

Job Type

Full-time

Posted

June 30, 2026

Job Description

About Us 
GMI Cloud is a fast-growing AI infrastructure company backed by Headline VC and one of only seven cloud providers worldwide to earn NVIDIA's prestigious Reference Platform Cloud Partner designation. We operate 8 of our own GPU clusters across the U.S. and Asia, delivering a full spectrum of services from GPU compute to AI model inference API solutions. As an NVIDIA Reference Platform Cloud Partner, our infrastructure meets the highest standards for performance, security, and scalability in AI deployments. We empower AI startups and enterprises to build AI without limits, providing everything they need to prototype, train, and deploy AI models quickly and reliably.
About this role 

GMI Cloud is building the leading inference optimization solution and the most advanced token platform in the global token market, and we are hiring world-class Machine Learning Engineers to make GMI the new indu...
