Member of Technical Staff, Inference

Inferact · Singapore

Location

Singapore

Job Type

Full-time

Posted

June 30, 2026

Job Description

Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of models and hardware—a position that took years to build. 

About the Role

We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. Models grow larger. Architectures shift: mixture-of-experts, multimodal, agentic. Every breakthrough demands innovations on the inference engine itself. You'll work at the core of vLLM, optimizing how models execute across diverse hardware and architectures. Your work will directly impact how the world runs AI inference.

Skills and Qualifications

Minimum qualifications:
Bachelor's degree or equivalent experience in computer science, engineering, or similar. 
Deep understanding of transformer architec...