Location
Singapore
Job Type
Full-time
Posted
July 03, 2026

Job Description

The opportunity

  • Own the architecture of production AI systems: inference stacks, fine-tuning pipelines, retrieval and evaluation infrastructure, and monitoring

  • Build on frontier models (Claude, GPT, and peers) with real rigor: tool use, structured outputs, context and cost management, evals, and guardrails, not just prompt-and-pray

  • Deploy and operate open-source models (Llama, Qwen, Mistral, DeepSeek, and whatever comes next) in our cloud environment, including quantization, serving frameworks (vLLM, TGI, SGLang, TensorRT-LLM), and multi-GPU inference

  • Make the frontier-vs-open-source call deliberately, on cost, latency, control, and data sensitivity grounds — and be able to defend it

  • Design the cloud infrastructure underneath it all: GPU orchestration, autoscaling, cost controls, VPC/networking, IAM, observability. This is not a “hand it to DevOps” role

  • Fine-tune, distill, and evaluate models agains...

Ready to Apply?

Submit your application for Applied AI Engineer, Senior Associate/Manager - Technology Consulting at EY
