AI Engineer, Zalo AI

Hồ Chí Minh

Full-time

Experience with programming languages such as C++ and Python;
Solid knowledge of Data Structures and Algorithms;
Proficiency with deep learning frameworks such as PyTorch and TensorRT;
Experience with system optimizations for model serving, such as batching, caching, load balancing, and model parallelism;
Experience with algorithmic optimizations for inference, such as quantization, distillation, and speculative decoding;
Experience with HTTP, gRPC, and Triton Inference Server;
Experience with large-scale, high-concurrency production serving;
Ability to quickly learn new technologies, frameworks, and algorithms.

Nice to have:

Experience with low-level optimizations for inference, such as GPU kernels;
Experience building solutions with MLOps tools and frameworks such as Kubernetes and Kubeflow.
