Lead Data Scientist, ZVideo
We are looking for a Lead Data Scientist to work on video recommendation and video understanding using Vision-Language Models (VLMs), Video Large Language Models (VLLMs), and Retrieval-Augmented Generation (RAG) techniques.
This is a unique opportunity to work at the intersection of cutting-edge AI research and production-scale systems. You will build end-to-end models for content understanding and recommendation, follow full MLOps best practices, and directly impact tens of millions of users on Zalo Video.
You will also tackle hypergrowth challenges: optimizing large-scale systems and solving global-scale problems in recommendation and video understanding.
🤖 What you will do
- Design, train, and deploy video understanding models using VLMs, VLLMs, and RAG pipelines for semantic feature extraction, tagging, and captioning;
- Build and optimize recommendation models leveraging multimodal embeddings (video, audio, text) for candidate retrieval and ranking;
- Implement end-to-end MLOps pipelines, including data ingestion, feature engineering, model training, deployment, monitoring, and retraining;
- Work with engineering teams to integrate large-scale embeddings and RAG outputs into real-time recommendation systems;
- Conduct experiments and A/B tests to evaluate the impact of new models on user engagement metrics (watch time, CTR, retention);
- Solve hypergrowth challenges, scaling systems to millions of daily active users and global-scale video datasets;
- Stay up to date with state-of-the-art multimodal AI research and translate it into production-ready solutions.
👾 What you will need
- Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field (a PhD is a plus);
- Experience in machine learning, preferably in recommendation systems, multimodal modeling, or video AI;
- Hands-on experience with Vision-Language Models (e.g., CLIP, BLIP, VideoCLIP), Video LLMs, and RAG architectures;
- Strong Python and ML framework skills (PyTorch, TensorFlow, Hugging Face);
- Experience in large-scale data processing (Spark, Ray, or similar) and real-time serving;
- Familiarity with retrieval, ranking, and personalized recommendation models;
- Experience in MLOps pipelines: automated training, CI/CD, monitoring, versioning, and deployment;
- Strong problem-solving skills and ability to deliver production-ready, scalable ML solutions.
Nice to have:
- Experience in self-supervised learning, multimodal contrastive learning, or LLM prompt engineering;
- Prior work on video recommendation platforms or hypergrowth products;
- Experience in deploying RAG pipelines at scale;
- Knowledge of hypergrowth environments and global-scale ML system design.