(Senior) Lead Data Engineer, Zalo
Zalo is looking for a Lead Data Engineer with 5+ years of experience, specializing in Big Data, AutoML, Feature Stores, and Kubernetes. Proficiency in optimizing HDFS, building high-performance APIs, and ensuring data privacy, security, and point-in-time correctness is essential. The candidate must be able to lead a team, provide technical mentorship, coordinate cross-team efforts, and collaborate with major partners (Fiza, Adtima, VAS).
🤖 What you will do
1. Professional skills
Big Data & Distributed Systems:
- Proficient in the Hadoop ecosystem (HDFS, YARN, Hive, Spark, Flink).
- Storage & processing optimization: data compression (Snappy → Zstandard), partitioning, bucketing, file formats (ORC, Parquet); see the sketch after this list.
- HDFS administration: backup, cleanup, archiving, capacity planning.
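A minimal PySpark sketch of the storage-optimization bullet above, assuming a recent Spark version and hypothetical HDFS paths, table, and column names: write Zstandard-compressed Parquet, partitioned by date and bucketed by user.

```python
# Illustrative only: the source path, target table, and columns are assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("storage-optimization-sketch")
    .config("spark.sql.parquet.compression.codec", "zstd")  # Snappy -> Zstandard
    .getOrCreate()
)

events = spark.read.parquet("hdfs:///data/raw/events")  # hypothetical source

(
    events.write
    .mode("overwrite")
    .partitionBy("event_date")                  # enables partition pruning at read time
    .bucketBy(64, "user_id")                    # co-locates rows for join-heavy workloads
    .sortBy("user_id")
    .format("parquet")
    .saveAsTable("analytics.events_optimized")  # bucketBy requires saveAsTable
)
```

The same idea carries over to ORC by switching the format and its compression setting.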
AutoML & MLOps:
- Design and operate AutoEDA, auto-training, evaluation, and prediction systems at scale.
- Deep understanding of end-to-end ML pipelines, automated feature engineering, model registries, and serving.
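As a toy illustration of the end-to-end pipeline idea (not Zalo's actual stack), a scikit-learn sketch that couples automated preprocessing with a model; the data and column names are made up.

```python
# Toy data; in a real AutoML setting the preprocessing steps themselves would be
# searched over and the fitted pipeline versioned in a model registry.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

X = pd.DataFrame({"age": [25, 40, 31, 52], "country": ["VN", "SG", "VN", "TH"]})
y = [0, 1, 0, 1]

pipeline = Pipeline([
    ("features", ColumnTransformer([
        ("num", StandardScaler(), ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["country"]),
    ])),
    ("model", LogisticRegression()),
])

pipeline.fit(X, y)
print(pipeline.predict(X))
```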
Feature Store:
- Build and operate a Feature Store with >3,000 features, ensuring point-in-time correctness and low-latency serving (illustrated in the sketch below).
- Support batch and real-time ingestion, and consistency between the online and offline stores.
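A small pandas sketch of what point-in-time correctness means in practice (toy data, hypothetical feature names): each label row joins only against feature values observed at or before its timestamp, which prevents future information from leaking into training.

```python
import pandas as pd

labels = pd.DataFrame({
    "user_id": [1, 1, 2],
    "event_time": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-01-10"]),
    "label": [0, 1, 1],
})
features = pd.DataFrame({
    "user_id": [1, 1, 2],
    "feature_time": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-01-08"]),
    "avg_spend_30d": [12.5, 20.0, 7.3],  # hypothetical feature
})

# direction="backward": take the latest feature value at or before the label time.
training_set = pd.merge_asof(
    labels.sort_values("event_time"),
    features.sort_values("feature_time"),
    left_on="event_time",
    right_on="feature_time",
    by="user_id",
    direction="backward",
)
print(training_set)
```

Feature stores such as Feast (listed under tools below) implement this same join at scale for the offline store.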
API & Middleware Development:
- Develop high-throughput APIs (gRPC, REST) on Kubernetes (K8s); optimize latency & scalability.
- CI/CD, observability (Prometheus, Grafana, OpenTelemetry), canary/blue-green deployment.
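A minimal sketch of the observability bullet, assuming the Python prometheus_client library and a hypothetical handler; a real deployment would export the same metrics from the gRPC/REST handlers and graph them in Grafana.

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUEST_LATENCY = Histogram("request_latency_seconds", "Request latency", ["endpoint"])
REQUEST_ERRORS = Counter("request_errors_total", "Request errors", ["endpoint"])

def handle_request(endpoint: str) -> None:
    """Hypothetical handler; real code would serve gRPC/REST traffic."""
    with REQUEST_LATENCY.labels(endpoint=endpoint).time():  # record latency
        try:
            time.sleep(random.uniform(0.01, 0.05))  # placeholder work
        except Exception:
            REQUEST_ERRORS.labels(endpoint=endpoint).inc()
            raise

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://localhost:8000/metrics
    while True:
        handle_request("/predict")
```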
Cloud & Infra:
- Proficient in at least one cloud (GCP/AWS/Azure): GCS/S3, BigQuery, Dataflow, Cloud Composer.
- IaC (Terraform), container orchestration (K8s, Helm), service mesh (Istio – bonus).
2. Architecture & design skills
- Design scalable, fault-tolerant, observable systems.
- Trade-off analysis: batch vs streaming, consistency vs availability, cost vs performance.
- Data modeling: star schema, slowly changing dimensions, data vault (if needed).
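To make the slowly-changing-dimensions item concrete, a hedged pandas sketch of a Type 2 update on a hypothetical customer dimension: close out the current row and append a new one when a tracked attribute changes.

```python
# Table and column names are assumptions for illustration.
import pandas as pd

dim_customer = pd.DataFrame({
    "customer_id": [42],
    "city": ["Hanoi"],
    "valid_from": pd.to_datetime(["2023-01-01"]),
    "valid_to": [pd.NaT],
    "is_current": [True],
})

def apply_scd2(dim: pd.DataFrame, customer_id: int, new_city: str, change_date: str) -> pd.DataFrame:
    """Type 2 SCD: expire the old row, append the new version with validity dates."""
    change_ts = pd.to_datetime(change_date)
    current = (dim["customer_id"] == customer_id) & dim["is_current"]
    changed = current & (dim["city"] != new_city)
    if changed.any():
        dim.loc[changed, ["valid_to", "is_current"]] = [change_ts, False]
        new_row = pd.DataFrame({
            "customer_id": [customer_id],
            "city": [new_city],
            "valid_from": [change_ts],
            "valid_to": [pd.NaT],
            "is_current": [True],
        })
        dim = pd.concat([dim, new_row], ignore_index=True)
    return dim

dim_customer = apply_scd2(dim_customer, 42, "Ho Chi Minh City", "2024-06-01")
print(dim_customer)
```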
Security & Governance:
- Data encryption at rest/in transit; access control and metadata governance (Apache Ranger, Apache Atlas).
- Comply with data privacy regulations (GDPR, PDPA): anonymization, consent management.
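A minimal sketch of one common anonymization technique, pseudonymization with a keyed hash; the salt handling here is an assumption (in practice the key would come from a secret manager and be governed by rotation and consent policies).

```python
import hashlib
import hmac

SECRET_SALT = b"example-salt"  # hypothetical; load from a secret manager in practice

def pseudonymize(user_id: str) -> str:
    """Replace a raw identifier with a keyed hash before data leaves the trusted zone."""
    return hmac.new(SECRET_SALT, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

print(pseudonymize("user-12345"))
```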
3. Leadership & Management Skills
Mentoring & Knowledge Sharing:
- 1:1 coaching, code reviews, tech talks, writing internal documentation.
- Build a strong tech culture: best practices, engineering excellence.
Team management:
- Recruitment, competency assessment, and development planning for team members.
- Assign tasks according to each member's strengths.
Cross-functional Collaboration:
- Work closely with DS, DE, Safety, Product, and partner teams.
- Translate business requirements → technical solutions.
4. Soft Skills
- Ownership & Proactiveness: proactively detect bottlenecks, propose improvements.
- Problem-Solving: handle production incidents, root cause analysis (RCA).
- Business Acumen: clearly understand partner use-cases (Fiza, Adtima, VAS) to prioritize development.
- Communication: present complex ideas in an easy-to-understand way to non-technical stakeholders.
5. Tools & languages
- Languages: Python (expert), Scala/Java (bonus), SQL (complex queries).
- Frameworks: Airflow, dbt, Feast/KFP/TFX; see the DAG sketch after this list.
- Monitoring: ELK stack, Jaeger, Prometheus + Grafana.
- Versioning: Git, trunk-based development, semantic versioning.
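As referenced in the frameworks item above, a minimal Airflow DAG sketch (assuming Airflow 2.4+; the DAG name and task logic are placeholders) showing the orchestration style these tools imply.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract() -> None:
    print("pull raw data")     # placeholder for the real extraction logic

def build_features() -> None:
    print("compute features")  # placeholder for the real transformation logic

with DAG(
    dag_id="feature_pipeline_sketch",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    features_task = PythonOperator(task_id="build_features", python_callable=build_features)
    extract_task >> features_task
```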
👾 What you will need
- 5+ years of experience in Data Engineering; priority is given to candidates who have held a Lead/Tech Lead position.
- Have built systems processing >1 TB/day or serving APIs at >1K QPS.
- Have experience leading a team of 5+ members.
- Priority is given to candidates who have worked with AutoML, Feature Store, DMP/CDP.