Senior Data Engineer, Zalo Platform
As a communication platform, Zalo makes it easier for family, friends, and co-workers to connect. As a Data Engineer working on Zalo products, you will have the chance to make a positive impact on more than 50M users by leveraging the power of our data. This role focuses specifically on data pipelines for analytics and machine learning projects.
🤖 What you will do
Data Platform Development (70%)
1. Build and Scale Data Pipelines:
- Design, build, and optimize robust ETL/ELT pipelines to ensure high-quality, reliable, and timely data.
- Work with large-scale data processing frameworks such as Spark and Kafka to enable real-time and batch data workflows.
- Leverage orchestration tools like Airflow to manage complex workflows.
2. Integrate and Maintain Data Systems:
- Connect diverse data sources to support analytics, product intelligence, and machine learning needs.
- Develop scalable and maintainable data architecture for data lakes and data warehouses.
- Implement data quality monitoring, validation, and alerting mechanisms using best practices and modern tools (a minimal sketch of this kind of pipeline follows below).
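For illustration only, here is a minimal sketch of the kind of work described above: a hypothetical Airflow DAG that runs a PySpark batch job and a simple data-quality gate. All paths, table names, and schedules are assumptions for the example, not a description of Zalo's actual stack.

```python
# Illustrative sketch only: an Airflow DAG that runs a daily PySpark ETL job
# followed by a simple data-quality check. Names and paths are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from pyspark.sql import SparkSession, functions as F


def run_etl(**_):
    # Hypothetical batch job: read raw events, aggregate daily activity per user,
    # and write the result to a warehouse/mart layer.
    spark = SparkSession.builder.appName("daily_user_activity").getOrCreate()
    events = spark.read.parquet("/data/raw/events/")  # hypothetical path
    daily = (
        events.groupBy("user_id", F.to_date("event_ts").alias("event_date"))
              .agg(F.count("*").alias("event_count"))
    )
    daily.write.mode("overwrite").parquet("/data/marts/daily_user_activity/")  # hypothetical path
    spark.stop()


def check_quality(**_):
    # Minimal data-quality gate: fail the task (which triggers alerting) if the
    # output is empty or contains null keys.
    spark = SparkSession.builder.appName("daily_user_activity_qc").getOrCreate()
    df = spark.read.parquet("/data/marts/daily_user_activity/")
    if df.count() == 0 or df.filter(F.col("user_id").isNull()).count() > 0:
        raise ValueError("Data-quality check failed for daily_user_activity")
    spark.stop()


with DAG(
    dag_id="daily_user_activity",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    etl = PythonOperator(task_id="run_etl", python_callable=run_etl)
    qc = PythonOperator(task_id="check_quality", python_callable=check_quality)
    etl >> qc
```

In practice the Spark steps would more likely be submitted to a cluster via a dedicated Spark operator, but the overall shape — extract/transform/load followed by a validation gate that alerts on failure — is what this part of the role is about.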
Data Insight & Collaboration (30%)
1. Translate Business Needs into Data Solutions:
- Collaborate closely with Product Managers, Analysts, and Data Scientists to understand business rules and translate them into efficient data models and logic.
- Build and maintain metrics, dashboards, and reports that provide clear visibility into product and user performance.
2. Enable Stakeholder Self-service:
- Empower internal teams by building data marts, documentation, and scalable solutions that reduce dependency on engineering for basic queries (a small example is sketched below).
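As an illustration of stakeholder self-service, a minimal sketch of a Spark SQL data mart exposing a daily-active-users metric that analysts or dashboards can query directly. The catalog, table, and column names are hypothetical.

```python
# Illustrative sketch only: a small data mart view for analyst self-service.
# Catalog, table, and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("dau_mart").getOrCreate()

# The source table is assumed to already exist in the warehouse catalog.
spark.sql("""
    CREATE OR REPLACE VIEW marts.daily_active_users AS
    SELECT
        to_date(event_ts)       AS event_date,
        count(DISTINCT user_id) AS dau
    FROM warehouse.message_events
    GROUP BY to_date(event_ts)
""")

# Analysts (or a dashboard tool) can then query the mart without engineering help:
spark.sql(
    "SELECT * FROM marts.daily_active_users ORDER BY event_date DESC LIMIT 7"
).show()
```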
👾 What you will need
- 3+ years of hands-on experience in data engineering involving large-scale data.
- Proven track record of delivering scalable and maintainable data solutions in a production environment.
Technical Skills:
- Advanced proficiency in Python (PySpark) and SQL (Spark SQL).
- Strong experience with distributed data processing tools (e.g., Apache Spark, Kafka).
- Experience building data workflows with orchestration tools like Apache Airflow.
- Familiarity with cloud platforms (AWS, GCP, or Azure) and modern data warehousing solutions (e.g., BigQuery, Snowflake, Redshift).
- Understanding of CI/CD pipelines, containerization (Docker), and version control (Git).
Soft Skills:
- Excellent communication and stakeholder management skills.
- Ability to work independently and proactively in a fast-paced, dynamic environment.
- Strong analytical thinking, attention to detail, and problem-solving ability.