Hồ Chí Minh
Full-time

Lead Data Engineer, Zalo

What you will do

  • Design, develop, and maintain the data processing pipeline using distributed systems (Python, Spark, Kafka, ZooKeeper, ELK, ScyllaDB, and Redis/Sentinel), and monitor and troubleshoot it to keep data flowing without interruption;
  • Optimize and tune NoSQL databases, including key-value and wide-column stores, to ensure efficient data storage and retrieval;
  • Build backend systems to automate fraud detection, prevention, and mitigation workflows;
  • Research, develop, and implement tools and flows for clone-spam detection and anomaly detection;
  • Implement data quality measures and data validation techniques to ensure the accuracy and integrity of the data;
  • Perform data analysis to identify patterns, trends, and insights for clone-spam detection and anomaly detection;
  • Stay updated with the latest trends and technologies in data processing and recommend improvements to existing systems;
  • Document data processing procedures, configurations, and changes for easy reference and knowledge sharing;
  • Collaborate with cross-functional teams (engineering, product, security) to improve overall platform protection;
  • Delegate tasks, provide performance feedback, and coach team members for growth. Track team progress and ensure individual and collective goals are met.

What you will need

  • Bachelor's or Master's degree in Computer Science, Data Science, or a related field;
  • Proven experience managing a team of at least 3 members;
  • Proven experience in designing and developing data processing pipelines using Python, Spark, Kafka, ZooKeeper, ELK, ScyllaDB, and Redis/Sentinel;
  • Strong knowledge of NoSQL databases, including key-value and wide-column stores, and experience with database optimization and tuning;
  • Solid understanding of data modeling and database design principles;
  • Strong analytical and problem-solving skills with a keen eye for detail;
  • Self-motivated, proactive, and able to work independently with minimal supervision.

Nice to have:

  • Experience with clone-spam detection and anomaly detection techniques;
  • Proficiency in Linux systems, bash scripting, and system administration tasks;
  • Knowledge of machine learning and data analytics.