Design, develop, and maintain the data processing pipeline using distributed systems (Python, Spark, Kafka, ZooKeeper, ELK, ScyllaDB, and Redis/Sentinel), and monitor and troubleshoot it to keep data flowing without interruption;
Optimize and tune NoSQL databases, including key-value and wide-column stores, to ensure efficient data storage and retrieval;
Build backend systems to automate fraud detection, prevention, and mitigation workflows;
Research, develop, and implement tools and flows for clone-spam detection and anomaly detection;
Implement data quality measures and data validation techniques to ensure the accuracy and integrity of the data;
Perform data analysis to identify patterns, trends, and insights for clone-spam detection and anomaly detection;
Stay updated with the latest trends and technologies in data processing and recommend improvements to existing systems;
Document data processing procedures, configurations, and changes for easy reference and knowledge sharing;
Collaborate with cross-functional teams (engineering, product, security) to improve overall platform protection;
Delegate tasks, provide performance feedback, and coach team members for growth. Track team progress and ensure individual and collective goals are met.
What you will need
Bachelor's or Master's degree in Computer Science, Data Science, or a related field;
Proven experience managing a team of at least 3 members;
Proven experience in designing and developing data processing pipelines using Python, Spark, Kafka, ZooKeeper, ELK, ScyllaDB, and Redis/Sentinel;
Strong knowledge of NoSQL databases, including key-value and wide-column stores, and experience with database optimization and tuning;
Solid understanding of data modeling and database design principles;
Strong analytical and problem-solving skills with a keen eye for detail;
Self-motivated, proactive, and able to work independently with minimal supervision.
Nice to have
Experience with clone-spam detection and anomaly detection techniques;
Proficiency in Linux systems, bash scripting, and system administration tasks;
Knowledge of machine learning and data analytics.