We are looking for a web crawler data engineer to join our team and help us collect and process large amounts of data from the web. You will design, develop, and maintain web crawlers that handle challenges such as anti-bot measures, dynamic content, pagination, authentication, and rate limiting. You will also ensure the quality and reliability of the crawled data and store it in an efficient, scalable way.
What you will do
- Develop and maintain web crawlers for sources such as web forums and social media sites;
- Monitor and troubleshoot web crawling issues and optimize performance;
- Store and manage data using databases, cloud storage or other solutions;
- Collaborate with other engineers, analysts and stakeholders to understand data requirements and deliver solutions.
What you will need
- Experience with web crawling, scraping and data extraction techniques;
- Experience with databases, cloud storage or other data storage solutions;
- Excellent communication, problem-solving, and analytical skills.
Bonus points
- Experience with Scrapy, Selenium, or other web crawling and browser automation tools;
- Experience with bot detection and CAPTCHA bypass techniques.