Scrapy Project #2
£10-20 GBP
Paid on delivery
Only developers with Scrapy framework experience should apply.
I need an existing Scrapy project to be extended with additional functionality:
1. Develop a pipeline for verifying scraped email addresses. Verification must run the following checks in sequence: regular-expression format check, WHOIS domain query, DNS MX and A record queries, and finally an SMTP session with the mail server (sending the MAIL FROM and RCPT TO commands and reading the mail server's responses).
2. Develop one additional web spider for http://www.eia.co.uk/buyers-guide. The spider will need to follow secondary pages to get member details.
3. Develop a data ingestion spider that reads email lists from a MySQL database. Emails are to be passed through the email validation pipeline, then checked for presence in the public domain by searching for them on Google. If an email is found online, the URL of the site where it appears must be saved.
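To make the verification sequence in item 1 concrete, here is a minimal sketch of how the staged checks could be chained so that the cheap checks run first and the first failure stops the sequence. It uses only the Python standard library. The function names, the deliberately loose regular expression, and the probe sender address are all assumptions for illustration, not part of the existing project. The WHOIS and MX stages are left out because they would likely need a third-party library (dnspython is one option for MX lookups).

```python
import re
import smtplib
import socket
from typing import Callable, List, Optional, Tuple

# Deliberately loose pattern (an assumption): one "@", no whitespace,
# and at least one dot in the domain part.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

Check = Callable[[str], bool]


def check_format(email: str) -> bool:
    """Stage 1: regular-expression format check."""
    return EMAIL_RE.match(email) is not None


def check_a_record(email: str) -> bool:
    """DNS stage (A record only): does the domain resolve to an address?"""
    domain = email.rpartition("@")[2]
    try:
        socket.getaddrinfo(domain, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def check_smtp(email: str, mail_host: str,
               sender: str = "probe@example.com") -> bool:
    """SMTP stage: send MAIL FROM and RCPT TO, then read the reply code.

    mail_host would normally come from the MX lookup (not shown here).
    """
    try:
        with smtplib.SMTP(mail_host, 25, timeout=10) as smtp:
            smtp.helo()
            smtp.mail(sender)            # MAIL FROM:<sender>
            code, _ = smtp.rcpt(email)   # RCPT TO:<email>
            return code in (250, 251)
    except (smtplib.SMTPException, OSError):
        return False


def verify(email: str,
           checks: List[Tuple[str, Check]]) -> Tuple[bool, Optional[str]]:
    """Run the checks in order; return (ok, name_of_first_failed_stage)."""
    for name, check in checks:
        if not check(email):
            return False, name
    return True, None
```

Inside a Scrapy item pipeline, `verify` could be called from `process_item` with a list such as `[("format", check_format), ("dns_a", check_a_record)]`, dropping or flagging the item when the first element of the result is `False`. Note that many mail servers accept every recipient or block such probes, so the SMTP stage is a hint rather than proof that a mailbox exists.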
Code must be written in Python using Scrapy (with XPath expressions where applicable). No platforms other than Scrapy are allowed for this project. Existing code and DB schemas will be provided to the successful bidder. You need to use your own server to develop the code.
For an experienced Scrapy developer this will be a 4-5 hour project, so please quote reasonably.
More work will be available after you successfully deliver this project.
Project ID: #19717181
About the project
2 freelancers are bidding on average £58 for this job
Hi, I am Usman. I have 6 years of experience in web and mobile app development and design. I have reviewed the description and understand it very well. I have understood your requirement with the ...