Crawler Development

In Progress Posted Oct 28, 2009 Paid on delivery

We need a crawler library for Java which performs the following:

-It should visit 7 different download websites that we will provide you.

-From those download websites it should grab the list of new software added today (from each site's "What's new" list).

-Based on this list, it should visit the details page of each program and collect the details in a Java class (program name, description, link to screenshot, size, etc.). We have already created an interface for the details and will send it to you at project start or after bidding.

-For 2 download sites we need an extended version that crawls all programs and returns them to us.

-Some notes on the sites:

-We will provide them to you after bidding

-Some of them have RSS and maybe you can use it

-4 of them are in German, but we can help if you need some parts translated.

-If a page is slow or not available, the crawler needs to time out instead of waiting indefinitely.
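To illustrate the timeout requirement, here is a minimal sketch in pure Java using the standard `java.net.HttpURLConnection`. The class name `PageFetcher`, the method names, and the timeout values are assumptions made for illustration only. They are not part of the interface we will send, and the sketch assumes pages are UTF-8, which may not hold for every site.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: downloads one page and gives up if the site is slow. */
class PageFetcher {

    /**
     * Fetches the page body as text.
     *
     * @param pageUrl          absolute http(s) URL of the page
     * @param connectTimeoutMs maximum time to wait for the connection to open
     * @param readTimeoutMs    maximum time to wait for data once connected
     * @throws IOException if the page is unavailable or a timeout expires
     *                     (timeouts surface as java.net.SocketTimeoutException)
     */
    static String fetch(String pageUrl, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: UTF-8. A real crawler would honor the declared charset.
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    /** Same as fetch, but returns null so one slow site does not stop the whole run. */
    static String fetchOrNull(String pageUrl, int connectTimeoutMs, int readTimeoutMs) {
        try {
            return fetch(pageUrl, connectTimeoutMs, readTimeoutMs);
        } catch (IOException e) {
            return null;
        }
    }
}
```

A caller could use `PageFetcher.fetchOrNull(url, 5000, 10000)` and skip the site when the result is `null`. Whether to skip, retry, or report the failure is a design choice left to the bidder.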

Regarding your solution:

-We need pure Java

-We need clean code + documentation
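As a rough idea of the level of documentation we expect, the sketch below shows a Javadoc-commented details class and crawler interface. All names here (`ProgramDetails`, `DownloadSiteCrawler`, `crawlNewToday`, `crawlAll`) and the chosen fields are hypothetical placeholders. The actual interface for the details is the one we will send, and it may differ.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical value object holding the details collected for one program. */
final class ProgramDetails {
    private final String name;
    private final String description;
    private final String screenshotUrl;
    private final long sizeBytes;

    ProgramDetails(String name, String description, String screenshotUrl, long sizeBytes) {
        this.name = name;
        this.description = description;
        this.screenshotUrl = screenshotUrl;
        this.sizeBytes = sizeBytes;
    }

    /** @return the program name as shown on the download site */
    String getName() { return name; }

    /** @return the program description text */
    String getDescription() { return description; }

    /** @return link to a screenshot, or null if the site shows none */
    String getScreenshotUrl() { return screenshotUrl; }

    /** @return download size in bytes (assumed unit for this sketch) */
    long getSizeBytes() { return sizeBytes; }
}

/** Hypothetical contract implemented once per download site. */
interface DownloadSiteCrawler {

    /**
     * Collects the programs listed as new today on the site.
     *
     * @throws IOException if the site cannot be reached or times out
     */
    List<ProgramDetails> crawlNewToday() throws IOException;

    /**
     * Crawls every program on the site. Only the extended crawlers
     * need to override this; the default signals that it is not supported.
     *
     * @throws IOException if the site cannot be reached or times out
     */
    default List<ProgramDetails> crawlAll() throws IOException {
        throw new UnsupportedOperationException("full crawl not supported for this site");
    }
}
```

Giving `crawlAll` a default implementation keeps the basic crawlers small while letting the two extended ones opt in. Splitting it into a separate interface would work equally well.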

J2EE Java

Project ID: #538504

About the project

27 proposals Remote project Active Oct 30, 2009