Crawler that extracts products from any page of any site!

Closed Posted 7 years ago Paid on delivery

Hi,

I need a crawler that can extract products from any site or page (the site owner is aware of it!):

1. You get the URL of the homepage, parse it with ganon, and find all the images on the homepage together with their links; each link should be the HREF of the closest enclosing A ancestor, searching backwards from the image

2. You group the found images by one of these options IN THIS ORDER (the first that is applicable):

a) Grouped by [login to view URL]

b) Grouped by [login to view URL] + [login to view URL] if they are present in the IMG tag (don't try to obtain the width and height in PHP at this stage)

c) Grouped by number of directory levels (/) in [login to view URL] && number of directory levels (/) in [login to view URL]

d) Grouped by margin of similar_text of [login to view URL] && margin of similar_text of [login to view URL]

e) Grouped by [login to view URL] + [login to view URL] after obtaining the width and height of the images via PHP

3. You start with the group that has the highest number of images, and you look backwards from each image for the price and the name of the item, after checking that the name looks like [login to view URL] (in case of doubt, open the item page and take the TITLE, but this should be rare!)

4. When you find 3 images of the same group that each have a URL, a name and a price, that's all! You can stop the headaches and search the homepage, and then the rest of the site, ONLY for the CSS classes that you have just found

5. For every product, you derive the tags from CONCAT([login to view URL], '', [login to view URL], '', NAME): a word that appears at least twice is a tag
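
The tag rule in step 5 can be sketched as follows. This is only an illustration in standard-library Python, not the PHP/ganon code the brief asks for. The two redacted fields are unknown, so the hypothetical function `extract_tags` simply accepts any number of text fields, and the sample values are invented. It joins the fields with a space rather than an empty string so that words from adjacent fields do not fuse together.

```python
import re
from collections import Counter


def extract_tags(*fields, min_count=2):
    """Return the words that occur at least min_count times across all fields."""
    # Join with a space (an assumption) so neighbouring fields stay separate words.
    text = " ".join(fields).lower()
    # Split on anything that is not a letter or digit (hyphens, dots, slashes, underscores).
    words = re.findall(r"[^\W_]+", text)
    counts = Counter(words)
    return sorted(word for word, n in counts.items() if n >= min_count)


# Invented sample values standing in for the two redacted fields and NAME.
tags = extract_tags(
    "red-leather-shoes.jpg",
    "/shop/leather-shoes",
    "Red Leather Shoes",
)
print(tags)
```

With these sample values, "red", "leather" and "shoes" each occur at least twice, while "jpg" and "shop" occur once and are dropped.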

You release the first homepage results ASAP, and IN PARALLEL you can run a script that takes other links from the sitemap or from this analysis, where you already have a ganon object that gives you all the links of the page easily and quickly
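
Steps 1 to 3 above can be sketched roughly as below. This is a minimal illustration using Python's standard `html.parser` in place of ganon, not the requested PHP implementation. It pairs every image with the link of its closest enclosing A element, groups the pairs in the manner of option (c), and picks the largest group. The field names in option (c) are redacted in the brief, so the sketch assumes, hypothetically, that the two values compared are the image URL and the link URL. The class name, helper names and sample HTML are all invented.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageLinkParser(HTMLParser):
    """Collect (image src, href of the closest enclosing <a>) pairs."""

    def __init__(self):
        super().__init__()
        self.open_links = []  # hrefs of the <a> elements that are currently open
        self.pairs = []       # (src, href or None) for every <img> with a src

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.open_links.append(attrs.get("href"))
        elif tag == "img" and attrs.get("src"):
            # The innermost open <a> is the closest enclosing link, if any.
            href = self.open_links[-1] if self.open_links else None
            self.pairs.append((attrs["src"], href))

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            self.open_links.pop()


def depth(url):
    """Number of slashes in the path part of a URL (0 for a missing URL)."""
    return urlparse(url or "").path.count("/")


def group_by_depth(pairs):
    """Group pairs by (image URL depth, link URL depth); the fields are an assumption."""
    groups = defaultdict(list)
    for src, href in pairs:
        groups[(depth(src), depth(href))].append((src, href))
    return groups


def largest_group(groups):
    """Return the group holding the most images, as step 3 asks."""
    return max(groups.values(), key=len)


# Invented sample page: one logo, three product tiles, one unlinked banner.
html = """
<a href="/"><img src="/img/logo.png"></a>
<div class="item"><a href="/shop/p/1"><img src="/media/products/1.jpg"></a></div>
<div class="item"><a href="/shop/p/2"><img src="/media/products/2.jpg"></a></div>
<div class="item"><a href="/shop/p/3"><img src="/media/products/3.jpg"></a></div>
<img src="/img/banner.png">
"""
parser = ImageLinkParser()
parser.feed(html)
best = largest_group(group_by_depth(parser.pairs))
print(best)
```

On this sample page the three product images share the same pair of depths and therefore form the largest group, while the logo and the banner each end up in a group of their own. A real crawler would still need the earlier options (a) and (b) tried first, plus the name and price lookup from steps 3 and 4.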

Good luck!

Codeigniter, Engineering, PHP, Software Architecture, Software Testing

Project ID: #12094258

About the project

11 proposals Remote project Active 7 years ago

11 freelancers are bidding on average $286 for this job

Agiletechstudio

Hi there! We are a team of qualified and professional web application developers. Our expertise is in PHP- and JavaScript-based CMSs and frameworks. We are dedicated to providing high-end UI/UX design, reactive and fast runn...

$222 USD in 9 days
(26 Reviews)
5.5

retroshell

Hello, I've read your project notes and I understand exactly what you want. I can build the software in PHP/MySQL/jQuery/CSS. The application will also have an admin panel so that you can view the grabbed conte...

$140 USD in 3 days
(22 Reviews)
4.9

nileshbakotiya

Hi, thanks for the opportunity. As per your requirement, I would like to tell you that I have very strong experience of more than 7 years in this field of design and development. Please spare a moment to discuss t...

$133 USD in 4 days
(49 Reviews)
5.3

vikalplearning

Hi, we are good at web mining and crawling data over HTTP using Java and other languages. Yes, we can classify the mined data, store the results and produce a report. Let's chat more, please.

$200 USD in 3 days
(4 Reviews)
1.7