
8 simple scrapers needed

$30-250 USD

Completed
Posted over 14 years ago

Paid on delivery
MUST follow the coding instructions laid out below (no deviations or substitutions). I have attached sample data and details for the 8 sites to scrape. The scraper definition is also attached so you can see the proper formatting for JSON.

Note: I will have many more of these for developers who do a good job in a timely and cost-effective manner.

Thanks,
Scott

Scraping Specs

- Written in Ruby, NO TABS (2 spaces instead).
- Run from the command line taking two arguments: the first should be an integer for the scrape ID, the second should be the URL for the VENUE where the scrape starts:

    ./[login to view URL] <ID:integer> <URL:string>
    ./[login to view URL] 111 [login to view URL]

- Must use Curl for GET-ing URLs.
    GEM: curb
- Must only use standard Ruby regex for parsing, OR hpricot OR nokogiri as an alternative.
    GEM: hpricot
    GEM: nokogiri
- Must output JSON as a finished product; sample data is included below.
    GEM: json
- Must *NOT* use any GEMS other than these four: curb, hpricot, nokogiri, json.
- The script should return only 1 of 2 things, formatted in JSON: either an ERROR, or the actual data if everything works.
- If there is any kind of error, it needs to output JSON as defined below with a specific error code and message, or at least the standard error code and message:

    {"scrape": {
      "id": <SCRAPE_ID_FROM_INITIAL_ARGUMENT_1>,
      "url": "<URL_FROM_INITIAL_ARGUMENT_2>",
      "success": <BOOLEAN: true/false>,
      "error": {
        "code": <VALID_ERROR_CODE>,
        "description": "<TEXT_WITH_WHATEVER_ERROR_MESSAGE_YOU_WANT>"
      }
    } }

  VALID ERROR CODES ARE:
    10: (Generic error of any kind)
    20: (URL GET error - any error involving GET-ing a URL)
    30: (PARSE error - any error involving parsing the data)

  SAMPLE ERROR RETURN:

    {"scrape": {
      "id": 111,
      "url": "http://foo.com/calendar",
      "success": false,
      "error": {
        "code": 10,
        "description": "Problem doing something in the foo function."
      }
    } }

- If it succeeds, it needs to output JSON as defined below with at least the REQUIRED data in the proper format:

    {"scrape": {
      "id": <SCRAPE_ID_FROM_INITIAL_ARGUMENT_1>,
      "url": "<URL_FROM_INITIAL_ARGUMENT_2>",
      "success": <BOOLEAN: true/false>,
      "events": [
        {
          "title": "<STRING: Name of the event REQUIRED>",
          "start_date": "<DATE: date of the event, or date the event starts (MM/DD/YYYY) REQUIRED>",
          "start_time": "<DATETIME: date/time the event starts in *24 HOUR LOCAL TIME* (MM/DD/YYYY HH:MM) OPTIONAL>",
          "end_date": "<DATE: date the event ends (MM/DD/YYYY) OPTIONAL>",
          "end_time": "<DATETIME: date/time the event ends in *24 HOUR LOCAL TIME* (MM/DD/YYYY HH:MM) OPTIONAL>",
          "repeating": <INTEGER: 0 if the event happens once, 1 if the event repeats weekly REQUIRED>,
          "repeats_on": "<STRING: *full* name of the day of week the event repeats on (Thursday, Friday, etc.) OPTIONAL>",
          "repeats_until": "<DATE: date the event repeats until (MM/DD/YYYY) OPTIONAL>",
          "image_url": "<STRING: url for an image associated with this event OPTIONAL>",
          "ticket_url": "<STRING: url to buy tickets for this event OPTIONAL>",
          "ticket_prices": "<STRING: descriptive text about the ticket price OPTIONAL>",
          "description": "<STRING: any freeform descriptive text about the event OPTIONAL>",
          "bands": [
            { "name": "<STRING: band name>" },
            { "name": "<STRING: band name>" }
          ]
        }
      ]
    } }

  SAMPLE DATA:

    {"scrape": {
      "id": 111,
      "url": "http://foo.com/calendar",
      "success": true,
      "events": [
        {
          "title": "2$ off Lone Star!",
          "start_date": "01/01/2010",
          "repeating": 1,
          "repeats_on": "Tuesday",
          "repeats_until": "01/01/2011",
          "image_url": "http://pictures.com/of/lone_star.jpg"
        },
        {
          "title": "Rock Your Mom's House",
          "start_date": "01/10/2010",
          "start_time": "01/10/2010 19:00",
          "end_time": "01/10/2010 22:00",
          "repeating": 0,
          "image_url": "http://yourmoms.com/house.gif",
          "ticket_url": "http://buytix.to/yourmoms",
          "ticket_prices": "$8.00 all ages",
          "description": "These people really know how to stick it to you.",
          "bands": [
            { "name": "Buttcheeck Falcons" },
            { "name": "Foo Fighters" }
          ]
        }
      ]
    } }

NOTES:
- All TIMES / DATETIMES should be in the LOCAL TIME of whatever VENUE is being scraped. Usually this will just be the time that you're scraping, but BE SURE.
- ALWAYS return a valid error code if anything goes wrong, even if it's just the generic error message.
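To make the output rules above concrete, here is a minimal Ruby sketch of how one scraper might wrap its work so that it always prints either the error JSON or the events JSON. It is only an illustration under stated assumptions, not the client's reference implementation: the names `FetchError`, `ParseError`, `fetch`, `parse_events`, and `run` are hypothetical, and the HTML pattern in `parse_events` is invented, since each of the 8 sites would need its own parsing.

```ruby
require 'json'

# Error codes listed in the spec.
ERROR_GENERIC = 10
ERROR_GET     = 20
ERROR_PARSE   = 30

# Hypothetical exception classes (not in the spec) used to pick an error code.
class FetchError < StandardError; end
class ParseError < StandardError; end

# Builds the error document described in the spec.
def error_json(id, url, code, description)
  JSON.generate(
    'scrape' => {
      'id' => id, 'url' => url, 'success' => false,
      'error' => { 'code' => code, 'description' => description }
    }
  )
end

# Builds the success document described in the spec.
def success_json(id, url, events)
  JSON.generate(
    'scrape' => { 'id' => id, 'url' => url, 'success' => true, 'events' => events }
  )
end

# GETs a URL with the curb gem, as the spec requires.
# curb is loaded lazily so the formatting helpers work without it.
def fetch(url)
  require 'curb'
  Curl::Easy.perform(url).body_str
rescue LoadError, StandardError => e
  raise FetchError, e.message
end

# Placeholder parser using plain Ruby regex. The markup matched here is
# made up for illustration; a real venue page will look different.
def parse_events(html)
  pattern = %r{<h3 class="event">(.*?)</h3>\s*<span class="date">(\d{2}/\d{2}/\d{4})</span>}m
  events = html.scan(pattern).map do |title, date|
    { 'title' => title.strip, 'start_date' => date, 'repeating' => 0 }
  end
  raise ParseError, 'No events found in page.' if events.empty?
  events
end

# Runs one scrape and returns a JSON string, never raising on ordinary errors.
# The fetcher argument is an assumption added so the GET step can be swapped out.
def run(argv, fetcher = method(:fetch))
  id  = argv[0].to_s =~ /\A\d+\z/ ? argv[0].to_i : nil
  url = argv[1]
  if id.nil? || url.to_s.empty?
    return error_json(id, url, ERROR_GENERIC, 'Usage: <ID:integer> <URL:string>')
  end

  html = fetcher.call(url)
  success_json(id, url, parse_events(html))
rescue FetchError => e
  error_json(id, url, ERROR_GET, e.message)
rescue ParseError => e
  error_json(id, url, ERROR_PARSE, e.message)
rescue StandardError => e
  error_json(id, url, ERROR_GENERIC, e.message)
end
```

A command-line script built on this sketch would end with `puts run(ARGV)`. Routing every failure through one `rescue` chain is one way to satisfy the note that a valid error code must always be returned.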
Project ID: 580399

About the project

7 proposals
Remote project
Active 14 yrs ago

Awarded to:
I can do this with no problem.
$200 USD in 5 days
5.0 (1 review)
2.6
7 freelancers are bidding on average $276 USD for this job
Can implement
$500 USD in 8 days
5.0 (34 reviews)
6.7
I have handled many scraping projects successfully and can complete this project to your satisfaction.
$234 USD in 6 days
4.9 (30 reviews)
6.2
Hi, Check PM. Thanks, Sumeet.
$200 USD in 5 days
5.0 (1 review)
2.6
Can be done. See my PMB!
$250 USD in 5 days
5.0 (1 review)
1.9
Dear Sir, We can do it perfectly for you. Thanks!
$250 USD in 4 days
0.0 (0 reviews)
0.0

About the client

Austin, United States
5.0
4
Payment method verified
Member since Jan 5, 2008
