Cancelled

Chrome Extension that Downloads Pages and Takes Screenshots

I need a Chrome extension that pulls a JSON list of URLs from a RESTful web service URL (to be defined later). It must then navigate to each URL (processing its JavaScript), then save the HTML (not necessarily equivalent to view source, but the post-JavaScript HTML) to an Amazon S3 bucket. It must then take a screenshot of the rendered page and save it to another Amazon S3 bucket.

The extension needs to keep track of which captures succeeded and which failed. After every n URLs, it needs to send these success/fail statuses to another RESTful web service as a JSON object.

When the extension has finished processing (or attempting) its batch of URLs, it needs to send its remaining success/fail statuses and request another list of URLs from the original web service.
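To illustrate the batching flow described above, here is a minimal sketch. The names `processBatch`, `captureUrl`, and `sendStatus` are hypothetical; the two callbacks stand in for the actual page capture and the status web service call, whatever form they end up taking.

```javascript
// Sketch only: captureUrl and sendStatus are hypothetical callbacks that
// stand in for the real page capture and the status web service call.
async function processBatch(urls, n, captureUrl, sendStatus) {
  let pending = [];
  for (const { url, url_id } of urls) {
    let result;
    try {
      // Assumed to resolve to { html: boolean, screenshot: boolean }.
      result = await captureUrl(url);
    } catch (err) {
      // A thrown error counts as a failure of both captures.
      result = { html: false, screenshot: false };
    }
    pending.push({ url_id, html: result.html, screenshot: result.screenshot });
    // Flush the accumulated statuses after every n URLs.
    if (pending.length >= n) {
      await sendStatus(pending);
      pending = [];
    }
  }
  // Send whatever is left once the batch is finished.
  if (pending.length > 0) {
    await sendStatus(pending);
  }
}
```

After `processBatch` resolves, the caller would request the next list of URLs and repeat.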

**Tricky Parts**

1. Getting the page source after JavaScript has rendered it, as in "inspect element" rather than "view source".

2. Getting a screenshot of the entire page rather than only the visible window.
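One possible approach to both tricky parts is sketched below, under assumptions. For the HTML, a content script can serialize the live DOM via `document.documentElement.outerHTML`. For the screenshot, `chrome.tabs.captureVisibleTab` only captures the visible area, so a common workaround is to scroll the page in viewport-sized steps, capture each step, and stitch the tiles together. The helper names `serializeRenderedHtml` and `scrollOffsets` are hypothetical, and the stitching itself is not shown.

```javascript
// Serializes the live DOM (post-JavaScript), unlike "view source".
// `doc` is a Document, e.g. `document` inside a content script.
function serializeRenderedHtml(doc) {
  const doctype = doc.doctype ? `<!DOCTYPE ${doc.doctype.name}>\n` : "";
  return doctype + doc.documentElement.outerHTML;
}

// Computes the vertical scroll offsets at which viewport-sized captures
// would together cover a page of the given height.
function scrollOffsets(pageHeight, viewportHeight) {
  if (viewportHeight <= 0) {
    throw new RangeError("viewportHeight must be positive");
  }
  const offsets = [];
  for (let y = 0; y < pageHeight; y += viewportHeight) {
    // Clamp the last tile so it ends at the page bottom; it may overlap
    // the previous tile, which the stitching step has to account for.
    offsets.push(Math.max(0, Math.min(y, pageHeight - viewportHeight)));
  }
  return offsets;
}
```

For a 2500px page and a 1000px viewport, `scrollOffsets(2500, 1000)` yields `[0, 1000, 1500]`.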

**Web Service Interface**

The GetUrls method will look similar to this:

GetUrls() → JSON(List<string url, guid url_id>, Settings<AmazonCredentials, bool doScreenshot, string screenshotFormat, string screenshotQuality>)

You can pretty much define what you need, and we'll make GetUrls return it to you. The point is, everything that is configurable will come down from the server except for the web service URL, which will be configured in a text file.
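Since the exact response shape is still open, the following is only a sketch of how the extension might parse and sanity-check a GetUrls response. The top-level keys `urls` and `settings` are assumptions; the per-entry fields `url` and `url_id` follow the signature above. `parseGetUrlsResponse` is a hypothetical helper name.

```javascript
// Parses a GetUrls response body and checks the parts the extension relies on.
// Assumed shape: { "urls": [{ "url": "...", "url_id": "..." }], "settings": { ... } }
function parseGetUrlsResponse(jsonText) {
  const data = JSON.parse(jsonText);
  if (!Array.isArray(data.urls)) {
    throw new TypeError("expected a 'urls' array");
  }
  const urls = data.urls.map((entry, i) => {
    if (typeof entry.url !== "string" || typeof entry.url_id !== "string") {
      throw new TypeError(`urls[${i}] needs string 'url' and 'url_id'`);
    }
    return { url: entry.url, url_id: entry.url_id };
  });
  // Settings are passed through untouched; a missing block becomes {}.
  return { urls, settings: data.settings || {} };
}
```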

The url_id will be globally unique and should be used as the bucket name to store the HTML and the screenshot image.

The SendStatus method will look similar to this:

SendStatus(JSONList<guid url_id, byte HTMLStatus, byte ScreenshotStatus>)
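As an illustration, the request body for SendStatus could be built roughly as follows. The byte values (0 for failure, 1 for success) are an assumption, since the status codes are not defined above, and `buildStatusPayload` is a hypothetical helper name.

```javascript
// Assumed status codes; the real byte values are not defined in the spec.
const STATUS_FAILED = 0;
const STATUS_OK = 1;

// Turns tracked results ({ url_id, html, screenshot }) into the JSON list
// that SendStatus is described as accepting.
function buildStatusPayload(results) {
  return JSON.stringify(
    results.map((r) => ({
      url_id: r.url_id,
      HTMLStatus: r.html ? STATUS_OK : STATUS_FAILED,
      ScreenshotStatus: r.screenshot ? STATUS_OK : STATUS_FAILED,
    }))
  );
}
```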

## Deliverables

The GetSettings method will look similar to this:

The web service URLs must be configurable by editing a text file, perhaps the manifest.
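One way this could look, as a sketch: a separate `config.json` packaged with the extension (rather than the manifest itself) holding the two endpoints. The file name and the keys `getUrlsEndpoint` and `sendStatusEndpoint` are assumptions, as are the helper names `parseConfig` and `loadConfig`.

```javascript
// Validates the text of a hypothetical config.json such as:
// { "getUrlsEndpoint": "https://...", "sendStatusEndpoint": "https://..." }
function parseConfig(text) {
  const raw = JSON.parse(text);
  const config = {};
  for (const key of ["getUrlsEndpoint", "sendStatusEndpoint"]) {
    if (typeof raw[key] !== "string") {
      throw new TypeError(`config is missing "${key}"`);
    }
    // new URL() throws a TypeError if the value is not a valid absolute URL.
    config[key] = new URL(raw[key]).href;
  }
  return config;
}

// Inside the extension, the file could be read from the package like this.
async function loadConfig() {
  const response = await fetch(chrome.runtime.getURL("config.json"));
  return parseConfig(await response.text());
}
```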

**Screenshot Specs**

The screenshot functionality should be able to capture the entire page (rather than the visible window).

The file format and quality must be configurable via a text file.
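For example, if the screenshots are taken with `chrome.tabs.captureVisibleTab`, its options accept a `format` of `"png"` or `"jpeg"` and, for JPEG, an integer `quality` from 0 to 100. The sketch below maps the configured strings onto those options; `toCaptureOptions` is a hypothetical helper, and treating `"jpg"` as an alias for `"jpeg"` is an assumption.

```javascript
// Maps the configured format and quality strings onto the options object
// accepted by chrome.tabs.captureVisibleTab.
function toCaptureOptions(screenshotFormat, screenshotQuality) {
  const format = String(screenshotFormat).trim().toLowerCase();
  if (format === "png") {
    // Quality is ignored for PNG, so it is left out.
    return { format: "png" };
  }
  if (format === "jpg" || format === "jpeg") {
    const quality = Number.parseInt(screenshotQuality, 10);
    if (Number.isNaN(quality)) {
      throw new RangeError(`invalid screenshot quality: ${screenshotQuality}`);
    }
    // Clamp to the 0-100 range that the API documents.
    return { format: "jpeg", quality: Math.min(100, Math.max(0, quality)) };
  }
  throw new RangeError(`unsupported screenshot format: ${screenshotFormat}`);
}
```

The result would be passed as the options argument, for example `chrome.tabs.captureVisibleTab(windowId, toCaptureOptions("jpeg", "85"), callback)`.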

**S3 Specs**

Amazon S3 credentials will come down in the GetUrls request.

For development, you'll need to factor in a small cost for testing S3 storage, probably well under $10.

**Non-Functional Requirements**

Factor a small amount of scope creep into your bid.

Code needs to be very self-documenting and well commented. Factor in the time necessary to clean up your code assuming that somebody else will be reading it. Maintainability is very important here. I will be using this code as part of a larger project, and will need to tweak it after you have written it.

Skills: Amazon Web Services, Apple Safari, C# Programming, Google Chrome, Javascript, PHP, Script Install, Shell Script, Software Architecture, Software Testing

Project ID: #3419775