Data Analysis using AWS Redshift, Matillion, Kinesis Streams etc.

Cancelled Posted 5 years ago Paid on delivery

Two data engineering/data analytics scenario-based tasks

Task 1 - Process

What I’d like you to come up with is about five slides on the process and steps you would take in the following situation:

A retailer has agreed to do business with our company. We’ve not worked with them previously and do not know what their data is like. The work will involve product advertising on their website. What we need to do is link this advertising back to their sales data (which will share a common user ID with the advertising data) and report in two areas:

[login to view URL] reporting including sales uplift

[login to view URL] operational reports

The data structures are as follows:

Table: Impressions
ImpressionID (PK)
CampaignID
ProductID
ImpressionDatetime
PageViewID (which page it was shown on)

Table: PageViews
PageViewID
PageName
UserID

Table: Clicks
ImpressionID
ClickDatetime

Table: AllSales
SaleID
UserID
ProductID
SaleDatetime
SaleValue

Table: Products
ProductID
ProductName

Table: Campaigns
CampaignID
ProductID
CampaignName
CampaignStartDatetime
CampaignEndDatetime
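The brief asks for slides rather than code, but the link between advertising and sales can be illustrated with a small sketch. The tables above do not put UserID on Impressions, so the join has to go through PageViews. The example below uses Python's built-in sqlite3 module with made-up rows. The attribution rule (same user, same product, sale after the impression and before the campaign ends) is an assumption for illustration only, not something the brief specifies.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Campaigns (CampaignID INTEGER PRIMARY KEY, ProductID INTEGER,
            CampaignName TEXT, CampaignStartDatetime TEXT, CampaignEndDatetime TEXT);
        CREATE TABLE PageViews (PageViewID INTEGER PRIMARY KEY, PageName TEXT,
            UserID INTEGER);
        CREATE TABLE Impressions (ImpressionID INTEGER PRIMARY KEY, CampaignID INTEGER,
            ProductID INTEGER, ImpressionDatetime TEXT, PageViewID INTEGER);
        CREATE TABLE AllSales (SaleID INTEGER PRIMARY KEY, UserID INTEGER,
            ProductID INTEGER, SaleDatetime TEXT, SaleValue REAL);

        INSERT INTO Campaigns VALUES
            (1, 10, 'Spring', '2018-03-01 00:00:00', '2018-03-31 23:59:59');
        INSERT INTO PageViews VALUES (100, 'home', 1), (101, 'home', 2);
        INSERT INTO Impressions VALUES
            (1000, 1, 10, '2018-03-05 10:00:00', 100),
            (1001, 1, 10, '2018-03-06 09:00:00', 101),
            (1002, 1, 10, '2018-03-05 11:00:00', 100);
        INSERT INTO AllSales VALUES
            (1, 1, 10, '2018-03-05 12:00:00', 20.0),  -- user 1, after impression
            (2, 2, 10, '2018-03-06 08:00:00', 15.0),  -- user 2, before impression
            (3, 3, 10, '2018-03-07 08:00:00', 30.0),  -- user 3 never saw the ad
            (4, 1, 11, '2018-03-05 13:00:00', 5.0);   -- user 1, different product
        """
    )
    return conn


def attributed_sales(conn: sqlite3.Connection) -> list:
    """Return (CampaignID, sale count, sale value) under the assumed rule."""
    query = """
        SELECT CampaignID, COUNT(*) AS Sales, SUM(SaleValue) AS Value
        FROM (
            -- DISTINCT stops a sale being counted once per matching impression
            SELECT DISTINCT i.CampaignID, s.SaleID, s.SaleValue
            FROM Impressions i
            JOIN PageViews pv ON pv.PageViewID = i.PageViewID
            JOIN AllSales s   ON s.UserID = pv.UserID
                             AND s.ProductID = i.ProductID
                             AND s.SaleDatetime >= i.ImpressionDatetime
            JOIN Campaigns c  ON c.CampaignID = i.CampaignID
                             AND s.SaleDatetime <= c.CampaignEndDatetime
        )
        GROUP BY CampaignID
        ORDER BY CampaignID
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(attributed_sales(build_demo_db()))
```

Only sale 1 satisfies the assumed rule in this toy data. Whether sales outside the campaign window, or for other products, should count towards uplift is exactly the kind of question to raise with the retailer before fixing the logic.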

As guidance, I’m not looking for code. What we want to see is a high-level set of steps you would go through on receipt of such a dataset (which may include questions about it), and consideration of the objectives that we’d be trying to achieve. An Entity Relationship Diagram should be part of this, as should a target data architecture. A further consideration in this work is that we may want to do this for other retailers in a similar position, so repeatability and scalability are important.

Task 2 – Skills Test

What I’d like from you is an approach to cleansing/filtering streaming data. I’d like to see one (possibly two) approach(es) including:

Reasoning for choosing an approach

Considerations that go into making a decision (including risks and GDPR, if appropriate)

Relevant Technical Data Flow Architecture

The hypothetical scenario is as follows:

A third-party tech partner company provides us with an advertising PaaS. From their platform they will provide us with Impression and Click data via an AWS Kinesis stream. They have informed us that they’re unable to filter the data to just our instance of the Platform and will be supplying impressions and clicks from other platform users that they’ve asked us to remove from our dataset. We have a defined list of users that should be used as a whitelist for the data filtering.
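One possible shape for the filtering step is a consumer function that decodes each Kinesis record and keeps only events from whitelisted users. The sketch below assumes the consumer is an AWS Lambda function reading from the stream, so each record carries its payload base64-encoded under `record["kinesis"]["data"]`. The JSON payload format, the `user_id` field name and the whitelist values are hypothetical, since the scenario does not describe the partner's message format.

```python
import base64
import json
from typing import Any, Dict, FrozenSet, List, Optional, Tuple

# Hypothetical whitelist; in practice this would be loaded from a managed store.
ALLOWED_USER_IDS: FrozenSet[str] = frozenset({"user-1", "user-2"})


def decode_payload(record: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Decode one Kinesis record into a dict, or return None if it is malformed."""
    try:
        raw = base64.b64decode(record["kinesis"]["data"])
        payload = json.loads(raw)
    except (KeyError, TypeError, ValueError):
        return None
    return payload if isinstance(payload, dict) else None


def filter_records(
    event: Dict[str, Any],
    allowed: FrozenSet[str] = ALLOWED_USER_IDS,
) -> Tuple[List[Dict[str, Any]], int]:
    """Return (kept payloads, dropped count) for a Lambda-style Kinesis event."""
    kept: List[Dict[str, Any]] = []
    dropped = 0
    for record in event.get("Records", []):
        payload = decode_payload(record)
        # "user_id" is an assumed field name, not one given in the scenario.
        user_id = payload.get("user_id") if payload is not None else None
        if isinstance(user_id, str) and user_id in allowed:
            kept.append(payload)
        else:
            # Count only; the payload itself is neither logged nor stored.
            dropped += 1
    return kept, dropped
```

The sketch counts dropped records without logging or storing their contents, so data belonging to other platform users is not retained. That seems the safer default from a GDPR point of view, though it makes malformed records harder to debug.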

Again, this should be no more than 5 slides, but should be a bit more detailed than the previous task.

Blog Install Data Analytics Graphic Design PHP Redshift Website Design

Project ID: #17433612

About the project

7 proposals Remote project Active 5 years ago

7 freelancers are bidding on average £26 for this job
