Build Hadoop/Hive System
£1500-3000 GBP
Paid on delivery
We want to find a partner to help us get started in building a big data / data warehousing system in Hadoop and Hive (or a suggested alternative) to run alongside our operational system. This new big data store will provide API, reporting and data warehousing functions, the latter to drive a tool like Tableau. The data store will then be developed to receive streamed and historical batch data and to generate metrics from map-reduce calls.
We are a real-time vehicle tracking solutions provider, collecting information about vehicle positions and their adherence to scheduled journeys.
We are new to the Hadoop world and want to get up to speed ourselves during this development. We therefore want a full end-to-end development, including initial system setup, development of a data loading/streaming method and provision of a small set of data output methods (RESTful API calls, reports and an initial data warehouse structure).
We envisage the following tasks:
1. Deploy Hadoop/Hive (or a suggested alternative) on a virtual Debian server which we will provide access to. We want this done in such a way that it can easily be expanded onto new nodes, and we would want to see data distributed across more than one system/node.
2. Develop a data load process to pull information from our transactional system and load it into Hadoop/Hive. This initial data set will be a block of data per day containing a couple of metrics but with quite a few descriptive fields such as vehicle code, location_code, timestamp and customer code. If this was a traditional star schema then there would be about dimensions. This data has a time-based aspect and a geospatial aspect. We would be able to provide this in CSV format or from an API call. In
3. Develop some map-reduce functions to generate some useful metrics and aggregations, which we can agree on.
4. Make some of the metrics available through an API, using the existing Hadoop/Hive/RESTful technologies.
5. It would be nice for us to access the data store from PHP, perhaps using a Hive/ODBC driver. We are not sure if this is possible, but it would be good to try.
6. Organize the data so that an OLAP tool can be used against it. For example, use Pentaho or Tableau to generate some queries to be able to pivot and drill down. It is especially important to be able to show aggregate data for, say, a year and drill into month, day etc. It would also be good to be able to show geographical data.
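To make tasks 2, 3 and 6 more concrete, here is a rough sketch in plain Python of the kind of map-reduce roll-up we have in mind: a daily CSV block is mapped to year, year/month and year/month/day keys and then reduced to totals, counts and means, so that a yearly figure can be drilled into by month and day. The column names, the `delay_seconds` metric and the sample rows are assumptions made up for illustration; they are not our real extract format, and a real implementation would run on Hadoop/Hive rather than in a single Python process.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; the column names and the delay_seconds metric
# are assumptions, not the real extract format.
SAMPLE_CSV = """vehicle_code,location_code,timestamp,customer_code,delay_seconds
V1,LOC1,2015-01-05T08:00:00,C1,60
V1,LOC2,2015-01-05T09:00:00,C1,120
V2,LOC1,2015-02-10T08:30:00,C2,30
"""


def map_record(row):
    """Map step: emit one (key, value) pair per drill-down level."""
    ts = datetime.strptime(row["timestamp"], "%Y-%m-%dT%H:%M:%S")
    delay = int(row["delay_seconds"])
    year, month, day = ts.strftime("%Y"), ts.strftime("%m"), ts.strftime("%d")
    yield (year,), delay
    yield (year, month), delay
    yield (year, month, day), delay


def reduce_pairs(pairs):
    """Reduce step: sum and count the values for each key, then add the mean."""
    totals = defaultdict(lambda: [0, 0])  # key -> [running total, row count]
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {
        key: {"total": total, "count": count, "mean": total / count}
        for key, (total, count) in totals.items()
    }


def aggregate(csv_text):
    """Run the map and reduce steps over one block of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = (pair for row in reader for pair in map_record(row))
    return reduce_pairs(pairs)


METRICS = aggregate(SAMPLE_CSV)

if __name__ == "__main__":
    # Print the roll-up from year down to day.
    for key in sorted(METRICS):
        print("/".join(key), METRICS[key])
```

In Hive, the same roll-up would more likely be expressed as a GROUP BY over year, month and day columns; the sketch only shows the shape of the output we would expect an OLAP tool to drill into.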
We are interested in working with someone who can recommend the paths to take to make this system expandable, fast and easily accessible, and who can help us make the best choices. For example, help in deciding which database to use would be welcome.
Please bid only if you have experience of this in the past. If you are interested in bidding for this work, we would like to hear about your experience in similar projects and your views on whether this is a sensible approach.
Project ID: #8477793
About the project
37 freelancers are bidding on average £2680 for this job
Let's discuss over the Freelancer personal message box for a proper estimate of cost and time. I am a developer myself, so you will work directly with me. No mediators. No managers. No subcontractors. See my recent ...
Hello, Sri Technocrat will provide a fully interactive website for your project. As per the details, Sri Technocrat will provide three template functional schemes and sample pages so that you can choose a layout. It will ...
Hello, my previous experience with performance testing / tuning of MapReduce or Hive jobs: 01) "For a Company with innovative medical solutions". My job functions: developed MapReduce programs to parse the raw da...
Hi, I have read your requirements and I am very interested in this job. I have an expert team of only 10 people; we have been working for 5 years and have developed many websites, web applications, e-commerce websites, plugin d...
Hi, I am a senior developer with the skills that you need for the job. I can partner with you in this project. I hope we can talk better soon. Best regards, Norberto
I have done various industrial projects in Hadoop and Big Data using MapReduce, Pig, Hive, Sqoop, Oozie, Flume, Kafka, Cassandra, HBase and Spark for reputable companies in the UK, France and the USA. I have a total of 5 years of experien...
My name is Rahul and I have 4+ years of experience working for data-driven companies in the Hadoop ecosystem as architect, admin, developer and analyst in DevOps-style teams doing Hadoop cluster installations, upgrades, re...
I have experience of architecting the Business Intelligence solution for an e-governance project with possibly the largest biometric information capture globally. The solution involved creating a data warehouse for sto...
I'm a certified Pentaho engineer and use Pentaho PDI to insert large volumes of data into Cloudera and run SQL queries against Hive or Impala tables.
Efficient writing of Hive scripts for map-reduce programs; able to deliver documentation to support developing and testing software with test data; capable of doing end-to-end projects with high-quality software.