I am currently working for Effective Measure, who collects a huge volume of data about people surfing activities from Internet every day and provides its clients "a single version of truth" after analyzing them. Its data processing system is actually a hadoop cluster with a hive server. I gain a lot of experience about solving the problems in hadoop environment. What's more, I also have quite some experience of guiding my junior colleagues to set up a hadoop cluster, which might be very helpful in this project.