Python script to combine CSVs

In Progress Posted Sep 26, 2012 Paid on delivery

This project is to write a Python script and turn it into a .exe that can run on Windows 7 or later, and a .app that can run on Mac OS X 10.4 or later. The Python script, the .exe, and the .app are all deliverables.

OVERVIEW

We are a renewable energy company that works in buildings. In every building we work in, we get dozens of .CSV, .XLS, and .XLSX files that need to be combined into a single CSV. Each file has different header information at the top, spread over multiple rows (not necessarily just 2; it could be 20 rows of header info), followed by multiple columns of data beneath the header information. Some files may have only two columns of data, whereas other files may have 20 or more.
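To illustrate the file layout described above, here is a minimal sketch of one way to separate the header rows from the data rows. It assumes the date/time stamp sits in the first column and matches one of a few guessed formats; neither assumption is confirmed by the example files. The names `ASSUMED_FORMATS`, `parse_stamp`, and `split_header`, and the sample values, are made up for illustration.

```python
import csv
import io
from datetime import datetime

# Guessed timestamp layouts (assumption); the real files may use others.
ASSUMED_FORMATS = ("%m/%d/%Y %H:%M", "%m/%d/%Y %H:%M:%S", "%Y-%m-%d %H:%M:%S")


def parse_stamp(text):
    """Return a datetime if text matches an assumed format, else None."""
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def split_header(rows):
    """Split rows into (header_rows, data_rows) at the first row whose
    first cell parses as a timestamp (assumes stamps are in column 0)."""
    for i, row in enumerate(rows):
        if row and parse_stamp(row[0]) is not None:
            return rows[:i], rows[i:]
    return rows, []  # no timestamp found: treat every row as header


# Invented sample content, standing in for one input file.
sample = io.StringIO(
    "Logger,example\n"
    "Location,example room\n"
    "Date Time,Temp (F)\n"
    "07/24/2012 15:00,71.2\n"
    "07/24/2012 15:05,71.4\n"
)
header, data = split_header(list(csv.reader(sample)))
print(len(header), "header rows,", len(data), "data rows")
```

Because the split is found by scanning rather than by a fixed row count, the same function would cope with 2 header rows or 20, as long as no header row starts with something that looks like a date.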

Each file also has a column (or two) of date/time stamps, showing the date and time each measurement in a row was taken. The time stamps from different files do not necessarily start at the same time – e.g. one may start at 3pm on July 24th, whereas another may start at 7pm on July 27th. Furthermore, the time stamps do not all increment by the same step size – e.g. one may increase by five minutes per row, one may increase by 15 minutes per row, and a third may increase by an irregular amount per row.
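As a rough sketch of how the differing step sizes could be measured, the function below looks at the gap between consecutive time stamps in each file and keeps the smallest positive one. The function name `smallest_step` and the sample time stamps are invented for illustration; they only mirror the 5-minute, 15-minute, and irregular cases mentioned above.

```python
from datetime import datetime, timedelta


def smallest_step(stamp_columns):
    """Return the smallest positive gap between consecutive timestamps
    in any column, or None if no column has two distinct timestamps."""
    best = None
    for stamps in stamp_columns:
        ordered = sorted(stamps)
        for earlier, later in zip(ordered, ordered[1:]):
            gap = later - earlier
            # Ignore duplicate stamps (zero gap).
            if gap > timedelta(0) and (best is None or gap < best):
                best = gap
    return best


# Made-up time stamps: one 5-minute file, one 15-minute file, one irregular.
five_min = [datetime(2012, 7, 24, 15, 0) + timedelta(minutes=5 * i) for i in range(4)]
fifteen_min = [datetime(2012, 7, 27, 19, 0) + timedelta(minutes=15 * i) for i in range(4)]
irregular = [
    datetime(2012, 7, 25, 8, 0),
    datetime(2012, 7, 25, 8, 7),
    datetime(2012, 7, 25, 8, 40),
]

step = smallest_step([five_min, fifteen_min, irregular])
print(step)  # 0:05:00
```

With irregular files, the smallest gap found this way will not necessarily divide evenly into every other time stamp, so some readings may fall between rows of a regular grid. How those should be handled is not settled by this description.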

Here is what the program must do:

• Read every CSV, XLS, and XLSX file in whatever folder the .app or .exe is placed in. This way we can copy the app to a new folder full of CSVs to combine.

• Combine all the CSV, XLS, and XLSX files into one CSV, with header information from each individual file and the name of that file preserved above the columns from that file.

• Make a master time stamp column on the left side of the document:

o Find the smallest increment of any date/time stamp column – e.g. if there are time stamps incrementing at 1 minute, 5 minutes, and random minutes, make the master column of date/time stamps on the left increment at 1 minute per row.

o The master date/time stamp should start with the date/time of the earliest data point from any file, and end at the date/time of the latest data point from any column.

• Line up and space out all data columns so they correspond correctly to the master time stamp on the left.

• Delete redundant date/time stamp columns so there is only the master date/time stamp column on the left and no other date/time columns

• Make sure all header information is preserved over the proper columns

o Add in the name of the original file above the columns!

• Write everything to a new CSV called “[url removed, login to view]”
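The master time stamp and alignment steps in the list above could be sketched roughly as follows. This is a simplified illustration, not a finished implementation: it assumes each file has already been reduced to one header label and one data column keyed by parsed time stamp, it matches readings to master rows only on exact time stamps (off-grid readings would need rounding or some other rule), and it writes to an in-memory buffer rather than a named file. The names `master_axis`, `align`, `combine`, `a.csv`, and `b.csv`, and all values, are hypothetical.

```python
import csv
import io
from datetime import datetime, timedelta


def master_axis(start, end, step):
    """List every timestamp from start to end inclusive, one step apart."""
    axis = []
    current = start
    while current <= end:
        axis.append(current)
        current += step
    return axis


def align(axis, series):
    """Put each value of series ({timestamp: value}) on the master row
    with the same timestamp; rows with no reading stay blank."""
    return [series.get(stamp, "") for stamp in axis]


def combine(files, step):
    """files maps a file name to (column_label, {timestamp: value})."""
    all_stamps = [s for _, series in files.values() for s in series]
    axis = master_axis(min(all_stamps), max(all_stamps), step)
    names = list(files)
    rows = [
        [""] + names,                                  # file name above each column
        ["Date Time"] + [files[n][0] for n in names],  # preserved header label
    ]
    columns = [align(axis, files[n][1]) for n in names]
    for i, stamp in enumerate(axis):
        rows.append([stamp.strftime("%m/%d/%Y %H:%M")] + [col[i] for col in columns])
    return rows


# Invented data: two files starting one minute apart, each logging every 2 minutes.
t0 = datetime(2012, 7, 24, 15, 0)
files = {
    "a.csv": ("Temp (F)", {t0: "71.2", t0 + timedelta(minutes=2): "71.4"}),
    "b.csv": ("Flow (gpm)", {t0 + timedelta(minutes=1): "3.0", t0 + timedelta(minutes=3): "3.1"}),
}
rows = combine(files, timedelta(minutes=1))

out = io.StringIO()  # stand-in for the real output file
csv.writer(out).writerows(rows)
print(out.getvalue())
```

Each source file's own date/time column is consumed when building the `{timestamp: value}` mapping, so only the single master column on the left appears in the output. A fuller version would carry all header rows and all data columns per file instead of one of each.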

I’ve uploaded examples of data files of the type that will need to be combined, as well as an example of the final output file needed so you can see what I’m talking about. The examples are "14836...", "18102", and "condensate pump". The example output file is "combined_data". (Please note, in “[url removed, login to view]", columns AD through the end are NOT empty. There is data in row 6615, for example.)

Thank you for your help, and I look forward to working with you!

Best,

Brenden

Data Processing, Excel, Python, Software Architecture

Project ID: #2518755

About the project

13 proposals Remote project Active Oct 2, 2012

Awarded to:

charlesmingus

I'm a Python programmer, working mostly as a freelancer. I can have the project done in 4 days. Contact me for more information.

$30 USD in 4 days
(1 Review)
0.0

13 freelancers are bidding on average $245 for this job

samitXI

Please check your inbox. Thanks

$170 USD in 3 days
(70 Reviews)
6.3
dobreiiita

Hi, I have experience in managing CSV, XLS, and XLSX files programmatically in Java, which is cross-platform, so it will work on Mac, Windows, and Linux. Please let me know if you are interested in a Java application with u…

$150 USD in 3 days
(59 Reviews)
5.8
eliezedeck

Hi, please check your PMB. Thanks

$250 USD in 3 days
(6 Reviews)
5.4
kailashbuki

Hi, I am well versed in Python, mainly in text processing tasks. I also have prior experience working on such projects. I can guarantee you a good-quality product within the provided time frame. :)

$200 USD in 4 days
(8 Reviews)
4.6
profyguy

Can do it. I have great experience in combining CSVs; I worked for a long time with a realty company. Alex.

$250 USD in 7 days
(9 Reviews)
4.0
mbewley

Hi Brenden, I've built several apps in Python (and turned them into exe files and Windows installers) that read and write CSV files and do a bunch of tabular data processing in between. I've been working with scient…

$320 USD in 7 days
(1 Review)
1.1
magicweb

Python Expert.

$500 USD in 15 days
(0 Reviews)
0.0
EnsoftDev

We can really help you with this task. We have experience in industrial data manipulation (we have worked on this kind of project for a big alarm company), a focus on code quality, and expertise in Python programs.

$450 USD in 7 days
(0 Reviews)
0.0
ProffesorMadhead

This project can be done easily. I have a Mac expert at my disposal, as well as a graphic designer if you want the end user to be more than your in-house team. I'm currently unemployed and available 24/7 on Skype. So…

$200 USD in 3 days
(0 Reviews)
0.0
thkatsou

Extensive experience in text processing using Python and other languages.

$250 USD in 3 days
(0 Reviews)
0.0