
Parquet is more space efficient than JSON/CSV

$2-8 USD / hour

In Progress
Posted over 1 year ago

It is generally said that the Parquet format is more storage-efficient than JSON and CSV. The first link below says: "Apache Parquet is a columnar file format that provides optimizations to speed up queries and is a far more efficient file format than CSV or JSON".

[login to view URL]
[login to view URL]

Now, let us try to demonstrate this.

1. Download this CSV file (50,000 rows): [login to view URL]
2. Load the file as a DataFrame in Spark, then save the DataFrame again in JSON and Parquet formats.
3. Check the resulting file sizes. Do you see differences? Report them here.
4. Queries on Parquet are supposed to run faster than on CSV. Show one query result that demonstrates this (for example, counting the number of unique values in a certain column).
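The task described above could be approached roughly as follows. This is a minimal PySpark sketch, not a definitive solution: the function names `dir_size_bytes` and `compare_formats`, the output directory layout, and the choice of a distinct-count query are all assumptions made for illustration. The CSV is assumed to have a header row. Spark writes each format as a directory of part files, so sizes are summed over that directory.

```python
import os
import time


def dir_size_bytes(path):
    """Sum the sizes of the data files under a Spark output directory."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            # Skip Spark bookkeeping files such as _SUCCESS and .crc checksums.
            if name.startswith("_") or name.startswith("."):
                continue
            total += os.path.getsize(os.path.join(root, name))
    return total


def compare_formats(csv_path, out_dir, column):
    """Rewrite one CSV as CSV, JSON and Parquet; report size and query time.

    Hypothetical helper: csv_path, out_dir and column are supplied by the caller.
    """
    # Imported here so dir_size_bytes stays usable without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("parquet-vs-csv-json").getOrCreate()
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Note: some Spark versions reject column names containing spaces or
    # special characters when writing Parquet; rename such columns first.
    paths = {fmt: os.path.join(out_dir, fmt) for fmt in ("csv", "json", "parquet")}
    df.write.mode("overwrite").csv(paths["csv"], header=True)
    df.write.mode("overwrite").json(paths["json"])
    df.write.mode("overwrite").parquet(paths["parquet"])

    readers = {
        "csv": lambda p: spark.read.csv(p, header=True, inferSchema=True),
        "json": spark.read.json,
        "parquet": spark.read.parquet,
    }

    results = {}
    for fmt, path in paths.items():
        start = time.perf_counter()
        # The same query on every format: number of unique values in one column.
        distinct = readers[fmt](path).select(column).distinct().count()
        elapsed = time.perf_counter() - start
        results[fmt] = (dir_size_bytes(path), distinct, elapsed)

    spark.stop()
    return results
```

Calling `compare_formats("data.csv", "out", "some_column")` (placeholder names) returns a dictionary mapping each format to a `(bytes, distinct_count, seconds)` tuple. A single timing on a 50,000-row file is noisy and includes Spark start-up effects, so repeating the query several times and comparing the later runs gives a fairer picture.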
Project ID: 35261383

About the project

1 proposal
Remote project
Active 1 yr ago

Awarded to:
✔✔✔✔ Nice to see your posting ✔✔✔✔ Hi, sir. I read your job posting and I am interested in Parquet. So what do I have to do? Please tell me via chat. I hope to work with you. Best regards. Thanks!
$8 USD in 40 days
0.0 (0 reviews)

About the client

Kansas, United States
5.0
1
Payment method verified
Member since Oct 26, 2022
