For the past several years, I have been using all kinds of data formats in Big Data projects. During this time I have strongly favored one format over the others, and my failures have taught me a few lessons. During my lectures I keep stressing the importance of using the correct **Data Format** for the correct purpose: it makes a world of difference.
All this time I have wondered whether I am delivering the right knowledge to my customers and students. Can I support my claims using data? Therefore, I decided to run this performance comparison.
Before I start the comparison, let me briefly describe the various Data Formats that were considered.
Data Lake: Comparing Performance of Known Big Data Formats. Performance Comparison of Well-Known Big Data Formats: CSV, JSON, AVRO, PARQUET & ORC