Lakehouse and the evolution of Data Lake

Simplifying data infrastructure and accelerating innovation.

The history of data storage goes back to the 1950s, when punch cards were used to store data generated by computers. A lot has changed since then, and this article covers one of the latest trends in the industry: the Lakehouse.

Before we jump into what a Lakehouse is and how you can benefit from it, let's take a quick look at two data management paradigms that are widely used today.

Data Warehouse

The Data Warehouse architecture was developed in the 1980s to support companies in their decision-making. Its central concept is to process historical data and store it in two forms: aggregated and granular.

Aggregated data contains high-level information, summarized by group and exposing measures such as totals, averages, or counts. Granular data contains information at the lowest level of detail that is relevant to the business analysis.

This data is then consumed by BI tools, where executives and other staff can visualize and analyze it as reports and charts.
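To make the distinction between granular and aggregated data concrete, here is a minimal sketch in Python. The sales records, field names, and the `aggregate_by_region` helper are all hypothetical and exist only for illustration; a real warehouse would do this with SQL or an ETL tool.

```python
from collections import defaultdict

# Hypothetical granular data: one row per individual sale,
# the lowest level of detail relevant to the analysis.
granular_sales = [
    {"order_id": 1, "region": "North", "amount": 120.0},
    {"order_id": 2, "region": "North", "amount": 80.0},
    {"order_id": 3, "region": "South", "amount": 50.0},
]


def aggregate_by_region(rows):
    """Summarize granular rows into one record per region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["amount"]
        counts[row["region"]] += 1
    return {
        region: {
            "total": totals[region],
            "orders": counts[region],
            "average": totals[region] / counts[region],
        }
        for region in totals
    }


# Aggregated data: the high-level view a report or chart would display.
aggregated_sales = aggregate_by_region(granular_sales)
print(aggregated_sales)
```

Keeping both forms lets a report start from the summary and still drill down to the individual rows behind it.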

Data Lake

With the advent of big data, traditional architectures like the data warehouse had to be rethought. Data now came from many different sources, in different formats, and usually in much larger volumes, so a new paradigm was needed to fill the gap. In a data lake, data is stored in its raw format and queried only when a business question arises; the relevant data is then retrieved and analyzed to help answer that question.

The data is typically kept in cloud storage such as Amazon S3, which has become one of the largest and most cost-effective storage systems in the world because it can store practically limitless amounts of data in its native format at a low cost.
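The idea of storing raw data and interpreting it only at query time can be sketched as follows. This is an illustration under stated assumptions, not how any particular data lake is implemented: the `raw_store` dictionary is a hypothetical in-memory stand-in for an object store such as Amazon S3, and the file names, fields, and helper functions are invented for the example.

```python
import csv
import io
import json

# Hypothetical stand-in for an object store: keys map to raw file
# contents, kept exactly as they arrived from each source.
raw_store = {
    "events/clicks.jsonl": (
        '{"user": "ana", "page": "/home"}\n'
        '{"user": "bo", "page": "/pricing"}\n'
    ),
    "events/signups.csv": "user,plan\nana,free\ncy,pro\n",
}


def read_records(key):
    """Parse a raw object at query time, based on its file extension."""
    body = raw_store[key]
    if key.endswith(".jsonl"):
        return [json.loads(line) for line in body.splitlines() if line.strip()]
    if key.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(body)))
    raise ValueError(f"unsupported format: {key}")


def users_who_clicked_and_signed_up():
    """Answer one business question by reading only the relevant raw data."""
    clicked = {record["user"] for record in read_records("events/clicks.jsonl")}
    signed_up = {record["user"] for record in read_records("events/signups.csv")}
    return sorted(clicked & signed_up)


print(users_who_clicked_and_signed_up())
```

Nothing is transformed when the files are written; structure is applied only inside `read_records`, once a question needs the data.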
