Power BI reading Parquet from a Data Lake


Data lakes are becoming more common every day, and the need for tools to query them grows with them.

While writing about querying a data lake using Synapse, I stumbled upon a Power BI feature I didn’t know was there.

When reading from a data lake, each folder is like a table. We store many files with the same structure in the folder, each file containing a piece of the data.
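To make the "folder as a table" idea concrete, here is a minimal Python sketch. It is not Power BI code and it does not read Parquet: the standard library has no Parquet reader, so CSV files stand in for the part files. The folder name `trips`, the file names, and the columns `trip_id` and `distance` are all made up for the illustration.

```python
import csv
import tempfile
from pathlib import Path


def read_folder_as_table(folder: Path) -> list[dict[str, str]]:
    """Read every CSV file in `folder` and return all rows as one table."""
    rows: list[dict[str, str]] = []
    # Sort the file names so the row order is deterministic.
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


def write_part(folder: Path, name: str, records: list[tuple[str, str]]) -> None:
    """Write one part file using the shared (trip_id, distance) structure."""
    with (folder / name).open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["trip_id", "distance"])
        writer.writerows(records)


def demo() -> list[dict[str, str]]:
    """Create a hypothetical 'trips' folder with two part files and read it."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "trips"
        folder.mkdir()
        write_part(folder, "part-0001.csv", [("1", "2.5"), ("2", "7.1")])
        write_part(folder, "part-0002.csv", [("3", "0.9")])
        return read_folder_as_table(folder)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

The caller only names the folder; which files it contains, and how many, is an implementation detail. That is the behaviour data lake tools give you out of the box.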

Data lake tools are built to handle data this way and read the files transparently for the user, but Power BI required us to read one specific file, not the folder. That changed in November 2020. If we google (verb: to google) Power BI and Parquet files, we find many workarounds for reading Parquet files in Power BI, but no mention of the new Parquet connector released that November (https://powerbi.microsoft.com/en-us/blog/whats-new-in-power-query-dataflows-november-2020/), so I had to write about it.

The feature I’m illustrating in this article is in fact a combination of two features:

  • The feature to combine multiple files from Azure Data Lake Storage Gen2. This was in preview in October 2019 and has been available for a while, but I was surprised I couldn’t find any article really explaining the M code used to combine the files and how to customize it.
  • The Parquet connector, which is responsible for reading Parquet files and brings this capability to Azure Data Lake Storage Gen2. This connector was released in November 2020.

To illustrate how it works, I provided some files to be used in an Azure storage account. You can download the files here. You will also need to provision a new storage account, and it needs to be an Azure Data Lake Storage Gen2 account.

In the examples, I will use the address https://lakedemo.dfs.core.windows.net/opendatalake/trips for the storage, but you need to replace it with the DFS endpoint of your own storage.
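If you are unsure what the DFS endpoint of your own storage looks like, the small Python sketch below builds it from its parts. It assumes the public Azure cloud, where the DFS host is `<account>.dfs.core.windows.net`; the helper name `dfs_endpoint` is made up for this illustration and is not part of any Azure or Power BI library.

```python
from urllib.parse import quote


def dfs_endpoint(account: str, container: str, folder: str = "") -> str:
    """Build the DFS endpoint URL for a folder in a Data Lake Gen2 account.

    Assumes the public Azure cloud host suffix `.dfs.core.windows.net`.
    """
    base = f"https://{account}.dfs.core.windows.net/{quote(container)}"
    # Trim surrounding slashes so the path never starts or ends with "/".
    folder = folder.strip("/")
    return f"{base}/{quote(folder)}" if folder else base


if __name__ == "__main__":
    # Reproduces the address used in this article's examples.
    print(dfs_endpoint("lakedemo", "opendatalake", "trips"))
```

With the values used in this article (`lakedemo`, `opendatalake`, `trips`) the function returns the example address above. Swap in your own account, container, and folder names.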

Let’s go step by step:

  1. Open Power BI.
  2. Select the Get Data option on the main screen.
  3. Select Azure Data Lake Storage Gen2. We will test directly with one of the most efficient options.

There are 3 storage options:

  • Azure Blob Storage
  • Azure Data Lake Storage Gen1
  • Azure Data Lake Storage Gen2

It’s important to choose the option that matches your storage type, because this affects performance.
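One quick way to check which option an address belongs to is to look at its host name. The Python sketch below does that for the public Azure cloud host suffixes; the function name `connector_for` and the mapping are illustrative assumptions, not something Power BI exposes. Keep in mind that a Gen2 account also answers on a `.blob.core.windows.net` address, so the host only tells you which connector an address matches, not what kind of account sits behind it.

```python
from urllib.parse import urlparse

# Host suffixes in the public Azure cloud, mapped to the Get Data option
# that expects that kind of address (an assumption for this sketch).
_SUFFIX_TO_OPTION = {
    ".dfs.core.windows.net": "Azure Data Lake Storage Gen2",
    ".blob.core.windows.net": "Azure Blob Storage",
    ".azuredatalakestore.net": "Azure Data Lake Storage Gen1",
}


def connector_for(url: str) -> str:
    """Return the Get Data option whose address format matches `url`."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, option in _SUFFIX_TO_OPTION.items():
        if host.endswith(suffix):
            return option
    raise ValueError(f"Unrecognized storage host: {host!r}")


if __name__ == "__main__":
    print(connector_for("https://lakedemo.dfs.core.windows.net/opendatalake/trips"))
```

For the address used in this article, the function returns the Gen2 option, which is the one selected in step 3.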
