Azure Data Explorer (Kusto), is one of the most dedicated relational databases in the market. The whole system is running in SSD and memory, to offer fast and responsive data analysis. It could be a good option to serve as warm-path data storage.

Due to various reasons, such as mal-function client, imperfect data pipeline, etc. data could be ingested into Kusto multiple times. It leads to data duplication issue. This problem could be even more severe if the ingested data is **summary **data, such as overall revenue for a group of stores, etc.

Data duplication could mess-up all the following data analysis, People may make the wrong decision based on that. Therefore, data cleaning/deduplication is necessary. Before that, we need to first confirm, whether the current Kusto table having a duplication issue.

The **confirmation **step is the main focus of this article.

#data-science #python #kusto #data #database

Use Data Brick to verify Azure Explore (Kusto) Data Duplication issue
1.20 GEEK