Understanding an undervalued machine learning algorithm, plus a simulation (with GUI) in Python

This time we are diving into the world of data mining. Let’s begin with a short but informative definition.

What is data mining?

Data mining is a deep dive into datasets in search of correlations, rules, anomalies, and more. It is a way to do simple but effective machine learning without the heavier machinery of regular neural networks, or of more complex variants such as convolutional and recurrent neural networks (we will go through those thoroughly in future articles).

Data mining algorithms differ from one another, and each has its own advantages and disadvantages. I will not go through them in this article, but the first one you should focus on is the classic **Apriori algorithm**, as it is the gateway to the data mining world.

But before going any further, there is some data mining vocabulary we need to get familiar with:

  • **k-Itemset:** an itemset is just a set of items; the k refers to its order/length, meaning the number of items contained in the itemset.
  • **Transaction:** a captured record of data, for example the items purchased together in a store. Note that the Apriori algorithm typically operates on datasets containing thousands or even millions of transactions.
  • **Association rule:** an antecedent → consequent relationship between two itemsets, read as "transactions that contain the antecedent tend to also contain the consequent".
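
To make the three terms above concrete, here is a minimal Python sketch. The toy transactions, the item names, and the helper functions `k_itemsets` and `transactions_containing` are all made up for illustration; this is not the simulation code of the article, and it is not the Apriori algorithm itself, just one possible way to represent the vocabulary.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction is the set of items bought together.
transactions = [
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "jam"}),
    frozenset({"milk", "jam"}),
]


def k_itemsets(items, k):
    """Return every itemset of length k that can be built from `items`."""
    return [frozenset(combo) for combo in combinations(sorted(items), k)]


def transactions_containing(itemset, transactions):
    """Return the transactions that contain every item of `itemset`."""
    return [t for t in transactions if itemset <= t]


# All distinct items seen in the dataset, then every possible 2-itemset.
all_items = set().union(*transactions)
two_itemsets = k_itemsets(all_items, 2)

# An association rule is a pair (antecedent, consequent) of itemsets,
# here the illustrative rule {bread} -> {butter}.
antecedent, consequent = frozenset({"bread"}), frozenset({"butter"})
with_antecedent = transactions_containing(antecedent, transactions)
with_both = transactions_containing(antecedent | consequent, transactions)

print(len(two_itemsets))                     # 6 two-itemsets from 4 items
print(len(with_antecedent), len(with_both))  # 3 2
```

In this toy data, 3 transactions contain bread and 2 of those also contain butter, which is the kind of evidence an association rule summarizes. Using `frozenset` is a design choice of this sketch: itemsets are unordered, and the subset test `itemset <= t` directly expresses "the transaction contains the itemset".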
