This project aims to walk you through how to leverage SQL in relationship mapping and data processing with multiple datasets, and to create an automated reporting dashboard.
SQL is one of the essential languages that every data analyst should be familiar with in order to store, manipulate and retrieve data in a relational database. A number of Relational Database Management Systems (RDBMS) use SQL as their standard language, including MySQL, Oracle and SQL Server.
That being said, this project aims to push my SQL skills as far as I can by exploring and manipulating multiple datasets of an e-commerce company. It’s not limited to data processing; I’ve decided to expand the scope of this project from structured data to visualization by plugging the processed data into Data Studio to build a reporting dashboard.
If you’re keen, feel free to check out the dataset provided publicly here at Kaggle. Below is a quick summary of the key highlights in this project:
How should we handle multiple datasets that share relationships with one another?
If you refer to the database provided by Kaggle, there are 7 to 8 separate datasets that together represent an e-commerce sales report: order, customer, order_items, payment, delivery, etc. Prior to data processing, we need to identify how the datasets relate to one another to avoid mis-mapping that may result in data loss.
Take the “order” and “customer” datasets as an example: one customer can have zero or multiple orders (one customer_id for different order_id values), but one order is tied to only one customer (one order_id for one customer_id).
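To make the one-to-many relationship concrete, here is a minimal sketch that runs SQL through Python's built-in sqlite3 module. The table layout and the sample rows are assumptions made up for illustration, not the actual Kaggle schema, and the order table is named orders here because ORDER is a reserved word in SQL. It shows how an INNER JOIN silently drops a customer with no orders, which is the kind of data loss that mis-mapping can cause, while a LEFT JOIN keeps that customer.

```python
import sqlite3

# Hypothetical, simplified tables; the real Kaggle columns may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id TEXT PRIMARY KEY
    );
    CREATE TABLE orders (
        order_id    TEXT PRIMARY KEY,   -- one row per order
        customer_id TEXT NOT NULL REFERENCES customer(customer_id)
    );
    -- c1 has two orders, c2 has one, c3 has none.
    INSERT INTO customer VALUES ('c1'), ('c2'), ('c3');
    INSERT INTO orders VALUES ('o1', 'c1'), ('o2', 'c1'), ('o3', 'c2');
""")

# Sanity check: no order_id should map to more than one customer_id.
multi_customer_orders = conn.execute("""
    SELECT order_id
    FROM orders
    GROUP BY order_id
    HAVING COUNT(DISTINCT customer_id) > 1
""").fetchall()

# LEFT JOIN keeps every customer, including those with zero orders.
left_rows = conn.execute("""
    SELECT c.customer_id, COUNT(o.order_id) AS order_count
    FROM customer AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# INNER JOIN keeps only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.customer_id, COUNT(o.order_id) AS order_count
    FROM customer AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(multi_customer_orders)  # []
print(left_rows)              # [('c1', 2), ('c2', 1), ('c3', 0)]
print(inner_rows)             # [('c1', 2), ('c2', 1)]
```

Which join is right depends on the question being asked: a report on all customers would likely start from customer with a LEFT JOIN, while a report on placed orders can start from the order table.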
SQL stands for Structured Query Language. It is a language designed to store, manipulate and query data held in relational databases. The first version of SQL appeared in 1974, when a group at IBM developed the first prototype of a relational database. The first commercial relational database was released by Relational Software, which later became Oracle.