Grouping and aggregating data using SQL

Introduction

I have really been enjoying sharing my love of data science with my friends. We are powering through learning SQL, with three lessons under our belts already. If you would like to start at the beginning, here is a link to the first lesson. All of the lessons can also be found here. Otherwise, get stuck into this lesson on grouping and aggregating data using SQL.

Previous lesson

Last week, we covered filtering data using SQL. With one of my favourite shows, Charmed, as the example data, we went through using WHERE clauses in queries. In addition, we explored the use of the keywords IN, AND, OR, LIKE, BETWEEN and NOT.

This lesson

Now that we know how to filter data, we will move on to aggregation. We will learn how to use the keywords MIN and MAX to find the minimum and maximum of our data respectively. We will also practise using the COUNT, AVG and SUM keywords in a similar manner.

This will also be the first lesson in which we encounter NULL values, so we will learn how to deal with them in our datasets.

It just so happens that all of the friends I am teaching data science to whilst self-isolating are women. So what better example to use in this lesson than the Spice Girls, to celebrate a bit of girl power? We will use data on the first Spice Girls album, titled Spice. This is one that was on high rotation in my house growing up.

Key learnings:

  • use the keyword MIN to find the minimum value in a column
  • use the keyword MAX to find the maximum value in a column
  • use the keyword COUNT to count the number of rows in a column or table
  • use the keyword AVG to find the mean of a numerical column
  • use the keyword SUM to find the total of a numerical column when all the values are added together
  • use the keyword GROUP BY to group by a column in a table
  • know how NULL values will be handled in each of the above methods
  • understand how aliases work and how to use the AS keyword to create them
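As a rough preview of the last three points, here is a minimal sketch that runs a grouped query through Python's built-in sqlite3 module, so the SQL can be tried without installing a database. The table name `tracks`, its column names and every row are made up for illustration; they are not the real chart figures used in this lesson.

```python
import sqlite3

# In-memory database. The table name, columns and rows are invented
# placeholders, not the real Spice Girls chart data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (album TEXT, track TEXT, peak_position INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [
        ("Album A", "Track 1", 1),
        ("Album A", "Track 2", 12),
        ("Album A", "Track 3", None),  # not a single, so the peak is NULL
        ("Album B", "Track 4", 3),
        ("Album B", "Track 5", None),  # not a single, so the peak is NULL
    ],
)

query = """
    SELECT album,
           COUNT(*)             AS total_tracks,    -- counts every row
           COUNT(peak_position) AS charted_tracks,  -- skips NULL values
           MIN(peak_position)   AS best_peak,       -- NULL values are ignored
           AVG(peak_position)   AS average_peak     -- mean of non-NULL values
    FROM tracks
    GROUP BY album
    ORDER BY album
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this sketch COUNT(*) counts every row in each group, while COUNT(peak_position), MIN and AVG only look at the rows where peak_position is not NULL. The AS keyword simply gives each result column a readable name.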

The problem

I think that the first Spice Girls album is the best one. Just in case people don’t believe me, I want to find some data to back it up. Using Australian charts data for the singles released from each Spice Girls album, I think I can prove the first one rules!

The data

This dataset contains information on all the tracks in each album that the Spice Girls released. The table contains the length of each track, when each single was released, what position the song peaked at in the Australian charts and how many weeks the song was in the Australian charts. I didn’t include the greatest hits album because I thought that would confuse matters.


I got the data for this table from the Spice Girls discography Wikipedia page, the Spice (album) Wikipedia page, the Spice World (album) Wikipedia page, the Forever (Spice Girls album) Wikipedia page and an Australian charts website.
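To make the shape of the table described above concrete, here is one way it might be declared, again using Python's built-in sqlite3 module. The table name, the column names, the choice of seconds for track length and the two sample rows are all assumptions made for this sketch; the real table may be laid out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per track, with chart columns left NULL
# for tracks that were never released as singles.
conn.execute(
    """
    CREATE TABLE tracks (
        album          TEXT,
        track          TEXT,
        length_seconds INTEGER,  -- length of the track
        single_release TEXT,     -- release date of the single, NULL if none
        peak_position  INTEGER,  -- best position in the Australian charts
        weeks_in_chart INTEGER   -- number of weeks in the Australian charts
    )
    """
)

# Two invented placeholder rows: one single and one album-only track.
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Album A", "Track 1", 173, "2000-01-01", 1, 20),
        ("Album A", "Track 2", 245, None, None, None),
    ],
)

# List the column names, then count the tracks with no chart peak.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(tracks)")]
null_peaks = conn.execute(
    "SELECT COUNT(*) FROM tracks WHERE peak_position IS NULL"
).fetchone()[0]
print(column_names)
print(null_peaks)
```

A layout like this is where the NULL values come from: a track that was never a single has no release date, no peak position and no weeks in the charts.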

Syntax to aggregate data

In this lesson I will teach you how to use the aggregating keywords MIN, MAX, COUNT, SUM and AVG in the SELECT statement. These aggregators may also be used elsewhere in queries. For example, they can be used in a HAVING clause, but that is beyond the scope of this lesson. I have written another story comparing WHERE and HAVING in SQL in case you want to learn more.

The MAX keyword can be used to find the maximum value within a column. It can be used on many different datatypes including integers, floats, strings and dates.
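Here is a small sketch of MAX applied to each of those datatypes, run through Python's built-in sqlite3 module. The table, its columns and all of the values are invented for illustration. One assumption to be aware of: SQLite has no dedicated date type, so the dates are stored as ISO-formatted text, which happens to sort in chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented placeholder table with one column per datatype.
conn.execute(
    "CREATE TABLE tracks ("
    "track TEXT, length_minutes REAL, weeks_in_chart INTEGER, single_release TEXT)"
)
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?, ?)",
    [
        ("Track A", 2.9, 20, "2000-01-01"),
        ("Track B", 3.5, 5, "2000-06-15"),
        ("Track C", 4.1, 11, "2000-03-01"),
    ],
)

result = conn.execute(
    """
    SELECT MAX(weeks_in_chart) AS longest_run,          -- integer
           MAX(length_minutes) AS longest_track,        -- float
           MAX(track)          AS last_alphabetically,  -- string
           MAX(single_release) AS latest_release        -- date stored as text
    FROM tracks
    """
).fetchone()
print(result)
```

For strings, MAX returns the value that comes last in sort order, and for dates it returns the most recent one. Swapping MAX for MIN in the same query would give the smallest, first and earliest values instead.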

#data-science #programming #ldswsd #sql #technology #data analysis
