Learn Database Administration - PostgreSQL Database Administration (DBA) for Beginners

In this video, we will go over the basics of the PostgreSQL. We will cover topics ranging from installations, to writing basic queries and retrieving data from tables. We will also explore the logic of joining tables to retrieve data and much more.

The course also covers the basics of creating tables, storing data with data types, and working with expressions, operators, and strings.

Topics also includes:

Installing PostgreSQL

Loading sample database

Creating database and tables

Performing CRUD operations

Joining Tables

Using aggregate and analytic functions

Creating views and triggers

What you’ll learn

Install PostgreSQL Server

Load sample database

Create a database

Create a table

Insert data into tables

Update existing records inside a table

Delete Records in a table

Remove duplicate records

Query data from a table

Create a subquery

Get data from multiple tables

Create and manage roles

Create a view

Create tablespace

Backup and restore database

Filter and sort data

Use various operators

Use aggregate and analytic functions

Create triggers

Thanks for reading

If you liked this post, share it with all of your programming buddies!

Follow us on Facebook | Twitter

The Complete SQL Bootcamp

The Ultimate MySQL Bootcamp: Go from SQL Beginner to Expert

An Introduction to Queries in PostgreSQL

Build a Basic App with Spring Boot and JPA using PostgreSQL

Why We Moved From NoSQL MongoDB to PostgreSQL?

What is CRUD? | CRUD Operations with SQL and PostgreSQL

What is CRUD? | CRUD Operations with SQL and PostgreSQL

What is CRUD? | CRUD Operations with SQL and PostgreSQL - In this course we will be using SQL and PostgreSQL to perform CRUD operations

In this course we will be using SQL and PostgreSQL to perform CRUD operations

PostgreSQL is a very popular , advanced, open-source object-relational database management system used by a lot of organizations. It is a very robust database management system.

Any software or a web application will typically do these set of operations called C.R.U.D.

CRUD Stands for:

  • Create (Insert)
  • Read (Select)
  • Update
  • Delete

What we will learn include:

  • How to install PostgreSQL Database Server
  • How to Load a sample database into PostgreSQL Server
  • How to Create a database and table
  • How to insert data into a table
  • How to query and retrieve data from a table
  • How to update existing data inside a table
  • How to delete data from a table
  • How to sort retrieved data from a table
  • How to Filter data using WHERE clause
  • How to remove duplicate data
  • How to use subqueries to query and retrieve data
  • How to group data using GROUP BY clause
  • How to use the HAVING clause to group data

Thanks for reading

If you liked this post, share it with all of your programming buddies!

Follow us on Facebook | Twitter

Further reading about CRUD, SQL and PostgreSQL

How to build a CRUD Web App with Angular 8.0

Laravel 5.8 CRUD Tutorial With Example Step By Step

MEAN Stack Angular 8 CRUD Web Application

Build a Basic CRUD App with Node and React

Learn Database Administration - PostgreSQL Database Administration (DBA) for Beginners

How to write SQL queries in PostgreSQL

How to write SQL queries in PostgreSQL

How to write SQL queries in PostgreSQL

In this tutorial, you will learn how to write simple SQL queries in PostgreSQL.

Being able to query the relational database systems is a must-have skill for a data scientist. SQL or Structured Query Language lets you do this in a very efficient way. SQL not only enables you to you ask meaningful questions to the data but also allows you to you play with the data in many different ways. Without databases, practically no real-world application is possible. So, the knowledge of databases and being able to handle them are crucial parts of a data scientist's toolbox.

Quick fact: SQL is also called SE-QU-EL. It has got some historical significance - the initial name of SQL was Simple English Query Language.

Generally, relational databases look like the following -

Relations are also called tables. There are a number of ways in which databases can be represented. This is just one of them and the most popular one.

This tutorial introduces the four most common operations performed with SQL, and they are Create, Read, Update and Delete. Collectively these four operations are often referred to as CRUD. In any application that involves user interaction when these four operations are always there.

You will be using PostgreSQL as the relational database management system. PostgreSQL is very light-weight, and it is free as well. In this tutorial, you will

  • Get up and running with PostgreSQL
  • Connect to a PostgreSQL database
  • Create, read, update and delete tables in that database
  • Run SQL on Jupyter Notebook
  • Run SQL in Python

Let's get started.

Getting up and running with PostgreSQL

PostgreSQL is a light-weight and an open source RDBMS. It is extremely well accepted by the industry. You can learn more about PostgreSQL from its official website.

To be able to start writing and executing queries in PostgreSQL, you will need it installed on your machine. Installing it is extremely easy. The following two short videos show you how PostgreSQL can be downloaded and installed on a 32-bit Windows-7 machine

Note: While you are installing PostgreSQL take note of the password and port number that you are entering.

Once you have installed PostgreSQL successfully on your machine, open up pgAdmin. pgAdmin is a handy utility which comes with the PostgreSQL installation, and it lets you do regular database related tasks through a nice graphical interface. pgAdmin's interface looks like

When you open up pgAdmin, you will see a server named "PostgreSQL 9.4 (localhost:5432)" enlisted in the interface

Note: Your version may be different than the above and so the port number (5432).

Connect to the server by entering the password that you gave during the installation. For reference - https://bit.ly/2FPO4hR.

Once you have successfully connected to the local database server, you will get an interface similar to the following

CRUD operations in PostgreSQL

Creating a table according to a given specification -

To be able to operate on a database you will need a table. So let's go ahead and create a simple table (also called relation) called datacamp_courses with the following specification (schema)

The specification gives us quite a few information on the columns of the table

  • The primary key of the table should be course_id (note that only this one is bold) and its data-type should be an integer. A primary key is a constraint which enforces the column values to be non-null and unique. It lets you uniquely identify a specific or a set of instanced present in the table.
  • Rest of the information in the specification should be easy to interpret now.

To create a table, right-click on the newly created database DataCamp_Courses and select CREATE Script from the options. You should get something similar to the following

Let's execute the following query now

CREATE TABLE datacamp_courses(
 course_name VARCHAR (50) UNIQUE NOT NULL,
 course_instructor VARCHAR (100) NOT NULL,
 topic VARCHAR (20) NOT NULL

For executing the query just select it and click the execute button from the menu bar

The output should be

The general structure of a table creation query in PostgreSQL looks like

CREATE TABLE table_name (
 column_name TYPE column_constraint,
 table_constraint table_constraint

We did not specify any table_constraints while creating the table. That can be avoided for now. Everything else is quite readable except for the keyword SERIAL. Serial in PostgreSQL lets you create an auto-increment column. By default, it creates values of type integer. Serial frees us from the burden of remembering the last inserted/updated primary key of a table, and it is a good practice to use auto-increments for primary keys. You can learn more about serial from here.

Inserting some records to the newly created table

In this step, you will insert some records to the table. Your records should contain

  • A course name
  • Instructor's name of the course
  • Course topic

The values for the column course_id will be handled by PostgreSQL itself. The general structure of an insert query in PostgreSQL looks like

INSERT INTO table(column1, column2, …)
 (value1, value2, …);

Let's insert some records

INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Deep Learning in Python','Dan Becker','Python');

INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Joining Data in PostgreSQL','Chester Ismay','SQL');

Note that you did not specify the primary keys explicitly. You will see its effects in a moment.

When you execute the above two queries, you should get the following result upon successful insertions

Query returned successfully: one row affected, 11 ms execution time.

Reading/viewing the data from the table -

This is probably something you will do a lot in your data science journey. For now, let's see how is the table datacamp_courses holding up.

This is generally called a select query, and the generic structure of a select query looks like


Let's select all the columns from the table datacamp_courses

SELECT * FROM datacamp_courses;

And you get

Note the primary keys now. If you want to just see the names of the courses you can do so by

SELECT course_name from datacamp_courses;

And you get

You can specify as many column names as possible which you may want to see in your results provided they exist in the table. If you run select course_name, number_particpants from datacamp_courses; you will run into error as the column number_particpants does exist in the table. You will now see how you can update a specific record in the table.

Updating a record in the table

The general structure of an update query in SQL looks like the following:

UPDATE table
SET column1 = value1,
    column2 = value2 ,...

You are going to update the record where course_instructor = "Chester Ismay" and set the course_name to "Joining Data in SQL". You will then verify if the record is updated. The query for doing this would be

UPDATE datacamp_courses SET course_name = 'Joining Data in SQL'
WHERE course_instructor = 'Chester Ismay';

Let's see if your update query had the intended effect by running a select query

You can see your update query performed exactly in the way you wanted. You will now see how you can delete a record from the table.

Deleting a record in the table

The general structure of a delete query in SQL looks like following:

WHERE condition;

You are going to delete the record where course_name = "Deep Learning in Python" and then verify if the record is deleted. Following the structure, you can see that the following query should be able to do this

DELETE from datacamp_courses
WHERE course_name = 'Deep Learning in Python';

Keep in mind that the keywords are not case-sensitive in SQL, but the data is case-sensitive. This is why you see a mixture of upper case and lower case in the queries.

Let's see if the intended record was deleted from the table or not

And yes, it indeed deleted the intended record.

The generic structures of the queries as mentioned in the tutorial are referred from postgresqltutorial.com.

You now know how to basic CRUD queries in SQL. Some of you may use Jupyter Notebooks heavily and may be thinking it would be great if there were an option to execute these queries directly from Jupyter Notebook. In the next section, you will see how you can achieve this.

SQL + Jupyter Notebooks

To be able to run SQL queries from Jupyter Notebooks the first step will be to install the ipython-sql package.

If it is not installed, install it using:

pip install ipython-sql

Once this is done, load the sql extension in your Jupyter Notebook by executing

%load_ext sql

The next step will be to connect to a PostgreSQL database. You will connect to the database that you created -DataCamp_Courses.

For being able to connect to a database that is already created in your system, you will have to instruct Python to detect its dialect. In simpler terms, you will have to tell Python that it is a PostgreSQL database. For that, you will need psycopg2 which can be installed using:

pip install psycopg2

Once you installed psycopg connect to the database using

%sql postgresql://postgres:[email protected]:5432/DataCamp_Courses
'Connected: [email protected]_Courses'

Note the usage of %sql. This is a magic command. It lets you execute SQL statements from Jupyter Notebook. What follows %sql is called a database connection URL where you specify

  • Dialect (postgres)
  • Username (postgres)
  • Password (postgres)
  • Server address (localhost)
  • Port number (5432)
  • Database name (DaaCamp_Courses)

You can now perform everything from you Jupyter Notebook that you performed in the pgAdmin interface. Let's start by creating the table datacamp_courses with the exact same schema.

But before doing that you will have to drop the table as SQL won't let you store two tables with the same name. You can drop a table by

%sql DROP table datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses


The table datacamp_courses is now deleted from PostgreSQL and hence you can create a new table with this name.

CREATE TABLE datacamp_courses(
 course_name VARCHAR (50) UNIQUE NOT NULL,
 course_instructor VARCHAR (100) NOT NULL,
 topic VARCHAR (20) NOT NULL
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses


Note the usage of %sql %%sql here. For executing a single line of query, you can use %sql, but if you want to execute multiple queries in one go, you will have to use %%sql.

Let's insert some records

INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Deep Learning in Python','Dan Becker','Python');
INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Joining Data in PostgreSQL','Chester Ismay','SQL');
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.
1 rows affected.


View the table to make sure the insertions were done as expected

select * from datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
2 rows affected.

Let's maintain the flow. As the next step, you will update a record in the table

%sql update datacamp_courses set course_name = 'Joining Data in SQL' where course_instructor = 'Chester Ismay';
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.


Pay close attention when you are dealing with strings in SQL. Unlike traditional programming languages, the strings values need to be wrapped using single quotes.

Let's now verify if your update query had the intended effect

select * from datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
2 rows affected.

Let's now delete a record and verify

delete from datacamp_courses where course_name = 'Deep Learning in Python';
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

select * from datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

By now you have a clear idea of executing CRUD operations in PostgreSQL and how you can perform them via Jupyter Notebook. If you are familiar with Python and if you interested in accessing your database through your Python code you can also do it. The next section is all about that.

Getting started with SQLAlchemy and combining it with SQL magic commands

For this section, you will need the SQLAlchemy package. It comes with the Anaconda distribution generally. You can also pip-install it. Once you have it installed, you can import it by -

import sqlalchemy

To able to interact with your databases using SQLAlchemy you will need to create an engine for the respective RDBMS where your databases are stored. In your case, it is PostgreSQL. SQLAlchemy lets you create an engine of the RDBMS in just a single call of create_engine(), and the method takes a database connection URL which you have seen before.

from sqlalchemy import create_engine
engine = create_engine('postgresql://postgres:[email protected]:5432/DataCamp_Courses')
print(engine.table_names()) # Lets you see the names of the tables present in the database

You can see the table named datacamp_courses which further confirms that you were successful in creating the engine. Let's execute a simple select query to see the records of the table datacamp_courses and store it in a pandas DataFrame object.

You will use the read_sql() method (provided by pandas) which takes a SQL query string and an engine.

import pandas as pd

df = pd.read_sql('select * from datacamp_courses', engine)

You can also pair up the %sql magic command within your Python code.

df_new = %sql select * from datacamp_courses
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

Take Your SQL Skills to the Next Level with 6 PostgreSQL Tricks

Take Your SQL Skills to the Next Level with 6 PostgreSQL Tricks

PostgreSQL is one of the most popular open source SQL dialects. Here, let's look at a few PostgreSQL tricks that you can start using to take your SQL skills to the next level.

PostgreSQL is one of the most popular open source SQL dialects. One of its main advantages is the ability to extend its functionality with some inbuilt tools.

Here, let's look at a few PostgreSQL tricks that you can start using to take your SQL skills to the next level.

You'll find out how to:

  1. Quickly copy files into a database
  2. Summarise data in crosstab format
  3. Take advantage of arrays and JSON data in SQL
  4. Work with geometric data
  5. Run statistical analyses directly on your database
  6. Use recursion to solve problems
1. Copy data from a file

An easy way to quickly import data from an external file is to use the COPY function. Simply create the table you want to use, then pass in the filepath of your dataset to the COPY command.

The example below creates a table called revenue and fills it from a randomly generated CSV file.

You can include extra parameters, to indicate the filetype (here, the file is a CSV) and whether to read the first row as column headers.

You can learn more here.

CREATE TABLE revenue (
  store VARCHAR,
  year INT,
  revenue INT,
  PRIMARY KEY (product, year)

COPY revenue FROM '~/Projects/datasets/revenue.csv' WITH HEADER CSV;
2. Summarise data using the crosstab function

If you fancy yourself as a spreadsheet pro, you will probably be familiar with creating pivot tables from dumps of data. You can do the same in PostgreSQL with the crosstab function.

The crosstab function can take data in the form on the left, and summarise it in the form on the right (which is much easier to read). The example here will follow on with the revenue data from before.

First, enable the tablefunc extension with the command below:


Next, write a query using the crosstab function:

	FROM revenue
	ORDER BY 1,2'
AS summary(
	store VARCHAR, 
    "2016" INT, 
    "2017" INT, 
    "2018" INT

There are two things to consider when using this function.

  • First, pass in a query selecting data from your underlying table. You may simply select the table as it is (as shown here). However, you might want to filter, join or aggregate if required. Be sure to order the data correctly.
  • Then, define the output (in the example, the output is called 'summary', but you can call it any name). List the column headers you want to use and the data type they will contain.

The output will be as shown below:

  store  |  2016   |  2017   |  2018   
 Alpha   | 1637000 | 2190000 | 3287000
 Bravo   | 2205000 |  982000 | 3399000
 Charlie | 1549000 | 1117000 | 1399000
 Delta   |  664000 | 2065000 | 2931000
 Echo    | 1795000 | 2706000 | 1047000
(5 rows)
3. Work with arrays and JSON

PostgreSQL supports multi-dimensional array data types. These are comparable to similar data types in many other languages, including Python and JavaScript.

You might want to use them in situations where it helps to work with more dynamic, less-structured data.

For example, imagine a table describing published articles and subject tags. An article could have no tags, or it could have many. Trying to store this data in a structured table format would be unnecessarily complicated.

You can define arrays using a data type, followed by square brackets. You can optionally specify their dimensions (however, this is not enforced).

For example, to create a 1-D array of any number of text elements, you would use text[]. To create a three-by-three two dimensional array of integer elements, you would use int[3][3].

Take a look at the example below:

CREATE TABLE articles (
  tags TEXT[]

To insert arrays as records, use the syntax '{"first","second","third"}'.

INSERT INTO articles (title, tags)
  ('Lorem ipsum', '{"random"}'),
  ('Placeholder here', '{"motivation","random"}'),
  ('Postgresql tricks', '{"data","self-reference"}');

There are a lot of things you can do with arrays in PostgreSQL.

For a start, you can check if an array contains a given element. This is useful for filtering. You can use the "contains" operator @> to do this. The query below finds all the articles which have the tag "random".

FROM articles
WHERE tags @> '{"random"}';

You can also concatenate (join together) arrays using the || operator, or check for overlapping elements with the && operator.

You can search arrays by index (unlike many languages, PostgreSQL arrays start counting from one, instead of zero).

FROM articles;

As well as arrays, PostgreSQL also lets you use JSON as a data type. Again, this provides the advantages of working with unstructured data. You can also access elements by their key name.

CREATE TABLE sessions (
    session_info JSON

INSERT INTO sessions (session_info)
('{"app_version": 1.0, "device_type": "Android"}'),
('{"app_version": 1.2, "device_type": "iOS"}'),
('{"app_version": 1.4, "device_type": "iOS", "mode":"default"}');

Again, there are many things you can do with JSON data in PostgreSQL. You can use the -> and ->> operators to "unpackage" the JSON objects to use in queries.

For example, this query finds the values of the device_type key:

  session_info -> 'device_type' AS devices
FROM sessions;

And this query counts how many sessions were on app version 1.0 or earlier:

FROM sessions
WHERE CAST(session_info ->> 'app_version' AS decimal) <= 1.0;
4. Run statistical analyses

Often, people see SQL as good for storing data and running simple queries, but not for running more in-depth analyses. For that, you should use another tool such as Python or R or your favourite spreadsheet software.

However, PostgreSQL brings with it enough statistical capabilities to get you started.

For instance, it can calculate summary statistics, correlation, regression and random sampling. The table below contains some simple data to play around with.

  x INT,
  y INT

INSERT INTO stats (x,y)
  (1,2), (3,4), (6,5), (7,8), (9,10);

You can find the mean, variance and standard deviation using the functions below:

FROM stats;

You can also find the median (or any other percentile value) using the percentile_cont function:

-- median
FROM stats;

-- 90th percentile
FROM stats;

Another trick lets you calculate the correlation coefficients between different columns. Simply use the corr function.

FROM stats;

PostgreSQL lets you run linear regression (sometimes called the most basic form of machine learning) via a set of inbuilt functions.

FROM stats;

You can even run Monte Carlo simulations with single queries. The query below uses the generate_series and random number functions to estimate the value of π by randomly sampling one million points inside a unit circle.

		) / 1000000 AS pi 
5. Work with shape data

Another unusual data type available in PostgreSQL is geometric data.

That's right, you can work with points, lines, polygons and circles within SQL.

Points are the basic building block for all geometric data types in PostgreSQL. They are represented as (x, y) coordinates.

	POINT(0,0) AS "origin",
    POINT(1,1) AS "point";

You can also define lines. These can either be infinite lines (specified by giving any two points on the line). Or, they can be line segments (specified by giving the 'start' and 'end' points of the line).

	LINE '((0,0),(1,1))' AS "line",
    LSEG '((2,2),(3,3))' AS "line_segment";

Polygons are defined by a longer series of points.

	POLYGON '((0,0),(1,1),(0,2))' AS "triangle",
	POLYGON '((0,0),(0,1),(1,1),(1,0))' AS "square",
    POLYGON '((0,0),(0,1),(2,1),(2,0))' AS "rectangle";

Circles are defined by a central point and a radius.

	CIRCLE '((0,0),1)' as "small_circle",
    CIRCLE '(0,0),5)' as "big_circle";

There are many functions and operators that can be applied to geometric data in PostgreSQL.

You can:

  • Check if two lines are parallel with the ?|| operator:
	LINE '((0,0),(1,1))' ?|| LINE '((2,3),(3,4))';
  • Find the distance between two objects with the <-> operator:
	POINT(0,0) <-> POINT(1,1);
  • Check if two shapes overlap at any point with the && operator:
	CIRCLE '((0,0),1)' &&  CIRCLE '((1,1),1)';
  • Translate (shift position) a shape using the + operator:
	POLYGON '((0,0),(1,2),(1,1))' + POINT(0,3);

And lots more besides - check out the documentation for more detail!

6. Use recursive queries

Recursion is a programming technique that can be used to solve problems using a function which calls itself. Did you know that you can write recursive queries in PostgreSQL?

There are three parts required to do this:

  • First, you define a starting expression.
  • Then, define a recursive expression that will be evaluated repeatedly
  • Finally, define a "termination criteria" - a condition which tells the function to stop calling itself, and return an output.

The query below returns the first hundred numbers in the Fibonacci sequence:

WITH RECURSIVE fibonacci(n,x,y) AS (
    	1 AS n ,
  		0 :: NUMERIC AS x,
    	1 :: NUMERIC AS y
    	n + 1 AS n,
  		y AS x,
    	x + y AS y 
  	FROM fibonacci 
  	WHERE n < 100
FROM fibonacci;

Let's break this down.

First, it uses the WITH clause to define a (recursive) Common Table Expression called fibonacci. Then, it defines an initial expression:

WITH RECURSIVE fibonacci(n,x,y) AS (
    	1 AS n ,
  		0 :: NUMERIC AS x,
    	1 :: NUMERIC AS y...

Next, it defines the recursive expression that queries fibonacci:

    	n + 1 AS n,
  		y AS x,
    	x + y AS y 
  	FROM fibonacci...

Finally, it uses a WHERE clause to define the termination criteria, and then selects column x to give the output sequence:

...WHERE n < 100
	FROM fibonacci;

Perhaps you can think of another example of recursion that could be implemented in PostgreSQL?

Final remarks

So, there you have it - a quick run through of some great features you may or may not have known PostgreSQL could provide. There are no doubt more features worth covering that didn't make it into this list.

PostgreSQL is a rich and powerful programming language in its own right. So, next time you are stuck figuring out how to solve a data related problem, take a look and see if PostgreSQL has you covered. You might surprised how often it does!

Thanks for reading!