Replace values in array column with related values from another table

In my database I have a table relations with a column relation_ids containing the IDs of users (user_id). This takes the form of an array with many IDs possible, e.g.:

{111,112,156,4465}

I have another table names containing information on users such as user_id, first_name, last_name etc.

I would like to write an SQL query that returns all rows from relations with all columns, plus an extra array column in which each ID from relation_ids is replaced by the matching first_name from the names table.

Is it possible as some kind of subquery?
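One possible approach, sketched here in Python because the question does not say which client is used: PostgreSQL can expand the array with unnest(...) WITH ORDINALITY, join each element to names, and rebuild the array in its original order with array_agg inside a correlated subquery. The query string below is an untested sketch that assumes the table and column names from the question; the alias relation_names, the helper substitute_names, and the sample data are hypothetical and only mirror the same logic in memory.

```python
# Sketch of a PostgreSQL query (WITH ORDINALITY needs 9.4 or later).
# Table and column names come from the question; relation_names is an assumed alias.
QUERY = """
SELECT r.*,
       (SELECT array_agg(n.first_name ORDER BY u.ord)
          FROM unnest(r.relation_ids) WITH ORDINALITY AS u(user_id, ord)
          JOIN names AS n ON n.user_id = u.user_id) AS relation_names
  FROM relations AS r;
"""


def substitute_names(relation_ids, names_by_id):
    """Mirror the query in plain Python: map each ID to a first name, in order.

    IDs without a matching row are skipped, as the inner JOIN would do.
    """
    return [names_by_id[uid] for uid in relation_ids if uid in names_by_id]


# Hypothetical sample data for illustration only.
names_by_id = {111: "Ann", 112: "Bob", 156: "Cara", 4465: "Dev"}
result = substitute_names([111, 112, 156, 4465], names_by_id)
```

With a driver such as psycopg2, QUERY would be passed to cursor.execute() unchanged; a LEFT JOIN could be used instead if unmatched IDs should appear as NULL.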

What is CRUD? | CRUD Operations with SQL and PostgreSQL

In this course we will be using SQL and PostgreSQL to perform CRUD operations

PostgreSQL is a very popular, advanced, open-source object-relational database management system used by a lot of organizations. It is a very robust database management system.

Any software or web application will typically perform a set of operations called C.R.U.D.

CRUD stands for:

  • Create (Insert)
  • Read (Select)
  • Update
  • Delete

What we will learn includes:

  • How to install PostgreSQL Database Server
  • How to Load a sample database into PostgreSQL Server
  • How to Create a database and table
  • How to insert data into a table
  • How to query and retrieve data from a table
  • How to update existing data inside a table
  • How to delete data from a table
  • How to sort retrieved data from a table
  • How to Filter data using WHERE clause
  • How to remove duplicate data
  • How to use subqueries to query and retrieve data
  • How to group data using GROUP BY clause
  • How to use the HAVING clause to filter grouped data

How to write SQL queries in PostgreSQL

In this tutorial, you will learn how to write simple SQL queries in PostgreSQL.

Being able to query relational database systems is a must-have skill for a data scientist. SQL, or Structured Query Language, lets you do this in a very efficient way. SQL not only enables you to ask meaningful questions of the data but also allows you to play with the data in many different ways. Without databases, practically no real-world application is possible. So, the knowledge of databases and being able to handle them are crucial parts of a data scientist's toolbox.

Quick fact: SQL is also pronounced SE-QU-EL. This has some historical significance: the initial name of SQL was SEQUEL, short for Structured English Query Language.

Generally, relational databases look like the following -

Relations are also called tables. There are a number of ways in which databases can be represented. This is just one of them and the most popular one.

This tutorial introduces the four most common operations performed with SQL: Create, Read, Update and Delete. Collectively these four operations are often referred to as CRUD. Any application that involves user interaction relies on these four operations.

You will be using PostgreSQL as the relational database management system. PostgreSQL is very light-weight, and it is free as well. In this tutorial, you will

  • Get up and running with PostgreSQL
  • Connect to a PostgreSQL database
  • Create, read, update and delete tables in that database
  • Run SQL on Jupyter Notebook
  • Run SQL in Python

Let's get started.

Getting up and running with PostgreSQL

PostgreSQL is a light-weight, open source RDBMS. It is extremely well accepted by the industry. You can learn more about PostgreSQL from its official website.

To be able to start writing and executing queries in PostgreSQL, you will need it installed on your machine. Installing it is extremely easy. The following two short videos show you how PostgreSQL can be downloaded and installed on a 32-bit Windows 7 machine.

Note: While you are installing PostgreSQL take note of the password and port number that you are entering.

Once you have installed PostgreSQL successfully on your machine, open up pgAdmin. pgAdmin is a handy utility which comes with the PostgreSQL installation, and it lets you do regular database related tasks through a nice graphical interface. pgAdmin's interface looks like

When you open up pgAdmin, you will see a server named "PostgreSQL 9.4 (localhost:5432)" listed in the interface

Note: Your version may be different from the above, and so may the port number (5432).

Connect to the server by entering the password that you gave during the installation. For reference - https://bit.ly/2FPO4hR.

Once you have successfully connected to the local database server, you will get an interface similar to the following

CRUD operations in PostgreSQL

Creating a table according to a given specification -

To be able to operate on a database you will need a table. So let's go ahead and create a simple table (also called relation) called datacamp_courses with the following specification (schema)

The specification gives us quite a bit of information on the columns of the table

  • The primary key of the table should be course_id (note that only this one is bold) and its data-type should be an integer. A primary key is a constraint which enforces the column values to be non-null and unique. It lets you uniquely identify a specific row or a set of rows present in the table.
  • Rest of the information in the specification should be easy to interpret now.

To create a table, right-click on the newly created database DataCamp_Courses and select CREATE Script from the options. You should get something similar to the following

Let's execute the following query now

CREATE TABLE datacamp_courses(
 course_id SERIAL PRIMARY KEY,
 course_name VARCHAR (50) UNIQUE NOT NULL,
 course_instructor VARCHAR (100) NOT NULL,
 topic VARCHAR (20) NOT NULL
);

For executing the query just select it and click the execute button from the menu bar

The output should be

The general structure of a table creation query in PostgreSQL looks like

CREATE TABLE table_name (
 column_name TYPE column_constraint,
 table_constraint
);

We did not specify any table_constraints while creating the table. They can be skipped for now. Everything else is quite readable except for the keyword SERIAL. SERIAL in PostgreSQL lets you create an auto-increment column. By default, it creates values of type integer. SERIAL frees us from the burden of remembering the last inserted primary key of a table, and it is good practice to use auto-increments for primary keys. You can learn more about SERIAL in the PostgreSQL documentation.

Inserting some records to the newly created table

In this step, you will insert some records to the table. Your records should contain

  • A course name
  • Instructor's name of the course
  • Course topic

The values for the column course_id will be handled by PostgreSQL itself. The general structure of an insert query in PostgreSQL looks like

INSERT INTO table(column1, column2, …)
VALUES
 (value1, value2, …);

Let's insert some records

INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Deep Learning in Python','Dan Becker','Python');

INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Joining Data in PostgreSQL','Chester Ismay','SQL');

Note that you did not specify the primary keys explicitly. You will see its effects in a moment.

When you execute the above two queries, you should get the following result upon successful insertions

Query returned successfully: one row affected, 11 ms execution time.
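The effect of leaving out the primary key can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for a PostgreSQL server so the snippet runs anywhere, and INTEGER PRIMARY KEY AUTOINCREMENT plays the role that SERIAL PRIMARY KEY plays in the tutorial's table. The third insert, with its "Someone Else" instructor, is a hypothetical row added to show the UNIQUE constraint.

```python
import sqlite3

# SQLite stands in for PostgreSQL here; AUTOINCREMENT replaces SERIAL.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE datacamp_courses (
        course_id INTEGER PRIMARY KEY AUTOINCREMENT,
        course_name VARCHAR(50) UNIQUE NOT NULL,
        course_instructor VARCHAR(100) NOT NULL,
        topic VARCHAR(20) NOT NULL
    )
    """
)

insert = (
    "INSERT INTO datacamp_courses (course_name, course_instructor, topic) "
    "VALUES (?, ?, ?)"
)
# No course_id is supplied; the database assigns the next value itself.
first_id = conn.execute(
    insert, ("Deep Learning in Python", "Dan Becker", "Python")
).lastrowid
second_id = conn.execute(
    insert, ("Joining Data in PostgreSQL", "Chester Ismay", "SQL")
).lastrowid

# The UNIQUE constraint on course_name rejects a duplicate course name.
try:
    conn.execute(insert, ("Deep Learning in Python", "Someone Else", "Python"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Here first_id and second_id hold the keys the database generated, which is the behaviour the tutorial relies on when it omits course_id from the INSERT statements.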

Reading/viewing the data from the table -

This is probably something you will do a lot in your data science journey. For now, let's see how the table datacamp_courses is holding up.

This is generally called a select query, and the generic structure of a select query looks like

SELECT
 column_1,
 column_2,
 ...
FROM
 table_name;

Let's select all the columns from the table datacamp_courses

SELECT * FROM datacamp_courses;

And you get

Note the primary keys now. If you want to just see the names of the courses you can do so by

SELECT course_name from datacamp_courses;

And you get

You can specify as many column names as you want to see in your results, provided they exist in the table. If you run select course_name, number_particpants from datacamp_courses; you will run into an error, as the column number_particpants does not exist in the table. You will now see how you can update a specific record in the table.
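The missing-column case can be reproduced with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL and a cut-down two-column version of the table; PostgreSQL words its error differently, but the query fails for the same reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datacamp_courses (course_id INTEGER PRIMARY KEY, course_name TEXT)"
)
conn.execute(
    "INSERT INTO datacamp_courses (course_name) VALUES ('Deep Learning in Python')"
)

# Selecting a column that exists works as expected.
names = [row[0] for row in conn.execute("SELECT course_name FROM datacamp_courses")]

# Selecting a column that is not in the table raises an error.
try:
    conn.execute("SELECT course_name, number_particpants FROM datacamp_courses")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
```

The error message names the offending column, which is usually enough to spot a typo in the column list.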

Updating a record in the table

The general structure of an update query in SQL looks like the following:

UPDATE table
SET column1 = value1,
    column2 = value2 ,...
WHERE
 condition;

You are going to update the record where course_instructor = "Chester Ismay" and set the course_name to "Joining Data in SQL". You will then verify if the record is updated. The query for doing this would be

UPDATE datacamp_courses SET course_name = 'Joining Data in SQL'
WHERE course_instructor = 'Chester Ismay';

Let's see if your update query had the intended effect by running a select query

You can see your update query performed exactly in the way you wanted. You will now see how you can delete a record from the table.
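The update-then-verify step can be sketched as follows, again assuming sqlite3 as a stand-in for PostgreSQL and a reduced version of the table. The cursor's rowcount attribute reports how many rows the WHERE clause matched, which gives a quick check before running the SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datacamp_courses ("
    "course_id INTEGER PRIMARY KEY, course_name TEXT, course_instructor TEXT)"
)
conn.executemany(
    "INSERT INTO datacamp_courses (course_name, course_instructor) VALUES (?, ?)",
    [
        ("Deep Learning in Python", "Dan Becker"),
        ("Joining Data in PostgreSQL", "Chester Ismay"),
    ],
)

# Update only the row whose instructor matches the WHERE condition.
cur = conn.execute(
    "UPDATE datacamp_courses SET course_name = ? WHERE course_instructor = ?",
    ("Joining Data in SQL", "Chester Ismay"),
)
updated_rows = cur.rowcount  # number of rows changed by the UPDATE

# Verify the result with a SELECT, as the tutorial does.
rows = conn.execute(
    "SELECT course_name, course_instructor FROM datacamp_courses ORDER BY course_id"
).fetchall()
```

Leaving out the WHERE clause would update every row in the table, so checking updated_rows is a cheap safeguard.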

Deleting a record in the table

The general structure of a delete query in SQL looks like the following:

DELETE FROM table
WHERE condition;

You are going to delete the record where course_name = "Deep Learning in Python" and then verify if the record is deleted. Following the structure, you can see that the following query should be able to do this

DELETE from datacamp_courses
WHERE course_name = 'Deep Learning in Python';

Keep in mind that the keywords are not case-sensitive in SQL, but the data is case-sensitive. This is why you see a mixture of upper case and lower case in the queries.

Let's see if the intended record was deleted from the table or not

And yes, it indeed deleted the intended record.
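The point about case sensitivity can be checked with a short sketch, assuming sqlite3 as a stand-in for PostgreSQL (its default text comparison is also case-sensitive). The keywords are written in mixed case on purpose; only the case of the data decides whether a row matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE datacamp_courses (course_id INTEGER PRIMARY KEY, course_name TEXT)"
)
conn.executemany(
    "INSERT INTO datacamp_courses (course_name) VALUES (?)",
    [("Deep Learning in Python",), ("Joining Data in SQL",)],
)

# Keyword case does not matter, but the value must match the stored text exactly.
wrong_case = conn.execute(
    "delete FROM datacamp_courses where course_name = 'deep learning in python'"
).rowcount
right_case = conn.execute(
    "DELETE from datacamp_courses WHERE course_name = 'Deep Learning in Python'"
).rowcount

# Verify which rows are left after both statements.
remaining = [row[0] for row in conn.execute("SELECT course_name FROM datacamp_courses")]
```

The first statement runs without error but deletes nothing, which is why verifying with a SELECT afterwards is worthwhile.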

The generic structures of the queries as mentioned in the tutorial are referred from postgresqltutorial.com.

You now know how to write basic CRUD queries in SQL. Some of you may use Jupyter Notebooks heavily and may be thinking it would be great if there were an option to execute these queries directly from Jupyter Notebook. In the next section, you will see how you can achieve this.

SQL + Jupyter Notebooks

To be able to run SQL queries from Jupyter Notebooks the first step will be to install the ipython-sql package.

If it is not installed, install it using:

pip install ipython-sql

Once this is done, load the sql extension in your Jupyter Notebook by executing

%load_ext sql

The next step will be to connect to a PostgreSQL database. You will connect to the database that you created, DataCamp_Courses.

For being able to connect to a database that is already created in your system, you will have to instruct Python to detect its dialect. In simpler terms, you will have to tell Python that it is a PostgreSQL database. For that, you will need psycopg2 which can be installed using:

pip install psycopg2

Once you have installed psycopg2, connect to the database using

%sql postgresql://postgres:postgres@localhost:5432/DataCamp_Courses
'Connected: postgres@DataCamp_Courses'

Note the usage of %sql. This is a magic command. It lets you execute SQL statements from Jupyter Notebook. What follows %sql is called a database connection URL where you specify

  • Dialect (postgresql)
  • Username (postgres)
  • Password (postgres)
  • Server address (localhost)
  • Port number (5432)
  • Database name (DataCamp_Courses)
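The parts of a connection URL listed above can be pulled apart with Python's standard library. This is a small sketch for illustration; the URL uses the tutorial's example values (user and password postgres on localhost), and the dictionary keys are names chosen here, not part of any library.

```python
from urllib.parse import urlsplit

# Example connection URL built from the components listed in the tutorial.
url = "postgresql://postgres:postgres@localhost:5432/DataCamp_Courses"
parts = urlsplit(url)

components = {
    "dialect": parts.scheme,                  # database dialect
    "username": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,                       # parsed as an integer
    "database": parts.path.lstrip("/"),      # path without the leading slash
}
```

Reading the URL this way makes it easier to spot which piece is wrong when a connection attempt fails.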

You can now perform everything from your Jupyter Notebook that you performed in the pgAdmin interface. Let's start by creating the table datacamp_courses with the exact same schema.

But before doing that you will have to drop the table as SQL won't let you store two tables with the same name. You can drop a table by

%sql DROP table datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
Done.

[]

The table datacamp_courses is now deleted from PostgreSQL and hence you can create a new table with this name.

%%sql
CREATE TABLE datacamp_courses(
 course_id SERIAL PRIMARY KEY,
 course_name VARCHAR (50) UNIQUE NOT NULL,
 course_instructor VARCHAR (100) NOT NULL,
 topic VARCHAR (20) NOT NULL
);
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
Done.

[]

Note the usage of %%sql here. For executing a single line of query, you can use %sql, but if you want to execute a query that spans several lines, or multiple queries in one go, you will have to use %%sql.

Let's insert some records

%%sql
INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Deep Learning in Python','Dan Becker','Python');
INSERT INTO datacamp_courses(course_name, course_instructor, topic)
VALUES('Joining Data in PostgreSQL','Chester Ismay','SQL');
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.
1 rows affected.

[]

View the table to make sure the insertions were done as expected

%%sql
select * from datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
2 rows affected.

Let's maintain the flow. As the next step, you will update a record in the table

%sql update datacamp_courses set course_name = 'Joining Data in SQL' where course_instructor = 'Chester Ismay';
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

[]

Pay close attention when you are dealing with strings in SQL. Unlike in many programming languages, where either quote style works, string values in SQL need to be wrapped in single quotes.
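Quoting becomes more delicate when the value itself contains an apostrophe. The sketch below assumes sqlite3 as a stand-in for PostgreSQL and uses made-up course and instructor names. It shows the two usual ways to handle this: doubling the apostrophe inside a literal, or passing the value through a placeholder so the driver takes care of quoting. sqlite3 uses ? as its placeholder, while psycopg2 uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datacamp_courses (course_name TEXT, course_instructor TEXT)")

# Inside a SQL string literal, an apostrophe is escaped by doubling it.
conn.execute("INSERT INTO datacamp_courses VALUES ('Intro to SQL', 'Dan O''Brien')")

# With placeholders the driver handles quoting, so the raw value is passed as-is.
conn.execute(
    "INSERT INTO datacamp_courses VALUES (?, ?)",
    ("Intro to Python", "Mia O'Neil"),
)

instructors = [
    row[0]
    for row in conn.execute(
        "SELECT course_instructor FROM datacamp_courses ORDER BY course_name"
    )
]
```

Placeholders are generally the safer choice when values come from user input, since they also guard against SQL injection.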

Let's now verify if your update query had the intended effect

%%sql
select * from datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
2 rows affected.

Let's now delete a record and verify

%%sql
delete from datacamp_courses where course_name = 'Deep Learning in Python';
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

[]

%%sql
select * from datacamp_courses;
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

By now you have a clear idea of executing CRUD operations in PostgreSQL and how you can perform them via Jupyter Notebook. If you are familiar with Python and are interested in accessing your database through your Python code, you can do that as well. The next section is all about that.

Getting started with SQLAlchemy and combining it with SQL magic commands

For this section, you will need the SQLAlchemy package. It comes with the Anaconda distribution generally. You can also pip-install it. Once you have it installed, you can import it by -

import sqlalchemy

To be able to interact with your databases using SQLAlchemy you will need to create an engine for the respective RDBMS where your databases are stored. In your case, it is PostgreSQL. SQLAlchemy lets you create an engine for the RDBMS in a single call to create_engine(), which takes a database connection URL like the one you have seen before.

from sqlalchemy import create_engine, inspect
engine = create_engine('postgresql://postgres:postgres@localhost:5432/DataCamp_Courses')
# engine.table_names() was removed in SQLAlchemy 2.0; the inspector works in 1.4 and 2.x
print(inspect(engine).get_table_names()) # Lets you see the names of the tables present in the database
['datacamp_courses']

You can see the table named datacamp_courses which further confirms that you were successful in creating the engine. Let's execute a simple select query to see the records of the table datacamp_courses and store it in a pandas DataFrame object.

You will use the read_sql() method (provided by pandas) which takes a SQL query string and an engine.

import pandas as pd

df = pd.read_sql('select * from datacamp_courses', engine)
df.head()

You can also pair up the %sql magic command within your Python code.

df_new = %sql select * from datacamp_courses
df_new.DataFrame().head()
 * postgresql://postgres:***@localhost:5432/DataCamp_Courses
1 rows affected.

What are the differences between Standard SQL and Transact-SQL?

In this article, we'll explain syntax differences between standard SQL and the Transact-SQL language dedicated to interacting with SQL Server.

#1 Names of Database Objects

In relational database systems, we name tables, views, and columns, but sometimes we need to use the same name as a keyword or use special characters. In standard SQL, you can place this kind of name in quotation marks (""), but in T-SQL, you can also place it in brackets ([]). Look at these examples for the name of a table in T-SQL:

CREATE TABLE dbo.test."first name" ( Id INT, Name VARCHAR(100));
CREATE TABLE dbo.test.[first name]  ( Id INT, Name VARCHAR(100));

Only the first delimiter (the quotation marks) for the special name is also part of the SQL standard.

What Is Different in a SELECT Statement?

#2 Returning Values

The SQL standard does not have a syntax for a query returning values or values coming from expressions without referring to any columns of a table, but MS SQL Server does allow for this type of expression. How? You can use a SELECT statement alone with an expression or with other values not coming from columns of the table. In T-SQL, it looks like the example below:

SELECT 12/6 ;

In this expression, we don’t need a table to evaluate 12 divided by 6; therefore, the FROM clause and the name of the table can be omitted.
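The same table-less SELECT can be tried without a SQL Server installation. The sketch below assumes Python's built-in sqlite3 module, which also accepts a SELECT with no FROM clause; it illustrates the idea and is not a T-SQL session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No table is involved: the expression is evaluated on its own.
value = conn.execute("SELECT 12/6").fetchone()[0]
```

Because both operands are integers, the division is an integer division and the result comes back as a Python int.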

#3 Limiting Records in a Result Set

In the SQL standard, you can limit the number of records in the results by using the syntax illustrated below:

SELECT * FROM tab FETCH FIRST 10 ROWS ONLY

T-SQL implements this syntax in a different way. The example below shows the MS SQL Server syntax:

SELECT * FROM tab ORDER BY col1 DESC OFFSET 0 ROWS FETCH FIRST 10 ROWS ONLY;

As you notice, this uses an ORDER BY clause. Another way to select rows, but without ORDER BY, is by using the TOP clause in T-SQL:

SELECT TOP 10 * FROM tab;

#4 Automatically Generating Values

The SQL standard enables you to create columns with automatically generated values. The syntax to do this is shown below:

CREATE TABLE tab (id DECIMAL GENERATED ALWAYS AS IDENTITY);

In T-SQL we can also automatically generate values, but in this way:

CREATE TABLE tab (id INTEGER IDENTITY);

#5 Math Functions

Several common mathematical functions are part of the SQL standard. One of these math functions is CEIL(x), which we don’t find under that name in T-SQL; T-SQL spells it CEILING(x). T-SQL also provides SIGN(x), ROUND(x, d) to round decimal value x to d decimal positions (the second argument is required), LOG(x) to return the natural logarithm of a value x, and RAND() to generate random numbers. T-SQL has no TRUNC(x) function; passing a non-zero third argument to ROUND, as in ROUND(x, d, 1), truncates instead of rounding. MAX and MIN are aggregate functions in both the SQL standard and T-SQL. For the highest or lowest value in a list of expressions, GREATEST(list) and LEAST(list) are available only in recent versions of SQL Server (2022 and later).

T-SQL function ROUND:

SELECT ROUND(col, 2) FROM tab;

#6 Aggregate Functions

The aggregate functions are another point of comparison. The functions COUNT, SUM, and AVG each take a column or expression as an argument. T-SQL allows the use of DISTINCT before the argument so that only distinct values are counted, summed, or averaged. The SQL standard allows DISTINCT in these functions as well, so the syntax is the same in both.

Standard SQL and T-SQL:
SELECT COUNT(col) FROM tab;

SELECT COUNT(DISTINCT col) FROM tab;
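How the three forms of COUNT differ is easiest to see on a few rows. The sketch below assumes sqlite3 as a stand-in engine and a made-up table with one duplicate value and one NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col TEXT)")
conn.executemany("INSERT INTO tab VALUES (?)", [("a",), ("a",), ("b",), (None,)])

# COUNT(*) counts rows, COUNT(col) skips NULLs, COUNT(DISTINCT col) also drops duplicates.
all_rows = conn.execute("SELECT COUNT(*) FROM tab").fetchone()[0]
non_null = conn.execute("SELECT COUNT(col) FROM tab").fetchone()[0]
distinct = conn.execute("SELECT COUNT(DISTINCT col) FROM tab").fetchone()[0]
```

The three results differ on this data, which shows why the choice of argument matters when a column contains duplicates or NULLs.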

But in T-SQL we don’t find a population covariance function: COVAR_POP(x,y), which is defined in the SQL standard.

#7 Retrieving Parts of Dates and Times

Most relational database systems deliver many functions to operate on dates and times.

In standard SQL, the EXTRACT(YEAR FROM x) function and similar functions to select parts of dates are different from the T-SQL functions like YEAR(x) or DATEPART(year, x).

There is also a difference in getting the current date and time. Standard SQL allows you to get the current date with the CURRENT_DATE function, but in MS SQL Server, there is no similar function, so we have to use the GETDATE function as an argument in the CAST function to convert to a DATE data type, as in CAST(GETDATE() AS DATE).

#8 Operating on Strings

Using functions to operate on strings is also different between the SQL standard and T-SQL. The main difference is found in removing trailing and leading spaces from a string. In standard SQL, there is the TRIM function, but in T-SQL, there are several related functions: TRIM (removing trailing and leading spaces), LTRIM (removing leading spaces), and RTRIM (removing trailing spaces).

Another very-often-used string function is SUBSTRING.

The standard SQL syntax for the SUBSTRING function looks like:

SUBSTRING(str FROM start [FOR len])

but in T-SQL, the syntax of this function looks like:

SUBSTRING(str, start, length)

Sometimes we need to join values coming from other columns and/or additional strings. Standard SQL enables the following syntax to do this:

SELECT col1 || col2 FROM tab;

As you can see, this syntax makes use of the || operator to add one string to another.

But the equivalent operator in T-SQL is the plus sign character. Look at this example:

SELECT col1 + col2  FROM tab;

In SQL Server, we can also use the CONCAT function, which concatenates a list of strings:

SELECT CONCAT(col1, str1, col2, ...)  FROM tab;
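The standard || operator can be tried out in a short sketch. It assumes sqlite3 as a stand-in engine, since SQLite implements ||, and a made-up two-column table; it does not exercise the T-SQL + operator or CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", [("Data", "Camp"), ("Post", None)])

# || joins two strings; if either side is NULL, the result is NULL.
joined = [
    row[0] for row in conn.execute("SELECT col1 || col2 FROM tab ORDER BY rowid")
]
```

The second row shows that a NULL operand makes the whole result NULL, which is worth keeping in mind when concatenating nullable columns.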

We can also repeat one character several times. Standard SQL defines the function REPEAT(str, n) to do this. Transact-SQL provides the REPLICATE function. For example:

SELECT  REPLICATE(str, x);

where x indicates how many times to repeat the string or character.

#9 Inequality Operator

When filtering records in a SELECT statement, we sometimes have to use an inequality operator. Standard SQL defines <> as this operator, while T-SQL allows for both the standard operator and the != operator:

SELECT col3 FROM tab WHERE col1 != col2;

#10 ISNULL Function

In T-SQL, we have the ability to replace NULL values coming from a column using the ISNULL function, which takes the expression to check and the replacement value. This function is specific to T-SQL and is not in the SQL standard; the standard equivalent is COALESCE, which T-SQL also supports.

SELECT ISNULL(col1, 0) FROM tab;

Which Parts of DML Syntax Are Different?

In T-SQL, the basic syntax of DELETE, UPDATE, and INSERT queries is the same as the SQL standard, but differences appear in more advanced queries. Let’s look at them.

#11 OUTPUT Keyword

The OUTPUT keyword occurs in DELETE, UPDATE, and INSERT statements. It is not defined in standard SQL.

Using T-SQL, we can see extra information returned by a query. It returns both old and new values in UPDATE or the values added using INSERT or deleted using DELETE. To see this information, we have to use the Inserted and Deleted prefixes in the OUTPUT clause of INSERT, UPDATE, and DELETE.

UPDATE tab SET col='new value'
OUTPUT Deleted.col, Inserted.col;

We see the result of changing records with the previous and new values in an updated column. The SQL standard does not support this feature.

#12 Syntax for INSERT INTO ... SELECT

Another structure of an INSERT query is INSERT INTO … SELECT. T-SQL allows you to insert data from another table into a destination table. Look at this query:

INSERT INTO tab SELECT col1,col2,... FROM tab_source;

Note that INSERT INTO ... SELECT is also part of the SQL standard and works in most relational databases, so this form is not specific to SQL Server.

#13 FROM Clause in DELETE and UPDATE

SQL Server provides extended syntax for UPDATE and DELETE with FROM clauses. You can use DELETE with FROM to use the rows from one table to remove corresponding rows in another table by referring to a primary key and a foreign key. Similarly, you can use UPDATE with FROM to update rows from one table by referring to the rows of another table using common values (primary key in one table and foreign key in second, e.g. the same city name). Here is an example:

DELETE FROM Book
FROM Author
WHERE Author.Id=Book.AuthorId AND Author.Name IS NULL;

UPDATE Book
SET Book.Price=Book.Price*0.2
FROM Author
WHERE Book.AuthorId=Author.Id AND Author.Id=12;

The SQL standard doesn’t provide this syntax.
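For comparison, the same two changes can be written without the FROM extension by filtering with a subquery, a form that most engines accept. The sketch below assumes sqlite3 as a stand-in engine; the Book and Author tables follow the example above, while the sample rows and the author name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Author (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Book (Id INTEGER PRIMARY KEY, AuthorId INTEGER, Price REAL);
    INSERT INTO Author (Id, Name) VALUES (12, 'A. Writer'), (13, NULL);
    INSERT INTO Book (Id, AuthorId, Price)
    VALUES (1, 12, 10.0), (2, 13, 20.0), (3, 12, 30.0);
    """
)

# Portable form of the DELETE ... FROM example: filter with a subquery.
conn.execute(
    "DELETE FROM Book WHERE AuthorId IN (SELECT Id FROM Author WHERE Name IS NULL)"
)

# Portable form of the UPDATE ... FROM example.
conn.execute(
    "UPDATE Book SET Price = Price * 0.2 "
    "WHERE AuthorId IN (SELECT Id FROM Author WHERE Id = 12)"
)

books = conn.execute("SELECT Id, Price FROM Book ORDER BY Id").fetchall()
```

The book whose author has no name is removed, and the remaining books by author 12 have their prices multiplied by 0.2, matching what the T-SQL statements above would do.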

#14 INSERT, UPDATE, and DELETE With JOIN

You can also use INSERT, UPDATE, and DELETE using JOIN to connect to another table. An example of this is:

DELETE ItemOrder FROM ItemOrder
JOIN Item ON ItemOrder.ItemId=Item.Id
WHERE YEAR(Item.DeliveredDate) <= 2017;

This feature is not in the SQL standard.

Summary

This article does not cover all the issues about syntax differences between the SQL standard and T-SQL using the MS SQL Server system. However, this guide helps point out some basic features characteristic only of Transact-SQL and what SQL standard syntax isn’t implemented by MS SQL Server.

Originally published on https://dzone.com