More and more companies are using PostgreSQL, one of the world’s most popular relational database management systems (RDBMs). Personally, I am a big fan for two reasons. PostgreSQL allows you to easily distribute queries across multiple nodes running in parallel and to create custom procedures, and it also supports advanced SQL and numerous window functions, aggregates, table samplings, and other cost-based query optimizations.

This post will focus on PostgreSQL joins and show you a way to connect two tables that are not related by a foreign key relationship.

As you may know, joins are used to combine data from two different tables. The way you combine them depends on the type of join you use. In order to join two or more tables, they have to be related by a key; if not, it’s not easy to join them. Theoretically, you wouldn’t run into this case, because the whole point of doing a join is to extract data that has a specific relationship between tables. If there is no relationship, why would you need to join?

Here are two of the most common cases that require joining unrelated tables:

  • • You might need to create your own connection between two tables that store related data but are missing the foreign key. This is a fairly rare scenario that usually results from poor database management; for the data analyst, it’s always useful to have some joining tricks in your toolbox in case it ever comes up.
  • • The data is going from your database to any number of visualization tools through different channels (data sources). These data channels often have similar or related data, but since they have different sources (e.g., Salesforce, Google Analytics, etc.), you might need to create a relationship between them. In my experience, this is the more common of the two scenarios.

As a data analyst, I often work with different visualization tools that are connected to a database (RedShift in this case) through data stores and sources that pipe data from the database to different dashboards (such as Tableau, Chart.io, Looker, and other applications). We often have to customize a chart to represent data from unrelated tables or even from multiple sources. Once I ran into a case where I had to join Table A and Table B that were not related; that is, they did not have related columns. Below I’ll describe my method of joining these tables by creating an ID column for each table, then doing a classic join on that ID.

My goal in joining these two unrelated tables was to get data in a very specific format for a chart within a dashboard that was connected to a data source. In this case, I couldn’t apply UNION or UNION ALL (which would be my first solution), because each of my SELECT statements had a different number of columns with different data types.

Therefore, my solution was to create a unique ID for each table inside of a subquery, and then do a self-join on that ID, as depicted in the following steps.

#sql #postgres #sql-server

How to Join Unrelated Tables using PostgreSQL
2.60 GEEK