Relational databases like Postgres include a set of tables that describe the tables in the database. This set of metadata tables is called the **catalog**, and it contains a treasure trove of details about the database. I recently needed to write a program to automatically extract insights from database catalogs. This led me to write a simple Python module that connects to a Postgres database, gets information from the catalog, and loads that information into a Pandas dataframe for further processing in Python. This article describes the process I followed.

Introduction

One of the key facts about Postgres is that it has not one but two catalogs:

  • ANSI (information_schema): this catalog contains the common relational database metadata defined by the ANSI SQL standard. If you limit your catalog use to information_schema, your code should work with other relational databases that implement the standard.
  • PostgreSQL (pg_catalog): this catalog contains Postgres-specific metadata. If your code depends on this catalog, it will need to be updated before it can be used with other relational databases.
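
To make the difference between the two catalogs concrete, here is a minimal sketch that lists user tables from each one and loads the result into a Pandas dataframe. It is not the module described in this article. The helper names (`fetch_records`, `catalog_to_dataframe`, `list_tables`) and the choice of the psycopg2 driver are assumptions made for illustration, and the DSN you pass in is your own. The two queries use `information_schema.tables` and the `pg_catalog.pg_tables` view, both of which ship with Postgres.

```python
"""Sketch: list user tables from both Postgres catalogs (hypothetical helpers)."""

# Portable query: information_schema is defined by the SQL standard.
ANSI_TABLES_SQL = """
    SELECT table_schema, table_name
    FROM information_schema.tables
    WHERE table_type = 'BASE TABLE'
      AND table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY table_schema, table_name
"""

# Postgres-specific query: pg_tables is a view that lives in pg_catalog.
PG_TABLES_SQL = """
    SELECT schemaname, tablename, tableowner
    FROM pg_catalog.pg_tables
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
    ORDER BY schemaname, tablename
"""


def fetch_records(conn, sql):
    """Run sql on any DB-API 2.0 connection; return (column_names, rows)."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        # cursor.description holds one tuple per column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    return columns, rows


def catalog_to_dataframe(conn, sql):
    """Load the result of a catalog query into a Pandas dataframe."""
    import pandas as pd  # imported here so fetch_records works without pandas

    columns, rows = fetch_records(conn, sql)
    return pd.DataFrame(rows, columns=columns)


def list_tables(dsn):
    """Return (ansi_df, pg_df) for the database identified by dsn.

    Assumes the psycopg2 driver is installed; dsn is supplied by the caller.
    """
    import psycopg2

    conn = psycopg2.connect(dsn)
    try:
        ansi_df = catalog_to_dataframe(conn, ANSI_TABLES_SQL)
        pg_df = catalog_to_dataframe(conn, PG_TABLES_SQL)
    finally:
        conn.close()
    return ansi_df, pg_df
```

Calling `list_tables("dbname=mydb")` (with a placeholder DSN replaced by your own) would return one dataframe per catalog. The first query should run largely unchanged on other databases that expose `information_schema`, while the second would have to be rewritten for anything other than Postgres.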


Examining the Postgres catalog with Python