Using SQL to Query CSVs in Command Line

If you write SQL, there is a strong chance you do it in pgAdmin, MySQL, BigQuery, SQL Server, or a similar tool. But there are times when you just want to use your SQL skills for quick analysis of a small or medium-sized dataset.

With csvkit, you can run SQL queries on your CSV files right from the command line.

[csvkit](https://csvkit.readthedocs.io/en/latest/) is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats. Once you have csvkit installed, you can use its csvsql tool to run your SQL commands.

1. Installation

If you don’t have csvkit installed, follow the installation instructions in the csvkit documentation or, if you’re familiar with pip, run the following.

pip install csvkit

You can view the built-in help for csvsql with the command below.

csvsql -h
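
Before moving on, it can be worth checking that the install actually put csvsql on your PATH. The snippet below is a minimal sketch of such a check; the `status` variable is just a name chosen for this illustration.

```shell
# Check whether the csvsql executable can be found on the PATH.
if command -v csvsql >/dev/null 2>&1; then
  # Print the installed csvkit version.
  csvsql --version
  status="installed"
else
  status="missing"
  echo "csvsql not found; try: pip install csvkit"
fi
```

If the message about csvsql not being found appears, the pip install most likely went into a different Python environment than the one your shell is using.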

2. Syntax

Now that you are all set up, you can follow this simple structure to run your queries. It is essential to note that the SQL query must be written in quotation marks and **must** be on a single line. No line breaks.

csvsql --query "ENTER YOUR SQL QUERY HERE" FILE_NAME.csv

That’s it! Follow this basic code skeleton, and you are good to go.
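
As a concrete instance of the skeleton, here is a small sketch. The file `sales.csv` and its contents are made up for illustration. One detail to keep in mind: csvsql names the table after the file name without its extension, so the query refers to `sales`.

```shell
# Create a tiny sample file (hypothetical data, for illustration only).
cat > sales.csv <<'EOF'
region,amount
north,120
south,80
north,45
EOF

# The table name in the query is the file name without ".csv": sales.
# The guard simply skips the query if csvkit is not installed.
if command -v csvsql >/dev/null 2>&1; then
  csvsql --query "SELECT region, SUM(amount) AS total FROM sales GROUP BY region" sales.csv
fi
```

The result is printed to the terminal as CSV, so it can be redirected to a new file or piped into another csvkit tool such as csvlook for a more readable table.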

Make sure your working directory is the one where the CSV file is located.

3. Example

Below is an example of setting the directory and getting our first SQL command up and running.
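
A minimal sketch of such a session might look like this. The directory `csv_demo` and the file `employees.csv` are hypothetical and are created here only so the commands can be run as written.

```shell
# Hypothetical working directory, created for this illustration.
mkdir -p csv_demo
cd csv_demo

# Hypothetical sample data.
cat > employees.csv <<'EOF'
name,department,salary
Ana,Engineering,95000
Ben,Marketing,62000
Chloe,Engineering,88000
Dev,Sales,57000
EOF

# Query the file; the table is called "employees" after the file name.
if command -v csvsql >/dev/null 2>&1; then
  csvsql --query "SELECT name, salary FROM employees WHERE department = 'Engineering' ORDER BY salary DESC" employees.csv
else
  echo "csvsql not found; install it with: pip install csvkit"
fi
```

In your own work you would replace the `mkdir` and `cat` steps with a `cd` into the folder that already holds your CSV file, then adjust the table and column names in the query to match it.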


towardsdatascience.com
