If you write SQL, there is a strong chance you do it in pgAdmin, MySQL, BigQuery, SQL Server, or a similar tool. But sometimes you just want to use your SQL skills for quick analysis of a small or medium-sized dataset.
With `csvkit`, you can run SQL queries on your CSV files right from the command line.
[csvkit](https://csvkit.readthedocs.io/en/latest/) is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats. Once you have `csvkit` installed, you can use `csvsql` to run your SQL commands.
If you don’t have `csvkit` installed, follow the installation instructions in the csvkit documentation linked above, or, if you’re familiar with `pip`, run the following.
pip install csvkit
You can view the help text for `csvsql`, including its available options, with the command below.
csvsql -h
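Now that you are all set up, you can follow this simple structure to run your queries. Note that the SQL query must be wrapped in quotation marks and **must** be on a single line, with no line breaks. Inside the query, refer to the file as a table named after the file without its extension (for example, `FILE_NAME` for `FILE_NAME.csv`).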
csvsql --query "ENTER YOUR SQL QUERY HERE" FILE_NAME.csv
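As a rough sketch of how the skeleton fills in, assume a hypothetical file named `sales.csv` with the columns `region` and `amount` (both the file and its contents are made up for illustration). `csvsql` names the table after the file, minus the `.csv` extension, so the query selects from `sales`.

```shell
# Create a small hypothetical CSV to query (illustrative data only).
printf 'region,amount\nnorth,120\nsouth,80\nnorth,50\n' > sales.csv

# The table name is the file name without its extension: sales.csv -> sales.
# The whole query sits inside one pair of double quotes, on a single line.
csvsql --query "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region" sales.csv
```

The result is printed to the terminal as CSV, so it can be redirected to a new file or piped into other tools.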
That’s it! Follow this basic code skeleton, and you are good to go.
Make sure your working directory is the one where the CSV file is located.
Below is an example of setting the directory and getting our first SQL command up and running.
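A minimal sketch of that workflow, assuming a hypothetical folder named `csv_demo` and a hypothetical file named `employees.csv` (substitute your own directory and file):

```shell
# Move into the folder that holds the CSV (hypothetical folder; use your own).
mkdir -p csv_demo && cd csv_demo

# Stand-in data so the example runs end to end (illustrative only).
printf 'name,department\nAda,Engineering\nGrace,Research\n' > employees.csv

# First query: return every row. The table is named after the file.
csvsql --query "SELECT * FROM employees" employees.csv
```

If the file lives somewhere else, you can either `cd` there first, as above, or pass the full path to the file as the last argument.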