Databases are a key component of many websites and applications, and are at the core of how data is stored and exchanged across the internet. One of the most important aspects of database management is the practice of retrieving data from a database, whether it’s on an ad hoc basis or part of a process that’s been coded into an application. There are several ways to retrieve information from a database, but one of the most commonly used methods is to submit queries through the command line.

In relational database management systems, a query is any command used to retrieve data from a table. In Structured Query Language (SQL), queries are almost always made using the SELECT statement.

In this guide, we will discuss the basic syntax of SQL queries as well as some of the more commonly-employed functions and operators. We will also practice making SQL queries using some sample data in a MySQL database.

MySQL is an open-source relational database management system. One of the most widely deployed SQL databases, MySQL prioritizes speed, reliability, and usability. It generally follows the ANSI SQL standard, although there are a few cases where MySQL performs operations differently from the recognized standard.


In general, the commands and concepts presented in this guide can be used on any Linux-based operating system running any SQL database software. However, it was written specifically with an Ubuntu 18.04 server running MySQL in mind. To set this up, you will need the following:

With this setup in place, we can begin the tutorial.

Creating a Sample Database

Before we can begin making queries in SQL, we will first create a database and a couple of tables, then populate these tables with some sample data. This will allow you to gain some hands-on experience when you begin making queries later on.

For the sample database we’ll use throughout this guide, imagine the following scenario:

You and several of your friends all celebrate your birthdays with one another. On each occasion, the members of the group head to the local bowling alley, participate in a friendly tournament, and then everyone heads to your place where you prepare the birthday-person’s favorite meal.

Now that this tradition has been going on for a while, you’ve decided to begin tracking the records from these tournaments. Also, to make planning dinners easier, you decide to create a record of your friends’ birthdays and their favorite entrees, sides, and desserts. Rather than keep this information in a physical ledger, you decide to exercise your database skills by recording it in a MySQL database.

To begin, open up a MySQL prompt as your root MySQL user:

sudo mysql

Note: If you followed the prerequisite tutorial on Installing MySQL on Ubuntu 18.04, you may have configured your root user to authenticate using a password. In this case, you will connect to the MySQL prompt with the following command:

mysql -u root -p

Next, create the database by running:

CREATE DATABASE birthdays;

Then select this database by typing:

USE birthdays;

Next, create two tables within this database. We’ll use the first table to track your friends’ records at the bowling alley. The following command will create a table called tourneys with columns for the name of each of your friends, the number of tournaments they’ve won (wins), their all-time best score, and what size bowling shoe they wear (size):

CREATE TABLE tourneys (
name varchar(30),
wins real,
best real,
size real
);

Once you run the CREATE TABLE command and populate it with column headings, you’ll receive the following output:

Query OK, 0 rows affected (0.00 sec)

Populate the tourneys table with some sample data:

INSERT INTO tourneys (name, wins, best, size)
VALUES ('Dolly', '7', '245', '8.5'),
('Etta', '4', '283', '9'),
('Irma', '9', '266', '7'),
('Barbara', '2', '197', '7.5'),
('Gladys', '13', '273', '8');

You’ll receive an output like this:

Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0

Following this, create another table within the same database which we’ll use to store information about your friends’ favorite birthday meals. The following command creates a table named dinners with columns for the name of each of your friends, their birthdate, their favorite entree, their preferred side dish, and their favorite dessert:

CREATE TABLE dinners (
name varchar(30),
birthdate date,
entree varchar(30),
side varchar(30),
dessert varchar(30)
);

Similarly for this table, you’ll receive feedback confirming that the command ran successfully:

Query OK, 0 rows affected (0.01 sec)

Populate this table with some sample data as well:

INSERT INTO dinners (name, birthdate, entree, side, dessert)
VALUES ('Dolly', '1946-01-19', 'steak', 'salad', 'cake'),
('Etta', '1938-01-25', 'chicken', 'fries', 'ice cream'),
('Irma', '1941-02-18', 'tofu', 'fries', 'cake'),
('Barbara', '1948-12-25', 'tofu', 'salad', 'ice cream'),
('Gladys', '1944-05-28', 'steak', 'fries', 'ice cream');

Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0

Once that command completes successfully, you’re done setting up your database. Next, we’ll go over the basic command structure of SELECT queries.
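
If you’d like to experiment with the same sample data without a MySQL server, the setup above can be approximated with a short script. This is a minimal sketch that assumes Python’s built-in sqlite3 module and an in-memory SQLite database as a stand-in for MySQL; SQLite accepts the same table definitions but is a different engine, so treat it as a sandbox rather than an equivalent.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL `birthdays` database.
conn = sqlite3.connect(":memory:")

# Same table definitions as in the guide; SQLite accepts these type names.
conn.executescript("""
CREATE TABLE tourneys (
    name varchar(30),
    wins real,
    best real,
    size real
);
CREATE TABLE dinners (
    name varchar(30),
    birthdate date,
    entree varchar(30),
    side varchar(30),
    dessert varchar(30)
);
""")

tourney_rows = [
    ("Dolly", 7, 245, 8.5),
    ("Etta", 4, 283, 9),
    ("Irma", 9, 266, 7),
    ("Barbara", 2, 197, 7.5),
    ("Gladys", 13, 273, 8),
]
dinner_rows = [
    ("Dolly", "1946-01-19", "steak", "salad", "cake"),
    ("Etta", "1938-01-25", "chicken", "fries", "ice cream"),
    ("Irma", "1941-02-18", "tofu", "fries", "cake"),
    ("Barbara", "1948-12-25", "tofu", "salad", "ice cream"),
    ("Gladys", "1944-05-28", "steak", "fries", "ice cream"),
]

# executemany binds each tuple to the ? placeholders, running one INSERT per row.
conn.executemany(
    "INSERT INTO tourneys (name, wins, best, size) VALUES (?, ?, ?, ?)",
    tourney_rows,
)
conn.executemany(
    "INSERT INTO dinners (name, birthdate, entree, side, dessert) VALUES (?, ?, ?, ?, ?)",
    dinner_rows,
)
conn.commit()

# Confirm that both tables hold the five sample rows.
tourney_count = conn.execute("SELECT COUNT(*) FROM tourneys").fetchone()[0]
dinner_count = conn.execute("SELECT COUNT(*) FROM dinners").fetchone()[0]
print(tourney_count, dinner_count)
```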

Understanding SELECT Statements

As mentioned in the introduction, SQL queries almost always begin with the SELECT statement. SELECT is used in queries to specify which columns from a table should be returned in the result set. Queries also almost always include FROM, which is used to specify which table the statement will query.

Generally, SQL queries follow this syntax:

SELECT column_to_select FROM table_to_select WHERE certain_conditions_apply;

By way of example, the following statement will return the entire name column from the dinners table:

SELECT name FROM dinners;

| name |
| Dolly |
| Etta |
| Irma |
| Barbara |
| Gladys |
5 rows in set (0.00 sec)

You can select multiple columns from the same table by separating their names with a comma, like this:

SELECT name, birthdate FROM dinners;

| name | birthdate |
| Dolly | 1946-01-19 |
| Etta | 1938-01-25 |
| Irma | 1941-02-18 |
| Barbara | 1948-12-25 |
| Gladys | 1944-05-28 |
5 rows in set (0.00 sec)

Instead of naming a specific column or set of columns, you can follow the SELECT keyword with an asterisk (*), which serves as a placeholder representing all the columns in a table. The following command returns every column from the tourneys table:

SELECT * FROM tourneys;

| name | wins | best | size |
| Dolly | 7 | 245 | 8.5 |
| Etta | 4 | 283 | 9 |
| Irma | 9 | 266 | 7 |
| Barbara | 2 | 197 | 7.5 |
| Gladys | 13 | 273 | 8 |
5 rows in set (0.00 sec)

WHERE is used in queries to filter records that meet a specified condition, and any rows that do not meet that condition are eliminated from the result. A WHERE clause typically follows this syntax:

. . . WHERE column_name comparison_operator value

The comparison operator in a WHERE clause defines how the specified column should be compared against the value. Here are some common SQL comparison operators:

| Operator | What it does |
| = | tests for equality |
| != | tests for inequality |
| < | tests for less-than |
| > | tests for greater-than |
| <= | tests for less-than or equal-to |
| >= | tests for greater-than or equal-to |
| BETWEEN | tests whether a value lies within a given range |
| IN | tests whether a row’s value is contained in a set of specified values |
| EXISTS | tests whether rows exist, given the specified conditions |
| LIKE | tests whether a value matches a specified string |
| IS NULL | tests for NULL values |
| IS NOT NULL | tests for all values other than NULL |

For example, if you wanted to find Irma’s shoe size, you could use the following query:

SELECT size FROM tourneys WHERE name = 'Irma';

| size |
| 7 |
1 row in set (0.00 sec)
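
To see a few of the other comparison operators in action, here is a small sketch. It assumes Python’s built-in sqlite3 module with an in-memory SQLite database in place of MySQL, and it recreates only the tourneys table; the variable names are made up for this illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; only tourneys is recreated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), wins real, best real, size real)")
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?, ?, ?)",
    [("Dolly", 7, 245, 8.5), ("Etta", 4, 283, 9), ("Irma", 9, 266, 7),
     ("Barbara", 2, 197, 7.5), ("Gladys", 13, 273, 8)],
)

# = : exact match on a single value.
irma_size = conn.execute("SELECT size FROM tourneys WHERE name = 'Irma'").fetchone()[0]

# <= : friends with four or fewer wins.
few_wins = [row[0] for row in conn.execute(
    "SELECT name FROM tourneys WHERE wins <= 4 ORDER BY name")]

# BETWEEN : best scores inside an inclusive range.
mid_scores = [row[0] for row in conn.execute(
    "SELECT name FROM tourneys WHERE best BETWEEN 250 AND 280 ORDER BY name")]

# IN : shoe size matches any value in the listed set.
small_sizes = [row[0] for row in conn.execute(
    "SELECT name FROM tourneys WHERE size IN (7, 7.5) ORDER BY name")]

print(irma_size, few_wins, mid_scores, small_sizes)
```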

SQL allows the use of wildcard characters, and these are especially handy when used in WHERE clauses. Percentage signs (%) represent zero or more unknown characters, and underscores (_) represent a single unknown character. These are useful if you’re trying to find a specific entry in a table, but aren’t sure of what that entry is exactly. To illustrate, let’s say that you’ve forgotten the favorite entree of a few of your friends, but you’re certain this particular entree starts with a “t.” You could find its name by running the following query:

SELECT entree FROM dinners WHERE entree LIKE 't%';

| entree |
| tofu |
| tofu |
2 rows in set (0.00 sec)

Based on the output above, we see that the entree we have forgotten is tofu.
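
The two wildcards behave differently, which a quick experiment makes visible. The sketch below assumes Python’s built-in sqlite3 module and an in-memory SQLite database as a stand-in for MySQL, with a trimmed dinners table holding only the name and entree columns.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dinners is trimmed to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dinners (name varchar(30), entree varchar(30))")
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?)",
    [("Dolly", "steak"), ("Etta", "chicken"), ("Irma", "tofu"),
     ("Barbara", "tofu"), ("Gladys", "steak")],
)

# % matches zero or more characters: any entree starting with "t".
t_entrees = [row[0] for row in conn.execute(
    "SELECT DISTINCT entree FROM dinners WHERE entree LIKE 't%'")]

# _ matches exactly one character: four-letter names ending in "tta".
tta_names = [row[0] for row in conn.execute(
    "SELECT name FROM dinners WHERE name LIKE '_tta'")]

# % at the front: names of any length that end in "a".
ends_with_a = [row[0] for row in conn.execute(
    "SELECT name FROM dinners WHERE name LIKE '%a' ORDER BY name")]

print(t_entrees, tta_names, ends_with_a)
```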

There may be times when you’re working with databases that have columns or tables with relatively long or difficult-to-read names. In these cases, you can make these names more readable by creating an alias with the AS keyword. Aliases created with AS are temporary, and only exist for the duration of the query for which they’re created:

SELECT name AS n, birthdate AS b, dessert AS d FROM dinners;

| n | b | d |
| Dolly | 1946-01-19 | cake |
| Etta | 1938-01-25 | ice cream |
| Irma | 1941-02-18 | cake |
| Barbara | 1948-12-25 | ice cream |
| Gladys | 1944-05-28 | ice cream |
5 rows in set (0.00 sec)

Here, we have told SQL to display the name column as n, the birthdate column as b, and the dessert column as d.

The examples we’ve gone through up to this point include some of the more frequently-used keywords and clauses in SQL queries. These are useful for basic queries, but they aren’t helpful if you’re trying to perform a calculation or derive a scalar value (a single value, as opposed to a set of multiple different values) based on your data. This is where aggregate functions come into play.

Aggregate Functions

Oftentimes, when working with data, you don’t necessarily want to see the data itself. Rather, you want information about the data. The SQL syntax includes a number of functions that allow you to interpret or run calculations on your data just by issuing a SELECT query. These are known as aggregate functions.

The COUNT function counts and returns the number of rows that match certain criteria. For example, if you’d like to know how many of your friends prefer tofu for their birthday entree, you could issue this query:

SELECT COUNT(entree) FROM dinners WHERE entree = 'tofu';

| COUNT(entree) |
| 2 |
1 row in set (0.00 sec)

The AVG function returns the average (mean) value of a column. Using our example table, you could find the average best score amongst your friends with this query:

SELECT AVG(best) FROM tourneys;

| AVG(best) |
| 252.8 |
1 row in set (0.00 sec)

SUM is used to find the total sum of a given column. For instance, if you’d like to see how many tournaments your friends have won in total over the years, you could run this query:

SELECT SUM(wins) FROM tourneys;

| SUM(wins) |
| 35 |
1 row in set (0.00 sec)

Note that the AVG and SUM functions will only work correctly when used with numeric data. If you try to use them on non-numerical data, it will result in either an error or just 0, depending on which RDBMS you’re using:

SELECT SUM(entree) FROM dinners;

| SUM(entree) |
| 0 |
1 row in set, 5 warnings (0.00 sec)

MIN is used to find the smallest value within a specified column. You could use this query to see what the worst overall bowling record is so far (in terms of number of wins):

SELECT MIN(wins) FROM tourneys;

| MIN(wins) |
| 2 |
1 row in set (0.00 sec)

Similarly, MAX is used to find the largest value in a given column. The following query will show the best overall bowling record:

SELECT MAX(wins) FROM tourneys;

| MAX(wins) |
| 13 |
1 row in set (0.00 sec)

Unlike SUM and AVG, the MIN and MAX functions can be used for both numeric and alphabetic data types. When run on a column containing string values, the MIN function will show the first value alphabetically:

SELECT MIN(name) FROM dinners;

| MIN(name) |
| Barbara |
1 row in set (0.00 sec)

Likewise, when run on a column containing string values, the MAX function will show the last value alphabetically:

SELECT MAX(name) FROM dinners;

| MAX(name) |
| Irma |
1 row in set (0.00 sec)
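
Several aggregate functions can also be requested in a single SELECT. The following sketch assumes Python’s built-in sqlite3 module with an in-memory SQLite database standing in for MySQL, and it runs every aggregate against the tourneys table only, so MIN(name) and MAX(name) are taken from that table rather than from dinners.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; only tourneys is recreated.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), wins real, best real, size real)")
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?, ?, ?)",
    [("Dolly", 7, 245, 8.5), ("Etta", 4, 283, 9), ("Irma", 9, 266, 7),
     ("Barbara", 2, 197, 7.5), ("Gladys", 13, 273, 8)],
)

# Each aggregate collapses the whole table into one value, so one row comes back.
row = conn.execute("""
    SELECT COUNT(*), AVG(best), SUM(wins), MIN(wins), MAX(wins), MIN(name), MAX(name)
    FROM tourneys
""").fetchone()
row_count, avg_best, total_wins, fewest_wins, most_wins, first_name, last_name = row

print(row_count, avg_best, total_wins, fewest_wins, most_wins, first_name, last_name)
```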

Aggregate functions have many uses beyond what was described in this section. They’re particularly useful when used with the GROUP BY clause, which is covered in the next section along with several other query clauses that affect how result sets are sorted.

Manipulating Query Outputs

In addition to the FROM and WHERE clauses, there are several other clauses which are used to manipulate the results of a SELECT query. In this section, we will explain and provide examples for some of the more commonly-used query clauses.

One of the most frequently-used query clauses, aside from FROM and WHERE, is the GROUP BY clause. It’s typically used when you’re performing an aggregate function on one column, but in relation to matching values in another.

For example, let’s say you wanted to know how many of your friends prefer each of the three entrees you make. You could find this info with the following query:

SELECT COUNT(name), entree FROM dinners GROUP BY entree;

| COUNT(name) | entree |
| 1 | chicken |
| 2 | steak |
| 2 | tofu |
3 rows in set (0.00 sec)
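
As a rough way to try GROUP BY yourself, the sketch below assumes Python’s built-in sqlite3 module and an in-memory SQLite database in place of MySQL, using a trimmed dinners table with only the name and entree columns. The ORDER BY is added so the row order is predictable in this sandbox.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dinners is trimmed to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dinners (name varchar(30), entree varchar(30))")
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?)",
    [("Dolly", "steak"), ("Etta", "chicken"), ("Irma", "tofu"),
     ("Barbara", "tofu"), ("Gladys", "steak")],
)

# Rows sharing an entree form one group; COUNT(name) is computed per group.
entree_counts = conn.execute(
    "SELECT entree, COUNT(name) FROM dinners GROUP BY entree ORDER BY entree"
).fetchall()

for entree, count in entree_counts:
    print(entree, count)
```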

The ORDER BY clause is used to sort query results. By default, numeric values are sorted in ascending order, and text values are sorted in alphabetical order. To illustrate, the following query lists the name and birthdate columns, but sorts the results by birthdate:

SELECT name, birthdate FROM dinners ORDER BY birthdate;

| name | birthdate |
| Etta | 1938-01-25 |
| Irma | 1941-02-18 |
| Gladys | 1944-05-28 |
| Dolly | 1946-01-19 |
| Barbara | 1948-12-25 |
5 rows in set (0.00 sec)

Notice that the default behavior of ORDER BY is to sort the result set in ascending order. To reverse this and have the result set sorted in descending order, close the query with DESC:

SELECT name, birthdate FROM dinners ORDER BY birthdate DESC;

| name | birthdate |
| Barbara | 1948-12-25 |
| Dolly | 1946-01-19 |
| Gladys | 1944-05-28 |
| Irma | 1941-02-18 |
| Etta | 1938-01-25 |
5 rows in set (0.00 sec)
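
Both sort directions can be compared side by side with a short sketch. It assumes Python’s built-in sqlite3 module and an in-memory SQLite database as a stand-in for MySQL, with dinners trimmed to the name and birthdate columns. The dates sort correctly here because they are stored in year-month-day form.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dinners is trimmed to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dinners (name varchar(30), birthdate date)")
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?)",
    [("Dolly", "1946-01-19"), ("Etta", "1938-01-25"), ("Irma", "1941-02-18"),
     ("Barbara", "1948-12-25"), ("Gladys", "1944-05-28")],
)

# Ascending is the default; writing ASC explicitly is optional.
oldest_first = [row[0] for row in conn.execute(
    "SELECT name FROM dinners ORDER BY birthdate")]

# DESC reverses the sort order.
youngest_first = [row[0] for row in conn.execute(
    "SELECT name FROM dinners ORDER BY birthdate DESC")]

print(oldest_first)
print(youngest_first)
```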

As mentioned previously, the WHERE clause is used to filter results based on specific conditions. However, if you use the WHERE clause with an aggregate function, it will return an error, as is the case with the following attempt to find which sides are the favorite of at least three of your friends:

SELECT COUNT(name), side FROM dinners WHERE COUNT(name) >= 3;

ERROR 1111 (HY000): Invalid use of group function

The HAVING clause was added to SQL to provide functionality similar to that of the WHERE clause while also being compatible with aggregate functions. It’s helpful to think of the difference between these two clauses as being that WHERE applies to individual records, while HAVING applies to groups of records. For this reason, a HAVING clause is almost always paired with a GROUP BY clause that defines those groups.

The following example is another attempt to find which side dishes are the favorite of at least three of your friends, although this one will return a result without error:

SELECT COUNT(name), side FROM dinners GROUP BY side HAVING COUNT(name) >= 3;

| COUNT(name) | side |
| 3 | fries |
1 row in set (0.00 sec)
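
The contrast between WHERE and HAVING can be reproduced in a sandbox. This sketch assumes Python’s built-in sqlite3 module and an in-memory SQLite database instead of MySQL, with dinners trimmed to the name and side columns. SQLite also rejects an aggregate inside WHERE, although its error message differs from MySQL’s ERROR 1111.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dinners is trimmed to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dinners (name varchar(30), side varchar(30))")
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?)",
    [("Dolly", "salad"), ("Etta", "fries"), ("Irma", "fries"),
     ("Barbara", "salad"), ("Gladys", "fries")],
)

# WHERE is evaluated row by row, before any groups exist, so an aggregate fails here.
try:
    conn.execute("SELECT COUNT(name), side FROM dinners WHERE COUNT(name) >= 3")
    where_rejected = False
except sqlite3.Error as exc:
    where_rejected = True
    print("WHERE with an aggregate failed:", exc)

# HAVING is evaluated after GROUP BY, so it can filter on the per-group count.
popular_sides = conn.execute(
    "SELECT side, COUNT(name) FROM dinners GROUP BY side HAVING COUNT(name) >= 3"
).fetchall()
print(popular_sides)
```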

Aggregate functions are useful for summarizing the results of a particular column in a given table. However, there are many cases where it’s necessary to query the contents of more than one table. We’ll go over a few ways you can do this in the next section.

Querying Multiple Tables

More often than not, a database contains multiple tables, each holding different sets of data. SQL provides a few different ways to run a single query on multiple tables.

The JOIN clause can be used to combine rows from two or more tables in a query result. It does this by matching rows on a related column that the tables share and returning the combined rows in the output.

SELECT statements that include a JOIN clause generally follow this syntax:

SELECT table1.column1, table2.column2
FROM table1
JOIN table2 ON table1.related_column=table2.related_column;

Note that because JOIN clauses compare the contents of more than one table, the previous example specifies which table to select each column from by preceding the name of the column with the name of the table and a period. You can specify which table a column should be selected from like this for any query, although it’s not necessary when selecting from a single table, as we’ve done in the previous sections. Let’s walk through an example using our sample data.

Imagine that you wanted to buy each of your friends a pair of bowling shoes as a birthday gift. Because the information about your friends’ birthdates and shoe sizes is held in separate tables, you could query both tables separately then compare the results from each. With a JOIN clause, though, you can find all the information you want with a single query:

SELECT tourneys.name, tourneys.size, dinners.birthdate
FROM tourneys
JOIN dinners ON tourneys.name=dinners.name;

| name | size | birthdate |
| Dolly | 8.5 | 1946-01-19 |
| Etta | 9 | 1938-01-25 |
| Irma | 7 | 1941-02-18 |
| Barbara | 7.5 | 1948-12-25 |
| Gladys | 8 | 1944-05-28 |
5 rows in set (0.00 sec)

The JOIN clause used in this example, without any other arguments, is an inner JOIN clause. This means that it selects all the records that have matching values in both tables and prints them to the results set, while any records that aren’t matched are excluded. To illustrate this idea, let’s add a new row to each table that doesn’t have a corresponding entry in the other:

INSERT INTO tourneys (name, wins, best, size)
VALUES ('Bettye', '0', '193', '9');

INSERT INTO dinners (name, birthdate, entree, side, dessert)
VALUES ('Lesley', '1946-05-02', 'steak', 'salad', 'ice cream');

Then, re-run the previous SELECT statement with the JOIN clause:

SELECT tourneys.name, tourneys.size, dinners.birthdate
FROM tourneys
JOIN dinners ON tourneys.name=dinners.name;

| name | size | birthdate |
| Dolly | 8.5 | 1946-01-19 |
| Etta | 9 | 1938-01-25 |
| Irma | 7 | 1941-02-18 |
| Barbara | 7.5 | 1948-12-25 |
| Gladys | 8 | 1944-05-28 |
5 rows in set (0.00 sec)

Notice that, because the tourneys table has no entry for Lesley and the dinners table has no entry for Bettye, those records are absent from this output.
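
The inner join behavior can be checked with a small sketch. It assumes Python’s built-in sqlite3 module and an in-memory SQLite database as a stand-in for MySQL, with both tables trimmed to the columns the join needs and with the name column assumed to be the related column. Bettye and Lesley are included so the unmatched rows are present.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; tables are trimmed to the joined columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), size real)")
conn.execute("CREATE TABLE dinners (name varchar(30), birthdate date)")
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?)",
    [("Dolly", 8.5), ("Etta", 9), ("Irma", 7), ("Barbara", 7.5),
     ("Gladys", 8), ("Bettye", 9)],
)
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?)",
    [("Dolly", "1946-01-19"), ("Etta", "1938-01-25"), ("Irma", "1941-02-18"),
     ("Barbara", "1948-12-25"), ("Gladys", "1944-05-28"), ("Lesley", "1946-05-02")],
)

# Inner join: only names present in BOTH tables survive.
rows = conn.execute("""
    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    JOIN dinners ON tourneys.name = dinners.name
    ORDER BY tourneys.name
""").fetchall()
joined_names = [row[0] for row in rows]
print(joined_names)
```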

It is possible, though, to return all the records from one of the tables using an outer JOIN clause. In MySQL, outer JOIN clauses are written as either LEFT JOIN or RIGHT JOIN.

A LEFT JOIN clause returns all the records from the “left” table and only the matching records from the right table. In the context of outer joins, the left table is the one referenced by the FROM clause, and the right table is any other table referenced after the JOIN statement.

Run the previous query again, but this time use a LEFT JOIN clause:

SELECT tourneys.name, tourneys.size, dinners.birthdate
FROM tourneys
LEFT JOIN dinners ON tourneys.name=dinners.name;

This command will return every record from the left table (in this case, tourneys) even if it doesn’t have a corresponding record in the right table. Any time there isn’t a matching record from the right table, it’s returned as NULL or just a blank value, depending on your RDBMS:

| name | size | birthdate |
| Dolly | 8.5 | 1946-01-19 |
| Etta | 9 | 1938-01-25 |
| Irma | 7 | 1941-02-18 |
| Barbara | 7.5 | 1948-12-25 |
| Gladys | 8 | 1944-05-28 |
| Bettye | 9 | NULL |
6 rows in set (0.00 sec)

Now run the query again, this time with a RIGHT JOIN clause:

SELECT tourneys.name, tourneys.size, dinners.birthdate
FROM tourneys
RIGHT JOIN dinners ON tourneys.name=dinners.name;

This will return all the records from the right table (dinners). Because Lesley’s birthdate is recorded in the right table, but there is no corresponding row for her in the left table, the name and size columns will return as NULL values in that row:

| name | size | birthdate |
| Dolly | 8.5 | 1946-01-19 |
| Etta | 9 | 1938-01-25 |
| Irma | 7 | 1941-02-18 |
| Barbara | 7.5 | 1948-12-25 |
| Gladys | 8 | 1944-05-28 |
| NULL | NULL | 1946-05-02 |
6 rows in set (0.00 sec)

Note that left and right joins can be written as LEFT OUTER JOIN or RIGHT OUTER JOIN, although the OUTER part of the clause is implied. Likewise, specifying INNER JOIN will produce the same result as just writing JOIN.
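
Outer joins can be explored the same way. The sketch below assumes Python’s built-in sqlite3 module and an in-memory SQLite database in place of MySQL. Because older SQLite versions do not support RIGHT JOIN, the right join is emulated here by swapping the table order in a LEFT JOIN, which yields the same rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; tables are trimmed to the joined columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), size real)")
conn.execute("CREATE TABLE dinners (name varchar(30), birthdate date)")
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?)",
    [("Dolly", 8.5), ("Etta", 9), ("Irma", 7), ("Barbara", 7.5),
     ("Gladys", 8), ("Bettye", 9)],
)
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?)",
    [("Dolly", "1946-01-19"), ("Etta", "1938-01-25"), ("Irma", "1941-02-18"),
     ("Barbara", "1948-12-25"), ("Gladys", "1944-05-28"), ("Lesley", "1946-05-02")],
)

# LEFT JOIN: every tourneys row is kept; missing dinners values come back as None (NULL).
left_rows = conn.execute("""
    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM tourneys
    LEFT JOIN dinners ON tourneys.name = dinners.name
""").fetchall()

# Emulated RIGHT JOIN: dinners becomes the left table, so every dinners row is kept.
right_rows = conn.execute("""
    SELECT tourneys.name, tourneys.size, dinners.birthdate
    FROM dinners
    LEFT JOIN tourneys ON tourneys.name = dinners.name
""").fetchall()

print(left_rows)
print(right_rows)
```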

As an alternative to using JOIN to query records from multiple tables, you can use the UNION clause.

The UNION operator works slightly differently than a JOIN clause: instead of printing results from multiple tables as separate columns using a single SELECT statement, UNION stacks the results of two SELECT statements into a single result set with a shared set of columns.

To illustrate, run the following query:

SELECT name FROM tourneys UNION SELECT name FROM dinners;

This query will remove any duplicate entries, which is the default behavior of the UNION operator:

| name |
| Dolly |
| Etta |
| Irma |
| Barbara |
| Gladys |
| Bettye |
| Lesley |
7 rows in set (0.00 sec)

To return all entries (including duplicates) use the UNION ALL operator:

SELECT name FROM tourneys UNION ALL SELECT name FROM dinners;

| name |
| Dolly |
| Etta |
| Irma |
| Barbara |
| Gladys |
| Bettye |
| Dolly |
| Etta |
| Irma |
| Barbara |
| Gladys |
| Lesley |
12 rows in set (0.00 sec)
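
The difference in row counts between UNION and UNION ALL is easy to confirm. This sketch assumes Python’s built-in sqlite3 module and an in-memory SQLite database standing in for MySQL, with both tables reduced to just the name column. The results are sorted in Python because UNION does not promise any particular row order.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; both tables hold only the name column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30))")
conn.execute("CREATE TABLE dinners (name varchar(30))")
shared = ["Dolly", "Etta", "Irma", "Barbara", "Gladys"]
conn.executemany("INSERT INTO tourneys VALUES (?)", [(n,) for n in shared + ["Bettye"]])
conn.executemany("INSERT INTO dinners VALUES (?)", [(n,) for n in shared + ["Lesley"]])

# UNION removes duplicate rows across the two result sets.
union_names = sorted(row[0] for row in conn.execute(
    "SELECT name FROM tourneys UNION SELECT name FROM dinners"))

# UNION ALL keeps every row, including the five names found in both tables.
union_all_names = sorted(row[0] for row in conn.execute(
    "SELECT name FROM tourneys UNION ALL SELECT name FROM dinners"))

print(len(union_names), len(union_all_names))
```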

The names and number of the columns in the results table reflect the name and number of columns queried by the first SELECT statement. Note that when using UNION to query multiple columns from more than one table, each SELECT statement must query the same number of columns, the respective columns must have similar data types, and the columns in each SELECT statement must be in the same order. The following example shows what might result if you use a UNION clause on two SELECT statements that query a different number of columns:

SELECT name FROM dinners UNION SELECT name, wins FROM tourneys;

ERROR 1222 (21000): The used SELECT statements have a different number of columns

Another way to query multiple tables is through the use of subqueries. Subqueries (also known as inner or nested queries) are queries enclosed within another query. These are useful in cases where you’re trying to filter the results of a query against the result of a separate aggregate function.

To illustrate this idea, say you want to know which of your friends have won more tournaments than Barbara. Rather than querying how many tournaments Barbara has won then running another query to see who has won more than that, you can calculate both with a single query:

SELECT name, wins FROM tourneys
WHERE wins > (
SELECT wins FROM tourneys WHERE name = 'Barbara'
);

| name | wins |
| Dolly | 7 |
| Etta | 4 |
| Irma | 9 |
| Gladys | 13 |
4 rows in set (0.00 sec)
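
To make the two-step nature of this query concrete, the sketch below runs the subquery on its own first and then the combined statement. It assumes Python’s built-in sqlite3 module and an in-memory SQLite database as a stand-in for MySQL, with tourneys trimmed to the name and wins columns.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; tourneys is trimmed to two columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), wins real)")
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?)",
    [("Dolly", 7), ("Etta", 4), ("Irma", 9), ("Barbara", 2),
     ("Gladys", 13), ("Bettye", 0)],
)

# The inner query alone: a single value, Barbara's win count.
barbara_wins = conn.execute(
    "SELECT wins FROM tourneys WHERE name = 'Barbara'").fetchone()[0]

# The full statement: the outer query compares every row against that one value.
better_than_barbara = conn.execute("""
    SELECT name, wins FROM tourneys
    WHERE wins > (SELECT wins FROM tourneys WHERE name = 'Barbara')
    ORDER BY name
""").fetchall()

print(barbara_wins, better_than_barbara)
```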

The subquery in this statement was run only once; it only needed to find the value from the wins column in the same row as Barbara in the name column, and the data returned by the subquery and outer query are independent of one another. There are cases, though, where the outer query must first read every row in a table and compare those values against the data returned by the subquery in order to return the desired data. In this case, the subquery is referred to as a correlated subquery.

The following statement is an example of a correlated subquery. This query seeks to find which of your friends have won more games than is the average for those with the same shoe size:

SELECT name, size FROM tourneys AS t
WHERE wins > (
SELECT AVG(wins) FROM tourneys WHERE size = t.size
);

In order for the query to complete, it must first collect the name and size columns from the outer query. Then, it compares each row from that result set against the results of the inner query, which determines the average number of wins for individuals with identical shoe sizes. Because you only have two friends that have the same shoe size, there can only be one row in the result set:

| name | size |
| Etta | 9 |
1 row in set (0.00 sec)
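
One way to see why only Etta qualifies is to look at the per-size averages next to the correlated query. This sketch assumes Python’s built-in sqlite3 module and an in-memory SQLite database in place of MySQL; the helper query that lists averages per size is an addition for illustration only.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; tourneys is trimmed to three columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), wins real, size real)")
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?, ?)",
    [("Dolly", 7, 8.5), ("Etta", 4, 9), ("Irma", 9, 7), ("Barbara", 2, 7.5),
     ("Gladys", 13, 8), ("Bettye", 0, 9)],
)

# Illustration only: the average wins for each shoe size.
avg_by_size = dict(conn.execute(
    "SELECT size, AVG(wins) FROM tourneys GROUP BY size").fetchall())

# Correlated subquery: the inner query is re-evaluated for each outer row's t.size.
above_average = conn.execute("""
    SELECT name, size FROM tourneys AS t
    WHERE wins > (SELECT AVG(wins) FROM tourneys WHERE size = t.size)
""").fetchall()

print(avg_by_size)
print(above_average)
```

Anyone with a unique shoe size has exactly the average for that size, so the strict comparison excludes them; only size 9, shared by Etta and Bettye, can produce a match.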

As mentioned earlier, subqueries can be used to query results from multiple tables. To illustrate this with one final example, say you wanted to throw a surprise dinner for the group’s all-time best bowler. You could find which of your friends has the best bowling record and return their favorite meal with the following query:

SELECT name, entree, side, dessert
FROM dinners
WHERE name = (SELECT name FROM tourneys
WHERE wins = (SELECT MAX(wins) FROM tourneys));

| name | entree | side | dessert |
| Gladys | steak | fries | ice cream |
1 row in set (0.00 sec)
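
The nested query can be unpacked from the inside out. The sketch below assumes Python’s built-in sqlite3 module and an in-memory SQLite database as a stand-in for MySQL, with each table trimmed to the columns the query touches.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; tables are trimmed to the needed columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tourneys (name varchar(30), wins real)")
conn.execute(
    "CREATE TABLE dinners (name varchar(30), entree varchar(30), "
    "side varchar(30), dessert varchar(30))"
)
conn.executemany(
    "INSERT INTO tourneys VALUES (?, ?)",
    [("Dolly", 7), ("Etta", 4), ("Irma", 9), ("Barbara", 2), ("Gladys", 13)],
)
conn.executemany(
    "INSERT INTO dinners VALUES (?, ?, ?, ?)",
    [("Dolly", "steak", "salad", "cake"), ("Etta", "chicken", "fries", "ice cream"),
     ("Irma", "tofu", "fries", "cake"), ("Barbara", "tofu", "salad", "ice cream"),
     ("Gladys", "steak", "fries", "ice cream")],
)

# Innermost step: the highest win count.
max_wins = conn.execute("SELECT MAX(wins) FROM tourneys").fetchone()[0]

# Full statement: MAX(wins) -> matching name in tourneys -> that person's meal in dinners.
champion_meal = conn.execute("""
    SELECT name, entree, side, dessert
    FROM dinners
    WHERE name = (SELECT name FROM tourneys
                  WHERE wins = (SELECT MAX(wins) FROM tourneys))
""").fetchone()

print(max_wins, champion_meal)
```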

Notice that this statement not only includes a subquery, but also contains a subquery within that subquery.


Issuing queries is one of the most commonly-performed tasks within the realm of database management. There are a number of database administration tools, such as phpMyAdmin or pgAdmin, that allow you to perform queries and visualize the results, but issuing SELECT statements from the command line is still a widely-practiced workflow that can also provide you with greater control.

Mysql2: A Modern, Simple and Very Fast MySQL Library For Ruby

Mysql2 - A modern, simple and very fast MySQL library for Ruby - binding to libmysql

The Mysql2 gem is meant to serve the extremely common use-case of connecting, querying and iterating on results. Some database libraries out there serve as direct 1:1 mappings of the already complex C APIs available. This one is not.

It also forces the use of UTF-8 [or binary] for the connection and uses encoding-aware MySQL API calls where it can.

The API consists of three classes:

Mysql2::Client - your connection to the database.

Mysql2::Result - returned from issuing a #query on the connection. It includes Enumerable.

Mysql2::Statement - returned from issuing a #prepare on the connection. Execute the statement to get a Result.


General Instructions

gem install mysql2

This gem links against MySQL's libmysqlclient library or Connector/C library, and compatible alternatives such as MariaDB. You may need to install a package such as libmariadb-dev, libmysqlclient-dev, mysql-devel, or other appropriate package for your system. See below for system-specific instructions.

By default, the mysql2 gem will try to find a copy of MySQL in this order:

  • Option --with-mysql-dir, if provided (see below).
  • Option --with-mysql-config, if provided (see below).
  • Several typical paths for mysql_config (default for the majority of users).
  • The directory /usr/local.

Configuration options

Pass these options after a -- separator: gem install mysql2 -- [--optionA] [--optionB=argument].

--with-mysql-dir[=/path/to/mysqldir] - Specify the directory where MySQL is installed. The mysql2 gem will not use mysql_config, but will instead look at mysqldir/lib and mysqldir/include for the library and header files. This option is mutually exclusive with --with-mysql-config.

--with-mysql-config[=/path/to/mysql_config] - Specify a path to the mysql_config binary provided by your copy of MySQL. The mysql2 gem will ask this mysql_config binary about the compiler and linker arguments needed. This option is mutually exclusive with --with-mysql-dir.

--with-mysql-rpath=/path/to/mysql/lib / --without-mysql-rpath - Override the runtime path used to find the MySQL libraries. This may be needed if you deploy to a system where these libraries are located somewhere different than on your build system. This overrides any rpath calculated by default or by the options above.

--with-sanitize[=address,cfi,integer,memory,thread,undefined] - Enable sanitizers for Clang / GCC. If no argument is given, try to enable all sanitizers or fail if none are available. If a comma-separated list of specific sanitizers is given, configure will fail unless they all are available. Note that some sanitizers may incur a performance penalty, and the Address Sanitizer may require a runtime library. To see line numbers in backtraces, declare these environment variables (adjust the llvm-symbolizer path as needed for your system):

  export ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-3.4
  export ASAN_OPTIONS=symbolize=1

Linux and other Unixes

You may need to install a package such as libmariadb-dev, libmysqlclient-dev, mysql-devel, or default-libmysqlclient-dev; refer to your distribution's package guide to find the particular package. The most common issue we see is a user who has the library file but is missing the header file mysql.h -- double check that you have the -dev packages installed.

Mac OS X

You may use MacPorts, Homebrew, or a native MySQL installer package. The most common paths will be automatically searched. If you want to select a specific MySQL directory, use the --with-mysql-dir or --with-mysql-config options above.

If you have not done so already, you will need to install the Xcode command line tools by running xcode-select --install.


Windows

Make sure that you have Ruby and the DevKit compilers installed. We recommend the Ruby Installer distribution.

By default, the mysql2 gem will download and use MySQL Connector/C. If you prefer to use a local installation of Connector/C, add the flag --with-mysql-dir=c:/mysql-connector-c-x-y-z (this path may use forward slashes).

By default, the libmysql.dll library will be copied into the mysql2 gem directory. To prevent this, add the flag --no-vendor-libmysql. The mysql2 gem will search for libmysql.dll in the following paths, in order:

  • Environment variable RUBY_MYSQL2_LIBMYSQL_DLL=C:\path\to\libmysql.dll (note the Windows-style backslashes).
  • In the mysql2 gem's own directory vendor/libmysql.dll
  • In the system's default library search paths.


Connect to a database:

# this takes a hash of options, almost all of which map directly
# to the familiar database.yml in rails
client = Mysql2::Client.new(:host => "localhost", :username => "root")

Then query it:

results = client.query("SELECT * FROM users WHERE group='githubbers'")

Need to escape something first?

escaped = client.escape("gi'thu\"bbe\0r's")
results = client.query("SELECT * FROM users WHERE group='#{escaped}'")

You can get a count of your results with results.count.

Finally, iterate over the results:

results.each do |row|
  # conveniently, row is a hash
  # the keys are the fields, as you'd expect
  # the values are pre-built ruby primitives mapped from their corresponding field types in MySQL
  puts row["id"] # row["id"].is_a? Integer
  if row["dne"]  # non-existent hash entry is nil
    puts row["dne"]
  end
end

Or, you might just keep it simple:

client.query("SELECT * FROM users WHERE group='githubbers'").each do |row|
  # do something with row, it's ready to rock
end

How about with symbolized keys?

client.query("SELECT * FROM users WHERE group='githubbers'", :symbolize_keys => true).each do |row|
  # do something with row, it's ready to rock
end

You can get the headers, columns, and the field types in the order that they were returned by the query like this:

headers = results.fields # <= that's an array of field names, in order
types = results.field_types # <= that's an array of field types, in order
results.each(:as => :array) do |row|
  # Each row is an array, ordered the same as the query results
  # An otter's den is called a "holt" or "couch"
end

Prepared statements are supported, as well. In a prepared statement, use a ? in place of each value and then execute the statement to retrieve a result set. Pass your arguments to the execute method in the same number and order as the question marks in the statement. Query options can be passed as keyword arguments to the execute method.

Be sure to read about the known limitations of prepared statements at

statement = @client.prepare("SELECT * FROM users WHERE login_count = ?")
result1 = statement.execute(1)
result2 = statement.execute(2)

statement = @client.prepare("SELECT * FROM users WHERE last_login >= ? AND location LIKE ?")
result = statement.execute(1, "CA")

statement = @client.prepare("SELECT * FROM users WHERE last_login >= ? AND location LIKE ?")
result = statement.execute(1, "CA", :as => :array)
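
The positional binding described above is not specific to this gem. As a language-neutral illustration, here is a sketch that assumes Python’s built-in sqlite3 module and an in-memory SQLite database rather than Mysql2 and MySQL; the users table and its rows are hypothetical. It shows that arguments are matched to the question marks in order, and that bound values need no manual escaping.

```python
import sqlite3

# Assumption: in-memory SQLite with a hypothetical users table, not Mysql2/MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, login_count INTEGER, location TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("ann", 1, "CA"), ("bob", 2, "NY"), ("cy", 2, "CA")],
)

# Two ? placeholders: the first argument binds to login_count, the second to location.
sql = "SELECT name FROM users WHERE login_count >= ? AND location LIKE ? ORDER BY name"
result1 = [row[0] for row in conn.execute(sql, (1, "CA"))]
result2 = [row[0] for row in conn.execute(sql, (2, "%"))]

# A value containing quote characters is bound as data, so no escaping step is needed.
tricky = "gi'thu\"bbers"
conn.execute("INSERT INTO users VALUES (?, ?, ?)", (tricky, 1, "CA"))
found = conn.execute("SELECT COUNT(*) FROM users WHERE name = ?", (tricky,)).fetchone()[0]

print(result1, result2, found)
```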

Session Tracking information can be accessed with

c = Mysql2::Client.new(
  host: "",
  username: "root",
  flags: "SESSION_TRACK",
  init_command: "SET @@SESSION.session_track_schema=ON"
)
c.query("INSERT INTO test VALUES (1)")
session_track_type = Mysql2::Client::SESSION_TRACK_SCHEMA
session_track_data = c.session_track(session_track_type)

The available session track types are listed in the MySQL documentation on session state tracking.

Connection options

You may set the following connection options in Mysql2::Client.new(...):

Mysql2::Client.new(
  :socket => '/path/to/mysql.sock',
  :encoding => 'utf8',
  :read_timeout => seconds,
  :write_timeout => seconds,
  :connect_timeout => seconds,
  :connect_attrs => {:program_name => $PROGRAM_NAME, ...},
  :reconnect => true/false,
  :local_infile => true/false,
  :secure_auth => true/false,
  :ssl_mode => :disabled / :preferred / :required / :verify_ca / :verify_identity,
  :default_file => '/path/to/my.cfg',
  :default_group => 'my.cfg section',
  :default_auth => 'authentication_windows_client',
  :init_command => sql
)

Connecting to MySQL on localhost and elsewhere

The underlying MySQL client library uses the :host parameter to determine the type of connection to make, with special interpretation you should be aware of:

  • An empty value or "localhost" will attempt a local connection:
    • On Unix, connect to the default local socket path. (To set a custom socket path, use the :socket parameter).
    • On Windows, connect using a shared-memory connection, if enabled, or TCP.
  • A value of "." on Windows specifies a named-pipe connection.
  • An IPv4 or IPv6 address will result in a TCP connection.
  • Any other value will be looked up as a hostname for a TCP connection.
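
The rules above can be summarised as a small classifier. This is only an illustration of the list, not how the MySQL client library is implemented; the method name connection_kind and the returned symbols are hypothetical:

```ruby
require "ipaddr"

# Illustrative only: mirrors the :host rules listed above.
# The method name and the returned symbols are made up for this sketch.
def connection_kind(host, windows: false)
  # An empty value or "localhost" means a local connection
  if host.nil? || host.empty? || host == "localhost"
    return windows ? :shared_memory_or_tcp : :unix_socket
  end

  # "." selects a named pipe, but only on Windows
  return :named_pipe if windows && host == "."

  begin
    IPAddr.new(host) # raises unless host is a literal IPv4 or IPv6 address
    :tcp_ip_address
  rescue IPAddr::Error
    :tcp_hostname_lookup
  end
end

puts connection_kind("localhost")
puts connection_kind("192.0.2.10")
puts connection_kind("db.example.com")
```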

SSL options

Setting any of the following options will enable an SSL connection, but only if your MySQL client library and server have been compiled with SSL support. MySQL client library defaults will be used for any parameters that are left out or set to nil. Relative paths are allowed, and may be required by managed hosting providers such as Heroku. Set :sslverify => true to require that the server presents a valid certificate.

Mysql2::Client.new(
  # ...options as above...,
  :sslkey => '/path/to/client-key.pem',
  :sslcert => '/path/to/client-cert.pem',
  :sslca => '/path/to/ca-cert.pem',
  :sslcapath => '/path/to/cacerts',
  :sslcipher => 'DHE-RSA-AES256-SHA',
  :sslverify => true
)

Secure auth

Starting with MySQL 5.6.5, secure_auth is enabled by default on servers (it was disabled by default prior to this). When secure_auth is enabled, the server will refuse a connection if the account password is stored in the old pre-MySQL 4.1 format. The MySQL 5.6.5 client library may also refuse to attempt a connection if provided an older format password. To bypass this restriction in the client, pass the option :secure_auth => false to Mysql2::Client.new.

Flags option parsing

The :flags parameter accepts an integer, a string, or an array. The integer form allows the client to assemble flags from constants defined under Mysql2::Client such as Mysql2::Client::FOUND_ROWS. Use a bitwise | (OR) to specify several flags.

The string form will be split on whitespace and parsed as with the array form: Plain flags are added to the default flags, while flags prefixed with - (minus) are removed from the default flags.
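
To make the string and array forms concrete, here is a minimal sketch of the add/remove rule described above. The bit values in FLAG_BITS and the method name resolve_flags are hypothetical; the real constants live under Mysql2::Client, and this is not the gem's own parsing code:

```ruby
# Hypothetical bit values, for illustration only
FLAG_BITS = {
  "FOUND_ROWS"       => 0b001,
  "COMPRESS"         => 0b010,
  "MULTI_STATEMENTS" => 0b100
}.freeze

# Applies string or array flags on top of a default bitmask:
# plain names are added, names prefixed with "-" are removed.
def resolve_flags(flags, defaults)
  names = flags.is_a?(String) ? flags.split : Array(flags)
  names.reduce(defaults) do |bits, name|
    if name.start_with?("-")
      bits & ~FLAG_BITS.fetch(name[1..-1])
    else
      bits | FLAG_BITS.fetch(name)
    end
  end
end

defaults = FLAG_BITS["FOUND_ROWS"] | FLAG_BITS["COMPRESS"]
puts resolve_flags("-COMPRESS MULTI_STATEMENTS", defaults).to_s(2)
```

The string and array forms give the same result here because the string is simply split on whitespace first.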

Using Active Record's database.yml

Active Record typically reads its configuration from a file named database.yml or an environment variable DATABASE_URL. Use the value mysql2 as the adapter name. For example:

development:
  adapter: mysql2
  encoding: utf8
  database: my_db_name
  username: root
  password: my_password
  port: 3306
  flags:
    - -COMPRESS
  secure_auth: false

In this example, the compression flag is negated with -COMPRESS.

Using Active Record's DATABASE_URL

Active Record typically reads its configuration from a file named database.yml or an environment variable DATABASE_URL. Use the value mysql2 as the protocol name. For example:
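
As a rough illustration of what such a URL carries, the sketch below takes one apart with Ruby's standard URI library. It assumes the common scheme://user:password@host:port/database?option=value layout; the credentials, host name and option are hypothetical:

```ruby
require "uri"

# Hypothetical credentials, host and option, for illustration only
url = "mysql2://app_user:app_pass@db.example.com:3306/my_db_name?encoding=utf8"
uri = URI.parse(url)

database = uri.path.sub(%r{\A/}, "")             # path without the leading slash
options  = URI.decode_www_form(uri.query).to_h   # query string as a hash

puts uri.scheme   # the protocol (adapter) name
puts uri.user
puts uri.host
puts uri.port
puts database
puts options.inspect
```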


Reading a MySQL config file

You may read configuration options from a MySQL configuration file by passing the :default_file and :default_group parameters. For example:

Mysql2::Client.new(:default_file => '/user/.my.cnf', :default_group => 'client')

Initial command on connect and reconnect

If you specify the :init_command option, the SQL string you provide will be executed after the connection is established. If :reconnect is set to true, init_command will also be executed after a successful reconnect. It is useful if you want to provide session options which survive reconnection.

Mysql2::Client.new(:init_command => "SET @@SESSION.sql_mode = 'STRICT_ALL_TABLES'")

Multiple result sets

You can also retrieve multiple result sets. For this to work you need to connect with flags Mysql2::Client::MULTI_STATEMENTS. Multiple result sets can be used with stored procedures that return more than one result set, and for bundling several SQL statements into a single call to client.query.

client = Mysql2::Client.new(:host => "localhost", :username => "root", :flags => Mysql2::Client::MULTI_STATEMENTS)
result = client.query('CALL sp_customer_list( 25, 10 )')
# result now contains the first result set
while client.next_result
  result = client.store_result
  # result now contains the next result set
end

Repeated calls to client.next_result will return true, false, or raise an exception if the respective query erred. When client.next_result returns true, call client.store_result to retrieve a result object. Exceptions are not raised until client.next_result is called to find the status of the respective query. Subsequent queries are not executed if an earlier query raised an exception. Subsequent calls to client.next_result will return false.

result = client.query('SELECT 1; SELECT 2; SELECT A; SELECT 3')
p result.first

while client.next_result
  result = client.store_result
  p result.first
end

The third statement is invalid, so client.next_result raises:

next_result: Unknown column 'A' in 'field list' (Mysql2::Error)
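
If you want all result sets in one place, the loop can be wrapped in a small helper. This is a sketch: collect_result_sets is a hypothetical name, and FakeClient is a stand-in that only mimics query, next_result and store_result so the example runs without a server. With a real connection you would pass a Mysql2::Client created with the MULTI_STATEMENTS flag instead.

```ruby
# Collects every result set produced by one multi-statement query.
# `client` is anything responding to query, next_result and store_result.
def collect_result_sets(client, sql)
  sets = [client.query(sql)]
  sets << client.store_result while client.next_result
  sets
end

# Hypothetical stand-in for a connection; it hands back canned result sets
class FakeClient
  def initialize(sets)
    @sets = sets
  end

  def query(_sql)
    @sets.shift
  end

  def next_result
    !@sets.empty?
  end

  def store_result
    @sets.shift
  end
end

client = FakeClient.new([[{ "1" => 1 }], [{ "2" => 2 }]])
p collect_result_sets(client, "SELECT 1; SELECT 2")
```

The helper does not rescue anything, so an exception raised by next_result for a failing statement propagates to the caller.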

Cascading config

The default config hash is at:

Mysql2::Client.default_query_options

which defaults to:

{:async => false, :as => :hash, :symbolize_keys => false}

It can be used like so:

# these are the defaults all Mysql2::Client instances inherit
Mysql2::Client.default_query_options.merge!(:as => :array)


# this will change the defaults for all future results returned by the #query method _for this connection only_
c = Mysql2::Client.new
c.query_options.merge!(:symbolize_keys => true)


# this will set the options for the Mysql2::Result instance returned from the #query method
c = Mysql2::Client.new
c.query(sql, :symbolize_keys => true)


# this will set the options for the Mysql2::Result instance returned from the #execute method
c = Mysql2::Client.new
s = c.prepare(sql)
s.execute(arg1, arg2, :symbolize_keys => true)
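
One way to picture the cascade is as a chain of hash merges in which the more specific level wins. The sketch below uses plain hashes and assumes that precedence; it illustrates the idea and is not the gem's internal code:

```ruby
# The default hash quoted above
defaults   = { :async => false, :as => :hash, :symbolize_keys => false }
# Options set for one connection, e.g. via c.query_options.merge!
connection = { :as => :array }
# Options passed to a single call, e.g. c.query(sql, :symbolize_keys => true)
per_query  = { :symbolize_keys => true }

# Later (more specific) levels override earlier ones
effective = defaults.merge(connection).merge(per_query)

p effective
```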

Result types

Array of Arrays

Pass the :as => :array option to any of the above methods of configuration

Array of Hashes

The default result type is set to :hash, but you can override a previous setting to something else with :as => :hash


Timezones

Mysql2 now supports two timezone options:

:database_timezone # this is the timezone Mysql2 will assume fields are already stored as, and will use this when creating the initial Time objects in ruby
:application_timezone # this is the timezone Mysql2 will convert to before finally handing back to the caller

In other words, if :database_timezone is set to :utc - Mysql2 will create the Time objects using Time.utc(...) from the raw value libmysql hands over initially. Then, if :application_timezone is set to say - :local - Mysql2 will then convert the just-created UTC Time object to local time.

Both options only allow two values - :local or :utc - with the exception that :application_timezone can be [and defaults to] nil
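
The two steps can be reproduced with plain Ruby Time objects. This sketch assumes :database_timezone => :utc and :application_timezone => :local; the date and time components are made up:

```ruby
# Hypothetical raw value: year, month, day, hour, minute, second
raw = [2024, 1, 15, 12, 30, 0]

# Step 1: with :database_timezone => :utc the raw value is read as UTC
stored = Time.utc(*raw)

# Step 2: with :application_timezone => :local the same instant is
# converted to the local zone before being handed back
handed_back = stored.getlocal

puts stored
puts handed_back
```

Both objects describe the same instant; only the zone they are expressed in differs.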

Casting "boolean" columns

You can now tell Mysql2 to cast tinyint(1) fields to boolean values in Ruby with the :cast_booleans option.

client = Mysql2::Client.new
result = client.query("SELECT * FROM table_with_boolean_field", :cast_booleans => true)

Keep in mind that this works only with fields and not with computed values, e.g. this result will contain 1, not true:

client = Mysql2::Client.new
result = client.query("SELECT true", :cast_booleans => true)

CAST function wouldn't help here as there's no way to cast to TINYINT(1). Apparently the only way to solve this is to use a stored procedure with return type set to TINYINT(1).

Skipping casting

Mysql2 casting is fast, but not as fast as not casting data. In rare cases where typecasting is not needed, it will be faster to disable it by providing :cast => false. (Note that :cast => false overrides :cast_booleans => true.)

client = Mysql2::Client.new
result = client.query("SELECT * FROM table", :cast => false)

Here are the results from the query_without_mysql_casting.rb script in the benchmarks folder:

                           user     system      total        real
Mysql2 (cast: true)    0.340000   0.000000   0.340000 (  0.405018)
Mysql2 (cast: false)   0.160000   0.010000   0.170000 (  0.209937)
Mysql                  0.080000   0.000000   0.080000 (  0.129355)
do_mysql               0.520000   0.010000   0.530000 (  0.574619)

Although Mysql2 performs reasonably well at retrieving uncasted data, it (currently) is not as fast as the Mysql gem. In spite of this small disadvantage, Mysql2 still sports a friendlier interface and doesn't block the entire ruby process when querying.


Async

NOTE: Not supported on Windows.

Mysql2::Client takes advantage of the MySQL C API's (undocumented) non-blocking function mysql_send_query for all queries. But, in order to take full advantage of it in your Ruby code, you can do:

client.query("SELECT sleep(5)", :async => true)

Which will return nil immediately. At this point you'll probably want to use some socket monitoring mechanism like EventMachine or even IO.select. Once the socket becomes readable, you can do:

# result will be a Mysql2::Result instance
result = client.async_result

NOTE: Because of the way MySQL's query API works, this method will block until the result is ready. So if you really need things to stay async, it's best to just monitor the socket with something like EventMachine. If you need multiple query concurrency take a look at using a connection pool.

Row Caching

By default, Mysql2 will cache rows that have been created in Ruby (since this happens lazily). This is especially helpful since it saves the cost of creating the row in Ruby if you were to iterate over the collection again.

If you only plan on using each row once, then it's much more efficient to disable this behavior by setting the :cache_rows option to false. This would be helpful if you wanted to iterate over the results in a streaming manner, meaning the GC would clean up rows you don't need anymore as you're iterating over the result set.


Streaming

Mysql2::Client can optionally only fetch rows from the server on demand by setting :stream => true. This is handy when handling very large result sets which might not fit in memory on the client.

result = client.query("SELECT * FROM really_big_Table", :stream => true)

There are a few things that need to be kept in mind while using streaming:

  • :cache_rows is ignored currently. (if you want to use :cache_rows you probably don't want to be using :stream)
  • You must fetch all rows in the result set of your query before you can make new queries. (i.e. with Mysql2::Result#each)

Read more about the consequences of using mysql_use_result (what streaming is implemented with) in the MySQL C API documentation.

Lazy Everything

Well... almost ;)

Field name strings/symbols are shared across all the rows so only one object is ever created to represent the field name for an entire dataset.

Rows themselves are lazily created in ruby-land when an attempt to yield it is made via #each. For example, if you were to yield 4 rows from a 100 row dataset, only 4 hashes will be created. The rest will sit and wait in C-land until you want them (or when the GC goes to cleanup your Mysql2::Result instance). Now say you were to iterate over that same collection again, this time yielding 15 rows - the 4 previous rows that had already been turned into ruby hashes would be pulled from an internal cache, then 11 more would be created and stored in that cache. Once the entire dataset has been converted into ruby objects, Mysql2::Result will free the Mysql C result object as it's no longer needed.

This caching behavior can be disabled by setting the :cache_rows option to false.
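
The 4-then-15 example above can be modelled in a few lines of Ruby. LazyRows is a hypothetical stand-in for a result set, with string values playing the part of raw rows held in C-land; it only illustrates the caching idea and is not how Mysql2::Result is written:

```ruby
# Hypothetical model of a result set: raw rows are converted only when yielded
class LazyRows
  attr_reader :conversions

  def initialize(raw_rows, cache_rows: true)
    @raw_rows = raw_rows
    @cache_rows = cache_rows
    @cache = []
    @conversions = 0
  end

  # Yields up to `limit` rows, reusing converted rows when caching is on
  def each(limit = @raw_rows.size)
    limit.times do |i|
      row = @cache[i] if @cache_rows
      if row.nil?
        row = convert(@raw_rows[i])
        @cache[i] = row if @cache_rows
      end
      yield row
    end
  end

  private

  # Stands in for the cost of building a Ruby hash from a raw row
  def convert(raw)
    @conversions += 1
    { "id" => Integer(raw) }
  end
end

rows = LazyRows.new((1..100).map(&:to_s))
rows.each(4)  { |_row| }  # converts 4 rows
rows.each(15) { |_row| }  # reuses those 4, converts 11 more
puts rows.conversions
```

With cache_rows: false every pass converts its rows again, which trades repeated conversion work for not holding the converted rows in memory.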

As for field values themselves, I'm workin on it - but expect that soon.


Compatibility

This gem is tested with the following Ruby versions on Linux and Mac OS X:

  • Ruby MRI 2.0.0, 2.1.x, 2.2.x, 2.3.x, 2.4.x, 2.5.x, 2.6.x
  • Rubinius 2.x and 3.x do work but may fail under some workloads

This gem is tested with the following MySQL and MariaDB versions:

  • MySQL 5.5, 5.6, 5.7, 8.0
  • MySQL Connector/C 6.0 and 6.1 (primarily on Windows)
  • MariaDB 5.5, 10.0, 10.1, 10.2, 10.3

Ruby on Rails / Active Record

  • mysql2 0.5.x works with Rails / Active Record 4.2.11, 5.0.7, 5.1.6, and higher.
  • mysql2 0.4.x works with Rails / Active Record 4.2.5 - 5.0 and higher.
  • mysql2 0.3.x works with Rails / Active Record 3.1, 3.2, 4.x, 5.0.
  • mysql2 0.2.x works with Rails / Active Record 2.3 - 3.0.

Asynchronous Active Record

Please see the em-synchrony project for details about using EventMachine with mysql2 and Rails.


Sequel

Sequel includes a mysql2 adapter in all releases since 3.15 (2010-09-01). Use the prefix "mysql2://" in your connection specification.


EventMachine

The mysql2 EventMachine deferrable API allows you to make async queries using EventMachine, while specifying callbacks for success or failure. Here's a simple example:

require 'mysql2/em'

EM.run do
  client1 = Mysql2::EM::Client.new
  defer1 = client1.query "SELECT sleep(3) as first_query"
  defer1.callback do |result|
    puts "Result: #{result.to_a.inspect}"
  end

  client2 = Mysql2::EM::Client.new
  defer2 = client2.query "SELECT sleep(1) second_query"
  defer2.callback do |result|
    puts "Result: #{result.to_a.inspect}"
  end
end

Benchmarks and Comparison

The mysql2 gem converts MySQL field types to Ruby data types in C code, providing a serious speed benefit.

The do_mysql gem also converts MySQL fields types, but has a considerably more complex API and is still ~2x slower than mysql2.

The mysql gem returns only nil or string data types, leaving you to convert field values to Ruby types in Ruby-land, which is much slower than mysql2's C code.

For a comparative benchmark, the script performs a basic "SELECT * FROM" query on a table with 30k rows and fields of nearly every Ruby-representable data type, then iterates over every row using an #each-like method yielding a block:

         user       system     total       real
Mysql2   0.750000   0.180000   0.930000   (1.821655)
do_mysql 1.650000   0.200000   1.850000   (2.811357)
Mysql    7.500000   0.210000   7.710000   (8.065871)

These results are from the query_with_mysql_casting.rb script in the benchmarks folder.


Development

Use 'bundle install' to install the necessary development and testing gems:

bundle install

The tests require the "test" database to exist, and expect to connect both as root and the running user, both with a blank password:

CREATE USER '<user>'@'localhost' IDENTIFIED BY '';
GRANT ALL PRIVILEGES ON test.* TO '<user>'@'localhost';

You can change these defaults in the spec/configuration.yml which is generated automatically when you run rake (or explicitly rake spec/configuration.yml).

For a normal installation on a Mac, you most likely do not need to do anything, though.

Special Thanks

  • Eric Wong - for the contribution (and the informative explanations) of some thread-safety, non-blocking I/O and cleanup patches. You rock dude
  • Yury Korolev - for TONS of help testing the Active Record adapter
  • Aaron Patterson - tons of contributions, suggestions and general badassness
  • Mike Perham - Async Active Record adapter (uses Fibers and EventMachine)
  • Aaron Stone - additional client settings, local files, microsecond time, maintenance support
  • Kouhei Ueno - for the original work on Prepared Statements way back in 2012
  • John Cant - polishing and updating Prepared Statements support
  • Justin Case - polishing and updating Prepared Statements support and getting it merged
  • Tamir Duberstein - for help with timeouts and all around updates and cleanups
  • Jun Aruga - for migrating CI tests to GitHub Actions and other improvements

Author: brianmario
Source Code:
License: MIT License
