Learn MongoDB - MongoDB Tutorial for Beginners - Getting Started with MongoDB - Part 1/3

What you’ll learn

  • Work with MongoDB with Clarity and Confidence
  • Use four tools (MongoCHEF, NOSQL Manager, RoboMongo, MongoBooster) with ease
  • Work with regex, GridFS, replication, sharding, and full-text search
  • Perform basic and advanced CRUD operations with MongoDB
  • Import and export data to and from MongoDB
  • Work with MapReduce, embedded documents, save and insert, indexing, capped collections, and TTL
  • Bonus section: use Java, C#, PHP, and Node.js to access MongoDB features such as CRUD and GridFS
  • Bonus section: 50 minutes of MongoDB key-feature exercises
  • 100+ quizzes and 40+ activities


Learn More

MongoDB - The Complete Developer’s Guide

The Complete Developers Guide to MongoDB

Building A REST API With MongoDB, Mongoose, And Node.js

Node.js, ExpressJs, MongoDB and Vue.js (MEVN Stack) Application Tutorial

MEAN Stack Tutorial MongoDB, ExpressJS, AngularJS and NodeJS

MongoDB with Python Crash Course - Tutorial for Beginners

SQL vs NoSQL or MongoDB vs MySQL? Which database is better?

In this post, we will look at the differences between SQL and NoSQL and compare the MySQL and MongoDB databases. Which is better, SQL or NoSQL?

When it comes to choosing a database, one of the biggest decisions is picking a relational (SQL) or non-relational (NoSQL) data model. While both are viable options, there are certain key differences between the two that users must keep in mind when making a decision.

Here, we break down the most important distinctions and discuss two of the key players in the relational vs non-relational debate: MySQL and MongoDB.

SQL vs NoSQL: Differences Between SQL and NoSQL

  • SQL databases are commonly called relational databases (RDBMS), whereas NoSQL databases are commonly called non-relational or distributed databases.

  • SQL databases are table-based, whereas NoSQL databases are document stores, key-value stores, graph databases, or wide-column stores. This means that SQL databases represent data in the form of tables consisting of any number of rows, whereas NoSQL databases store collections of key-value pairs, documents, graphs, or wide columns that do not have a standard schema definition they must adhere to.

  • SQL databases have a predefined schema, whereas NoSQL databases have a dynamic schema for unstructured data.

  • SQL databases are vertically scalable, whereas NoSQL databases are horizontally scalable. SQL databases are scaled by increasing the power of the hardware; NoSQL databases are scaled by adding more database servers to the pool of resources to spread the load.

  • SQL databases use SQL (Structured Query Language) for defining and manipulating data, which is very powerful. In a NoSQL database, queries are focused on collections of documents; this is sometimes called UnQL (Unstructured Query Language), and its syntax varies from database to database (see the sketch after this list for a side-by-side comparison).

  • SQL database examples: MySQL, Oracle, SQLite, PostgreSQL, and MS SQL Server. NoSQL database examples: MongoDB, BigTable, Redis, RavenDB, Cassandra, HBase, Neo4j, and CouchDB.

  • For complex queries: SQL databases are a good fit for query-intensive environments, whereas NoSQL databases are not a good fit for complex queries. At a high level, NoSQL databases don't have standard interfaces for performing complex queries, and NoSQL queries themselves are not as powerful as the SQL query language.

  • For the type of data to be stored: SQL databases are not the best fit for hierarchical data storage, whereas NoSQL databases fit hierarchical data better because they store it as key-value pairs, much like JSON. NoSQL databases are also highly preferred for large data sets (i.e., big data); HBase is one example.

  • For scalability: in most typical situations, SQL databases are vertically scalable; you manage increasing load by adding CPU, RAM, SSD, etc., to a single server. NoSQL databases, on the other hand, are horizontally scalable; you can simply add more servers to your NoSQL infrastructure to handle heavy traffic.

  • For highly transactional applications: SQL databases are the best fit for heavy-duty transactional applications, as they are more stable and guarantee atomicity and data integrity. While you can use NoSQL for transactional purposes, it is still not comparable or stable enough under high load and for complex transactional applications.

  • For support: excellent vendor support is available for all SQL databases, and there are also many independent consultants who can help with very large-scale SQL deployments. For some NoSQL databases you still have to rely on community support, and only a limited number of outside experts are available to help you set up and deploy large-scale NoSQL installations.

  • For properties: SQL databases emphasize the ACID properties (atomicity, consistency, isolation, and durability), whereas NoSQL databases follow Brewer's CAP theorem (consistency, availability, and partition tolerance).

  • For DB types: at a high level, we can classify SQL databases as either open source or closed source from commercial vendors. NoSQL databases can be classified by how they store data: graph databases, key-value stores, document stores, column stores, and XML databases.
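
As a quick illustration of the query-language difference mentioned above, here is a minimal sketch comparing a SQL SELECT with an approximate MongoDB shell equivalent; the users collection/table and its fields are hypothetical:

// SQL: SELECT name, age FROM users WHERE age > 30 ORDER BY age DESC;
// Approximate MongoDB equivalent on a hypothetical "users" collection:
db.users.find(
    { age: { $gt: 30 } },           // WHERE age > 30
    { name: 1, age: 1, _id: 0 }     // projection: SELECT name, age
).sort({ age: -1 });                // ORDER BY age DESC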

MySQL and MongoDB: Which database is better?

MongoDB

The following are some of the benefits and strengths of MongoDB:

  • Free to use: Since October 2018, MongoDB's updates have been published under the Server Side Public License (SSPL) v1, and the database is free to use.

  • Dynamic schema: As mentioned, this gives you the flexibility to change your data schema without modifying any of your existing data (a short sketch follows this list).

  • Scalability: MongoDB is horizontally scalable, which helps reduce the workload and scale your business with ease.

  • Manageability: The database doesn’t require a database administrator. Since it is fairly user-friendly in this way, it can be used by both developers and administrators.

  • Speed: It’s high-performing for simple queries.

  • Flexibility: You can add new columns or fields on MongoDB without affecting existing rows or application performance.

  • ACID transactions: MongoDB v4.0 is finally getting support for multi-document ACID (atomicity, consistency, isolation, durability) transactions. That's something the MongoDB community has been asking for for years, and MongoDB Inc., the company behind the project, is now making it a reality.

  • MongoDB Atlas (a newer offering): MongoDB recently added MongoDB Atlas, its global cloud database service, to its offerings. Atlas lets you deploy fully managed MongoDB on AWS, Azure, or GCP, and provides drivers, integrations, and tools that reduce the time required to manage your database.

  • Who Should Use It? MongoDB is a good choice for businesses that have rapid growth or databases with no clear schema definitions (i.e., you have a lot of unstructured data). If you cannot define a schema for your database, if you find yourself denormalizing data schemas, or if your data requirements and schemas are constantly evolving - as is often the case with mobile apps, real-time analytics, content management systems, etc. - MongoDB can be a strong choice for you.
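
To make the dynamic-schema point above concrete, here is a minimal sketch (the people collection and its fields are hypothetical): documents in the same collection can carry different fields, and no migration is needed to add new ones.

db.people.insertOne({ name: "Ada", email: "ada@example.com" })
db.people.insertOne({ name: "Grace", phones: [ "555-0100", "555-0101" ], title: "Rear Admiral" })

// Both documents live in the same collection even though their fields differ,
// and neither insert required altering a predefined schema.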

MySQL

Here are some MySQL benefits and strengths:

  • Owned by Oracle: Although MySQL is free and open-source, the database system is owned and managed by Oracle.

  • Maturity: MySQL is an extremely established database, meaning that there’s a huge community, extensive testing and quite a bit of stability.

  • Compatibility: MySQL is available for all major platforms, including Linux, Windows, Mac, BSD, and Solaris. It also has connectors to languages like Node.js, Ruby, C#, C++, Java, Perl, Python, and PHP, meaning that it’s not limited to SQL query language.

  • Cost-effective: The database is open-source and free.

  • Replicable: The MySQL database can be replicated across multiple nodes, meaning that the workload can be reduced and the scalability and availability of the application can be increased.

  • Sharding: While sharding cannot be done on most SQL databases, it can be done on MySQL servers. This is both cost-effective and good for business.

  • Who Should Use It? MySQL is a strong choice for any business that will benefit from its pre-defined structure and set schemas. For example, applications that require multi-row transactions - like accounting systems or systems that monitor inventory - or that run on legacy systems will thrive with the MySQL structure.

Which is better SQL or NoSQL?

In truth, every database has its own advantages. There is no single best database, only the most suitable option for each project.

Querying MongoDB Like an SQL DB when using Aggregation Pipeline

All you need to know to use SQL SELECT, GROUP, JOIN, LIMIT, and OFFSET queries in MongoDB like a boss

What Are Aggregations?

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result.

In the db.collection.aggregate method and [db.aggregate](https://docs.mongodb.com/manual/reference/method/db.aggregate/#db.aggregate) method, pipeline stages appear in an array. Documents pass through the stages in sequence. We will go through some of these stages to achieve relational-DB-like results.
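
As a minimal sketch of that shape (using the articles collection introduced just below), a pipeline is simply an array of stage documents passed to db.collection.aggregate; each stage receives the documents produced by the stage before it:

db.articles.aggregate([
    { $match: { author: "dave" } },   // stage 1: keep only dave's documents
    { $sort: { views: -1 } },         // stage 2: sort what stage 1 produced
    { $limit: 2 }                     // stage 3: pass at most two documents on
]);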

$match (WHERE)

Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage.

It has the following prototype:

{ $match: { <query> } }

It is the equivalent of WHERE in SQL queries. Let us take an example to make things clear. This example uses a collection named articles with the following documents:

{ "_id" : ObjectId("512bc95fe835e68f199c8686"), "author" : "dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("512bc962e835e68f199c8687"), "author" : "dave", "score" : 85, "views" : 521 }
{ "_id" : ObjectId("55f5a192d4bede9ac365b257"), "author" : "ahn", "score" : 60, "views" : 1000 }
{ "_id" : ObjectId("55f5a192d4bede9ac365b258"), "author" : "li", "score" : 55, "views" : 5000 }
{ "_id" : ObjectId("55f5a1d3d4bede9ac365b259"), "author" : "annT", "score" : 60, "views" : 50 }
{ "_id" : ObjectId("55f5a1d3d4bede9ac365b25a"), "author" : "li", "score" : 94, "views" : 999 }
{ "_id" : ObjectId("55f5a1d3d4bede9ac365b25b"), "author" : "ty", "score" : 95, "views" : 1000 }

Equality match.

db.articles.aggregate(
    [ { $match : { author : "dave" } } ]
);

// Result
{ "_id" : ObjectId("512bc95fe835e68f199c8686"), "author" : "dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("512bc962e835e68f199c8687"), "author" : "dave", "score" : 85, "views" : 521 }

We can have multiple constraints inside $match, such as $or, $and, etc., according to our requirements, although it has some limitations as well; you can read about them in the MongoDB docs.
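
For instance, here is a small sketch (on the same articles collection) of combining conditions with $or inside a single $match stage:

db.articles.aggregate([
    { $match: { $or: [ { author: "dave" }, { score: { $gt: 90 } } ] } }
]);

// Documents written by "dave" or with a score above 90 pass to the next stage.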

$skip (OFFSET)

Skips over the specified number of documents that pass into the stage and passes the remaining documents to the next stage in the pipeline. It has the following prototype:

{ $skip: <positive integer> }

In the above example, we were able to match the records related to dave. If we want to skip a few results from the beginning, we would write the query as:

db.articles.aggregate([ 
    { $match : { author : "dave" } },
    { $skip: 1 } 
]);

// We are skipping 1 result and we should get just this

{ "_id" : ObjectId("5dc1d22f24a8e913bfcf4f60"), "author" : "dave", "score" : 85, "views" : 521 }

We generally use $skip with $limit to paginate the data. Let's insert a few more records into our collection and see how $skip and $limit work together in the next section.

$limit (LIMIT)

Limits the number of documents passed to the next stage in the pipeline. It has the following prototype:

{ $limit: <positive integer> }

We have added new records for dave; let's see how the collection looks with a simple $match:

db.articles.aggregate(
    [ { $match : { author : "dave" } } ]
);

{ "_id" : ObjectId("5dc1d22124a8e913bfcf4f5f"), "author" : "dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("5dc1d22f24a8e913bfcf4f60"), "author" : "dave", "score" : 85, "views" : 521 }
{ "_id" : ObjectId("5dc1d53924a8e913bfcf4f65"), "author" : "dave", "score" : 185, "views" : 1521 }
{ "_id" : ObjectId("5dc1d54f24a8e913bfcf4f66"), "author" : "dave", "score" : 15, "views" : 21 }

We have four matching records in the example collection. Suppose you are asked to paginate the results to show two at a time. How would you go about it? Let's see.

db.articles.aggregate([ 
    { $match : { author : "dave" } },
    { $skip: 0},
    { $limit: 2}
]);

// We are not skipping any records, but we are limiting the output to 2

{ "_id" : ObjectId("5dc1d22124a8e913bfcf4f5f"), "author" : "dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("5dc1d22f24a8e913bfcf4f60"), "author" : "dave", "score" : 85, "views" : 521 }

// We got the first two results; to get the next two, just update $skip

db.articles.aggregate([ 
    { $match : { author : "dave" } },
    { $skip: 2},
    { $limit: 2}
]);

// This should give two records after skipping the first two.

{ "_id" : ObjectId("5dc1d53924a8e913bfcf4f65"), "author" : "dave", "score" : 185, "views" : 1521 }
{ "_id" : ObjectId("5dc1d54f24a8e913bfcf4f66"), "author" : "dave", "score" : 15, "views" : 21 }

We are doing well so far, but suppose your manager comes to you and asks you to sort the results by views. What are you going to do? We have $sort for that.

$sort (ORDER BY)

Sorts all input documents and returns them to the pipeline in sorted order. It has the following prototype:

{ $sort: { <field1>: <sort order>, <field2>: <sort order> ... } }

Let us use the $sort stage in our pipeline. The sort order can be:

  • 1 to specify ascending order
  • -1 to specify descending order

db.articles.aggregate([ 
    { $match : { author : "dave" } },
    { $sort: { views: 1}}
]);

// Result

{ "_id" : ObjectId("5dc1d54f24a8e913bfcf4f66"), "author" : "dave", "score" : 15, "views" : 21 }
{ "_id" : ObjectId("5dc1d22124a8e913bfcf4f5f"), "author" : "dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("5dc1d22f24a8e913bfcf4f60"), "author" : "dave", "score" : 85, "views" : 521 }
{ "_id" : ObjectId("5dc1d53924a8e913bfcf4f65"), "author" : "dave", "score" : 185, "views" : 1521 }

Voilà! The results are sorted now.

Place the $match as early in the aggregation pipeline as possible. Because $match limits the total number of documents in the aggregation pipeline, earlier $match operations minimize the amount of processing down the pipe.
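
As a small sketch of that advice on the same articles collection, here is the same query written both ways; note that the aggregation optimizer can sometimes reorder these stages itself, but writing $match first is still the clearer habit:

// Preferred: filter first, so $sort only has to order dave's documents
db.articles.aggregate([
    { $match: { author: "dave" } },
    { $sort: { views: -1 } }
]);

// Works, but sorts every document before most of them are discarded
db.articles.aggregate([
    { $sort: { views: -1 } },
    { $match: { author: "dave" } }
]);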

$group

Groups input documents by the specified _id expression and, for each distinct grouping, outputs a document.

The _id field of each output document contains the unique group by value. The output documents can also contain computed fields that hold the values of an accumulator expression. It has the following prototype:

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },
      ...
    }
}

Suppose we want to group the articles by author, in other words count the number of articles by each author; we can make use of the $group stage in the pipeline. So, let's see it live:

db.articles.aggregate([ 
    { $group : { _id: "$author", count: { $sum: 1 }}},
    { $sort: { count: 1 }}
]);

// We have grouped the articles by author, counted them, and sorted the result by count

{ "_id" : "annT", "count" : 1 }
{ "_id" : "ahn", "count" : 1 }
{ "_id" : "li", "count" : 2 }
{ "_id" : "dave", "count" : 4 }

// We can add more constraints: if we want only the results whose count is greater than 1, we can add a $match stage in the pipeline after $group

db.articles.aggregate([ 
    { $group : { _id: "$author", count: { $sum: 1 }}},
    { $sort: { count: 1 }},
    { $match: { count : { $gt: 1 }}}
]);

{ "_id" : "li", "count" : 2 }
{ "_id" : "dave", "count" : 4 }

Let us take this grouping up a notch. Suppose we want to group by values stored in an array structure. We have something called $unwind. Let’s see how it works.

$unwind

Deconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.

You can pass the array field path to $unwind. When using this syntax, $unwind does not output a document if the field value is null, missing, or an empty array. It has the following prototype:

{ $unwind: <field path> }

Let us create a new collection, inventory, and add a record to it with the following command:

db.inventory.insertOne({ "_id" : 1, "item" : "ABC1", sizes: [ "S", "M", "L"] })

That's the beauty of MongoDB: you can create a new collection and add a record to it without any setup.

Let us $unwind this by the sizes field.

db.inventory.aggregate( [ { $unwind : "$sizes" } ] )

// Result
{ "_id" : 1, "item" : "ABC1", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC1", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC1", "sizes" : "L" }

Each document is identical to the input document except for the value of the sizes field which now holds a value from the original sizes array.

Let us take a new collection, inventory2, and do a group by size. Use this command to insert more records:

db.inventory2.insertMany([
  { "_id" : 1, "item" : "ABC", price: NumberDecimal("80"), "sizes": [ "S", "M", "L"] },
  { "_id" : 2, "item" : "EFG", price: NumberDecimal("120"), "sizes" : [ ] },
  { "_id" : 3, "item" : "IJK", price: NumberDecimal("160"), "sizes": "M" },
  { "_id" : 4, "item" : "LMN" , price: NumberDecimal("10") },
  { "_id" : 5, "item" : "XYZ", price: NumberDecimal("5.75"), "sizes" : null }
])

If we unwind this, we would get something like this:

db.inventory2.aggregate( [ { $unwind: "$sizes" } ] )

// Results

{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "price" : NumberDecimal("80"), "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "price" : NumberDecimal("160"), "sizes" : "M" }

// Notice it ignores documents where sizes is null, missing, or an empty array

db.inventory2.aggregate([ 
    { $unwind: { path: "$sizes" } },
    { $group: { _id: "$sizes", count: { $sum: 1 }}}
]);

// Results

{ "_id" : "M", "count" : 2 }
{ "_id" : "L", "count" : 1 }
{ "_id" : "S", "count" : 1 }

We can apply different stages to this like $match, $sort, $skip, $limit, etc. to get the desired results.
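
As one sketch of chaining those stages, here is a pipeline on inventory2 that unwinds the sizes array, counts each size, and returns only the two most common ones:

db.inventory2.aggregate([
    { $unwind: "$sizes" },
    { $group: { _id: "$sizes", count: { $sum: 1 } } },
    { $sort: { count: -1 } },   // most common size first
    { $limit: 2 }               // keep only the top two sizes
]);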

Now, let's move on to SQL JOINs; to achieve joins in MongoDB we have $lookup.

$lookup

New in version 3.2.

Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing.

To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage.

There can be different join conditions but we will be looking into the most basic one, which is an equality match.

Equality match

To perform an equality match between a field from the input documents and a field from the documents of the "joined" collection, the [$lookup](https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#pipe._S_lookup) stage has the following syntax:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

from: Specifies the collection in the same database to perform the join with.

localField: Specifies the field from the documents input to the $lookup stage. $lookup performs an equality match of the localField to the foreignField from the documents of the from collection. If an input document does not contain the localField, $lookup treats the field as having a value of null for matching purposes.

foreignField: Specifies the field from the documents in the from collection. $lookup performs an equality match of the foreignField to the localField from the input documents. If a document in the from collection does not contain the foreignField, $lookup treats the value as null for matching purposes.

as: Specifies the name of the new array field to add to the input documents. The new array field contains the matching documents from the from collection. If the specified name already exists in the input document, the existing field is overwritten.

Let us look at some examples to better understand the terminologies:

Perform a single equality join with $lookup

Create a collection orders with the following documents:

db.orders.insert([
   { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
   { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
   { "_id" : 3  }
])

Create another collection inventory with the following documents:

db.inventory.insert([
   { "_id" : 1, "sku" : "almonds", description: "product 1", "instock" : 120 },
   { "_id" : 2, "sku" : "bread", description: "product 2", "instock" : 80 },
   { "_id" : 3, "sku" : "cashews", description: "product 3", "instock" : 60 },
   { "_id" : 4, "sku" : "pecans", description: "product 4", "instock" : 70 },
   { "_id" : 5, "sku": null, description: "Incomplete" },
   { "_id" : 6 }
])

The following aggregation operation on the orders collection joins the documents from orders with the documents from the inventory collection using the fields item from the orders collection and the sku field from the inventory collection:

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }
]);

The operation returns the following documents:

{
   "_id" : 1,
   "item" : "almonds",
   "price" : 12,
   "quantity" : 2,
   "inventory_docs" : [
      { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 }
   ]
}
{
   "_id" : 2,
   "item" : "pecans",
   "price" : 20,
   "quantity" : 1,
   "inventory_docs" : [
      { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 }
   ]
}
{
   "_id" : 3,
   "inventory_docs" : [
      { "_id" : 5, "sku" : null, "description" : "Incomplete" },
      { "_id" : 6 }
   ]
}
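
Since $lookup always produces an array field, a common follow-up (shown here as a sketch, not part of the original example) is to flatten that array with $unwind and then filter on the joined fields:

db.orders.aggregate([
    {
      $lookup:
        {
          from: "inventory",
          localField: "item",
          foreignField: "sku",
          as: "inventory_docs"
        }
    },
    { $unwind: "$inventory_docs" },                           // one output document per joined match
    { $match: { "inventory_docs.instock": { $gte: 100 } } }   // keep only well-stocked items
]);
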
Conclusion

This was just a basic overview of using SQL-like queries in MongoDB.

There is a lot more that can be done using the many other stages that are available. The best way to get a good grasp of them is by practicing different scenarios and using them in your projects. I hope this helps. Thank you!

Learn NoSQL Databases from Scratch - Complete MongoDB Bootcamp 2019

In this video, you will spend the first section learning the basic concepts of NoSQL, followed by a short introduction to MongoDB and MongoDB Server installation. Once the theory sections have been learned and understood, you will move through a wide range of topics centered around NoSQL and MongoDB. By following this course you will learn to use MongoDB in a real setting with the Java programming language, a skill set in high demand at top companies.
