MongoDB Schema Design Anti-Pattern: Separating Data That is Accessed Together

Normalizing data and splitting it into different pieces to optimize for space and reduce data duplication can feel like second nature to those with a relational database background. However, separating data that is frequently accessed together is actually an anti-pattern in MongoDB. In this post, we'll find out why and discuss what you should do instead.

Separating Data That is Accessed Together

Much like you would use a join to combine information from different tables in a relational database, MongoDB has a $lookup operation that allows you to join information from more than one collection. $lookup is great for infrequent operations or analytical queries that can run overnight without a time limit. However, $lookup is not so great when you're frequently using it in your applications. Why?

$lookup operations are slow and resource-intensive compared to operations that don't need to combine data from more than one collection.
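To make this concrete, here is a rough sketch of a $lookup stage written as an aggregation pipeline in Python. The collection names `teams` and `players` and all field names are hypothetical and do not come from this post. The pipeline is only built here, not run against a server.

```python
# Hypothetical collections: "teams" and "players" (assumed for illustration).
# Each player document is assumed to store a team_id pointing at a team's _id.
pipeline = [
    # Select one team by a hypothetical name.
    {"$match": {"name": "Example Team"}},
    # Join in every player whose team_id matches this team's _id.
    {
        "$lookup": {
            "from": "players",
            "localField": "_id",
            "foreignField": "team_id",
            "as": "players",
        }
    },
]

# With a driver such as PyMongo, this list would be passed to
# db.teams.aggregate(pipeline). That call is left out so the sketch
# runs without a database.
print(pipeline[1]["$lookup"]["from"])
```

Every request that needs a team together with its players would have to run this join, which is the recurring cost the rule of thumb below is meant to avoid.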

The rule of thumb when modeling your data in MongoDB is:

Data that is accessed together should be stored together.

Instead of separating data that is frequently used together between multiple collections, leverage embedding and arrays to keep the data together in a single collection.

For example, when modeling a one-to-one relationship, you can embed a document from one collection as a subdocument in a document from another. When modeling a one-to-many relationship, you can embed information from multiple documents in one collection as an array of documents in another.
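As a minimal sketch of both cases, the document below is written as a plain Python dictionary. The `team`, `coach`, and `players` names and values are hypothetical placeholders, chosen because a roster stays small and bounded.

```python
# Hypothetical document shape (names and values are placeholders).
team = {
    "_id": 1,
    "name": "Example Team",
    # One-to-one: the coach is embedded as a subdocument.
    "coach": {"name": "Example Coach", "since": 2019},
    # One-to-many: the players are embedded as an array of subdocuments.
    "players": [
        {"name": "Player One", "position": "goalkeeper"},
        {"name": "Player Two", "position": "defender"},
    ],
}


def player_names(doc):
    """Read the embedded players; no second collection is consulted."""
    return [player["name"] for player in doc.get("players", [])]


print(team["coach"]["name"])
print(player_names(team))
```

Everything the application needs about the team arrives in a single document, so one read answers the query.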

Keep in mind the other anti-patterns we've already discussed as you begin combining data from different collections together. Massive, unbounded arrays and bloated documents can both be problematic.

If combining data from separate collections into a single collection will result in massive, unbounded arrays or bloated documents, you may want to keep the collections separate and duplicate some of the data that is used frequently together in both collections. You could use the Subset Pattern to duplicate a subset of the documents from one collection in another. You could also use the Extended Reference Pattern to duplicate a portion of the data in each document from one collection in another. In both patterns, you have the option of creating references between the documents in both collections. Keep in mind that whenever you need to combine information from both collections, you'll likely need to use $lookup. Also, whenever you duplicate data, you are responsible for ensuring the duplicated data stays in sync.
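The two patterns can be sketched as small helper functions in plain Python. These helpers are illustrations only, not part of any MongoDB API, and the `reviews` data and its field names are made up for the example.

```python
def latest_subset(documents, date_field, limit):
    """Subset Pattern: choose the `limit` most recent documents to duplicate."""
    newest_first = sorted(documents, key=lambda d: d[date_field], reverse=True)
    return newest_first[:limit]


def extended_reference(document, fields):
    """Extended Reference Pattern: copy _id plus a few frequently read fields."""
    reference = {"_id": document["_id"]}
    reference.update({f: document[f] for f in fields if f in document})
    return reference


# Made-up documents from a hypothetical "reviews" collection.
# ISO-formatted date strings sort correctly as plain strings.
reviews = [
    {"_id": 1, "date": "2020-01-05", "rating": 4, "text": "long review text"},
    {"_id": 2, "date": "2020-03-01", "rating": 5, "text": "long review text"},
    {"_id": 3, "date": "2020-02-10", "rating": 3, "text": "long review text"},
]

# Duplicate only the two newest reviews, and only their rating, into the
# parent document. The full reviews stay in their own collection.
recent = [extended_reference(r, ["rating"]) for r in latest_subset(reviews, "date", 2)]
print(recent)
```

Because `recent` is a copy, the application has to refresh it whenever a review is added or changed.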

As we have said throughout this series, each use case is different. As you model your schema, carefully consider how you will query the data and what the stored data will realistically look like.

Example

What would an Anti-Pattern post be without an example from Parks and Recreation? I don't even want to think about it. So let's return to Leslie.

Leslie decides to organize a Model United Nations for local high school students and recruits some of her coworkers to participate as well. Each participant will act as a delegate for a country during the event. She assigns Andy and Donna to be delegates for Finland.

Leslie decides to store information related to the Model United Nations in a MongoDB database. She wants to store the following information in her database:

  • Basic stats about each country
  • A list of resources that each country has available to trade
  • A list of delegates for each country
  • Policy statements for each country
  • Information about each Model United Nations event she runs

With this information, she wants to be able to quickly generate the following reports:

  • A country report that contains basic stats, resources currently available to trade, a list of delegates, the names and dates of the last five policy documents, and a list of all of the Model United Nations events in which this country has participated
  • An event report that contains information about the event and the names of the countries that participated
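One way the country report could be served from a single document is sketched below in Python. The field names and every value other than Finland, Andy, and Donna are placeholders assumed for illustration. This is one possible shape, not necessarily the schema Leslie ends up with.

```python
# Hypothetical country document. Only "Finland", "Andy", and "Donna" come
# from the story; all other values are placeholders.
finland = {
    "name": "Finland",
    "stats": {"capital": "Helsinki"},
    "resources": ["placeholder resource A", "placeholder resource B"],
    "delegates": ["Andy", "Donna"],
    # Only the name and date of recent policies are duplicated here.
    "recent_policies": [
        {"name": "Placeholder policy", "date": "2020-01-01"},
    ],
    "events": [
        {"name": "Placeholder Model United Nations event", "date": "2020-01-01"},
    ],
}


def country_report(country):
    """Assemble the country report from one document, with no joins."""
    return {
        "name": country["name"],
        "stats": country["stats"],
        "resources": country["resources"],
        "delegates": country["delegates"],
        # Cap the list at the last five policy documents.
        "recent_policies": country["recent_policies"][:5],
        "events": [event["name"] for event in country["events"]],
    }


report = country_report(finland)
print(report["delegates"])
```

Because everything the report needs is in one document, generating it takes a single read.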

The Model United Nations event begins, and Andy is excited to participate. He decides he doesn't want any of his country's "boring" resources, so he begins trading with other countries in order to acquire all of the world's lions.
