5 Pitfalls of NoSQL Databases

I recorded a video in which I talk about the advantages of NoSQL databases. The response was interesting, but I had the impression that not everyone sees both sides of the coin. The fact is that NoSQL databases can also cause us a lot of problems 😉.

Schema Management

Each NoSQL database approaches the schema in its own way. In some there is no enforced schema (MongoDB), in some it is dynamic (Elasticsearch), and in some it resembles the one from relational databases (Cassandra). In the conceptual model, however, data ALWAYS has a schema: entities, fields, names, types, relations. Regardless of the type of database, the physical model is a representation of the conceptual model.

NoSQL databases give us more freedom in terms of schema. In MongoDB, we can insert two documents into the same collection that use the same field name but store different types in it. Does this make sense? Not really. Can it happen? Of course it can. A simple human error can break our app.
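To make the risk concrete, here is a minimal sketch in plain Python (no database involved) of the kind of check we might run over documents before they are stored. The function name `find_type_conflicts` and the sample documents are made up for illustration; they are not part of any MongoDB API.

```python
from collections import defaultdict


def find_type_conflicts(documents):
    """Return {field: set of type names} for fields stored with more than one type."""
    seen = defaultdict(set)
    for doc in documents:
        for field, value in doc.items():
            seen[field].add(type(value).__name__)
    # Keep only the fields that appear with at least two different types.
    return {field: types for field, types in seen.items() if len(types) > 1}


# Hypothetical documents: "age" is an int in one and a str in the other.
users = [
    {"name": "Ann", "age": 31},
    {"name": "Bob", "age": "thirty-one"},
]

conflicts = find_type_conflicts(users)
print(conflicts)
```

A check like this only detects the problem after the fact. Rejecting such documents at write time requires validation in the application or in the database itself.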

Another issue concerns relationships between entities. Even if our database has no relations, we still have to document the relationships between the data. From a relational database we can generate an ER diagram. With NoSQL databases this may not work.

When using NoSQL databases, we have to keep schema management and data validation in mind. Without them the data can “explode”. Interesting fact: some companies replace MongoDB with PostgreSQL.

Lower Margin of Error

The performance of NoSQL databases is the result of proper data modeling, indexing and partitioning. In a relational database we can add columns, transform tables, move data from one table to another, or add an index we forgot about earlier. With NoSQL databases, this is not always possible. We may need to use external tools like Apache Spark, or even drop and recreate our data model.

In Elasticsearch, if we don’t get the schema/mapping of an index right, we have to use e.g. the Reindex API, which means copying all the data into another index.
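As a rough illustration, the sketch below builds the JSON body that would be sent to Elasticsearch's `POST /_reindex` endpoint. The helper `build_reindex_request` and the index names `products_v1` and `products_v2` are assumptions made for this example, and the code only prints the body instead of calling a cluster.

```python
import json


def build_reindex_request(source_index, dest_index):
    """Build the JSON body for a POST /_reindex request."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


# Hypothetical index names: the old index and a new one with the fixed mapping.
body = build_reindex_request("products_v1", "products_v2")
print(json.dumps(body, indent=2))
```

Reindexing does not copy the mapping from the source index, so the destination index should be created with the corrected mapping before the request is sent.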

In Cassandra, we can only filter efficiently by partition and clustering keys. If we forget to add a column to the key, we can add a secondary index on it, but this can kill performance if the column has high cardinality.
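The difference can be pictured with a toy in-memory model in Python. This is not how Cassandra is implemented, and the table and column names are invented for the example. It only shows why a lookup by partition key touches one partition, while a filter on a non-key column has to visit all of them.

```python
# Toy model: rows grouped by partition key (here, a hypothetical customer id).
orders_by_customer = {
    "cust-1": [
        {"order_id": 1, "status": "paid"},
        {"order_id": 2, "status": "new"},
    ],
    "cust-2": [
        {"order_id": 3, "status": "paid"},
    ],
}


def by_partition_key(table, customer_id):
    """Lookup by partition key: reads exactly one partition."""
    return table.get(customer_id, [])


def by_non_key_column(table, status):
    """Filter on a non-key column: every partition has to be scanned."""
    scanned = 0
    matches = []
    for rows in table.values():
        scanned += 1
        matches.extend(row for row in rows if row["status"] == status)
    return matches, scanned


one_partition = by_partition_key(orders_by_customer, "cust-2")
paid_orders, partitions_scanned = by_non_key_column(orders_by_customer, "paid")
print(len(one_partition), len(paid_orders), partitions_scanned)
```

With two partitions the scan is harmless. On a real cluster the partitions are spread across many nodes, which is why the choice of key matters so much up front.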
