1620663480
At the very beginning of most development endeavors lies an important question: What database do I choose? There is such an abundance of database technologies at this moment, it’s no wonder many developers don’t have the time or energy to research new ones. If you are one of those developers and you aren’t very familiar with graph databases in general, you’ve come to the right place!
In this article, you will learn about the main differences between a graph database and a relational database, which use cases are best suited to each database type, and what their strengths and weaknesses are.
#graph-database #relational-database #graph-theory #graph-analysis #data-analytics #networks #data #database
1603659600
Deep learning and knowledge graph technologies have developed rapidly in recent years. Compared with the “black box” of deep learning, knowledge graphs are highly interpretable and are therefore widely adopted in scenarios such as search recommendations, intelligent customer support, and financial risk management.
Over the past few years, Meituan has been digging deep into the connections buried in its huge volume of business data and has gradually developed knowledge graphs in nearly ten areas, including cuisine graphs, tourism graphs, and commodity graphs. The ultimate goal is to make local life smarter.
Compared with a traditional RDBMS, graph databases can store and query knowledge graphs more efficiently. Selecting a graph database as the storage engine brings an obvious performance advantage in multi-hop queries. Currently, there are dozens of graph database solutions on the market.
The Meituan team therefore needs to select a graph database solution that meets its business requirements and can serve as the basis of Meituan’s graph storage and graph learning platform. Based on the current state of the business, the team has outlined the following basic requirements:
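To make the multi-hop point concrete, here is a minimal Python sketch of the same two-hop lookup written in two styles: a relational self-join and an adjacency-list traversal. The edge data and the function names `two_hop_sql` and `k_hop_graph` are hypothetical, and the sketch only illustrates the shape of each query, not the performance of any real product.

```python
import sqlite3
from collections import defaultdict

# Hypothetical (source, target) edges in a tiny knowledge graph.
EDGES = [
    ("restaurant", "dish"),
    ("dish", "ingredient"),
    ("ingredient", "supplier"),
    ("dish", "cuisine"),
]


def two_hop_sql(start):
    """Relational style: every extra hop needs another self-join on the edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", EDGES)
    rows = conn.execute(
        "SELECT DISTINCT e2.dst FROM edge AS e1 "
        "JOIN edge AS e2 ON e2.src = e1.dst WHERE e1.src = ?",
        (start,),
    ).fetchall()
    conn.close()
    return sorted(row[0] for row in rows)


def k_hop_graph(start, hops):
    """Graph style: follow adjacency lists, expanding the frontier once per hop."""
    adjacency = defaultdict(list)
    for src, dst in EDGES:
        adjacency[src].append(dst)
    frontier = {start}
    for _ in range(hops):
        frontier = {nxt for node in frontier for nxt in adjacency[node]}
    return sorted(frontier)


print(two_hop_sql("restaurant"))
print(k_hop_graph("restaurant", 2))
```

In the relational version the number of joins grows with the number of hops, while the traversal version takes the hop count as a plain parameter.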
By having control over the source code, the Meituan team can ensure data security and service availability.
The knowledge graph data at Meituan can reach hundreds of billions of vertices and edges in total, and the throughput can reach tens of thousands of QPS. Consequently, a single-node deployment cannot meet Meituan’s storage requirements.
To ensure the best search experience for Meituan users, the team strictly limits timeouts along the entire request path. Query latency on the order of seconds is therefore unacceptable.
The knowledge graph data is usually stored in data warehouses like Hive. The graph database should be able to import data quickly from such warehouses into graph storage to ensure service effectiveness.
The Meituan team has tried the top 30 graph databases on DB-Engines and found that most well-known graph databases, for example Neo4j, ArangoDB, Virtuoso, TigerGraph, and RedisGraph, only support single-node deployment in their open-source editions. This means that the storage service cannot scale horizontally and the requirement to store large-scale knowledge graph data cannot be met.
After thorough research and comparison, the team has selected the following graph databases for the final round: Nebula Graph (developed by a startup team who originally came from Alibaba), Dgraph (developed by a startup team who originally came from Google), and HugeGraph (developed by Baidu).
#database #tutorial #graph database #database performance #nebula graph #dgraph #graph database adoption
1620713400
SQLite like you have never seen it before
FactEngine (www.factengine.ai) is an initiative to radically change the way people look at databases. The essence of the initiative is to reveal how all databases can be viewed as multi-model databases (graph or relational). As the first of its kind it is hard to talk of the science without referencing the initiative. But let’s get to the science…
Dedicated Graph databases are somewhat famous now for working under/over a property graph schema, which looks like the following:
The schema above is what is known as a directed graph schema, where a relationship such as Lecturer is in School is pictorially shown as an arrow-directed **_edge (or graph)_** connecting the nodes Lecturer and School.
Manufacturers of dedicated graph databases would have you believe this type of modelling and associated graph query languages are the purview of those dedicated graph databases. This is only true if you want it to be and don’t have the tools to visualise your database as a graph database or a relational database.
Relational databases are traditionally bound to a schema pictorially represented as an Entity-Relationship Diagram, as below:
What you sometimes want is to be able to query a relational database as if it were a graph database to make life easy.
Let us say we want to view everyone that a fictional lecturer likes using our extant schemas. We should be able to query the database with a simple graph query as:
Our query, (Lecturer:’Alexandria’,’Archer’) likes WHICH Lecturer, returns one result, Steven Hollows.
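For comparison, here is a minimal sketch of how the same question could be asked of a relational store directly, using Python's built-in `sqlite3` module. The `lecturer` and `likes` tables, their columns, the sample rows, and the `liked_by` helper are all assumptions made for illustration; they are not the schema or the query translation used by FactEngine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lecturer (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL
    );
    -- Hypothetical join table that plays the role of the 'likes' edge.
    CREATE TABLE likes (
        lecturer_id INTEGER NOT NULL REFERENCES lecturer(id),
        liked_id INTEGER NOT NULL REFERENCES lecturer(id)
    );
    -- Hypothetical sample rows.
    INSERT INTO lecturer VALUES (1, 'Alexandria', 'Archer'),
                                (2, 'Steven', 'Hollows'),
                                (3, 'Jane', 'Doe');
    INSERT INTO likes VALUES (1, 2), (3, 1);
""")


def liked_by(first_name, last_name):
    """Return (first_name, last_name) of every lecturer the given lecturer likes."""
    return conn.execute(
        """
        SELECT liked.first_name, liked.last_name
        FROM lecturer AS src
        JOIN likes ON likes.lecturer_id = src.id
        JOIN lecturer AS liked ON liked.id = likes.liked_id
        WHERE src.first_name = ? AND src.last_name = ?
        ORDER BY liked.last_name, liked.first_name
        """,
        (first_name, last_name),
    ).fetchall()


print(liked_by("Alexandria", "Archer"))  # [('Steven', 'Hollows')]
```

The two joins do the work that the single `likes` edge does in the graph query: the join table is the edge, and the two aliases of `lecturer` are the nodes at either end.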
#multi-model-database #graph-database #object-role-modeling #relational-databases #recursive-graph-queries
1600290000
Much has been written about graph databases as a distinct type of database in the last few years, and it would be easy to forgive anyone for believing the accompanying marketing material, which argues that graph databases warrant differentiation from any other database.
So what does that marketing material say, and how much of it should you give credit to?
Graph databases are described as databases that operate over graphs and where relationships between things matter. A graph is a type of structure and the underlying graph of a graph database maps the structure, or schema, of the data stored in the database.
The picture below is a graph model for a seat booking database solution for a cinema. We would use such a schema to book seats to watch a film in a particular session at that cinema. I believe it can be readily said that graph schemas are straightforward to look at.
#relational-model #er-diagram #graph-database #relational-databases #property-graph-schema
1599897600
When it comes to graph data processing, we have used several graph databases. In the beginning, we used the stand-alone edition of AgensGraph. Later, due to its performance limitations, we switched to JanusGraph, a distributed graph database. I described how to migrate the data in my article “Migrate tens of billions of graph data into JanusGraph (only in Chinese)”. As the data size and the number of business calls grew, a new problem appeared: each query consumed too much time. In some business scenarios, a single query took up to 10 seconds, and with the increase in data size, a more complicated single query needed two or three seconds. These problems seriously affected the performance of the entire business process and the development of related businesses.
The architecture of JanusGraph makes a single query inherently time-consuming. The core reason is that it depends on external storage, which JanusGraph cannot control well. Our production environment uses an HBase cluster, which makes it impossible to push all queries down to the storage layer for processing. Instead, data must first be fetched from HBase into the JanusGraph Server's memory and only then filtered.
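The cost of filtering after the fetch can be illustrated with a toy model. The sketch below is not JanusGraph or HBase code: `ToyStorage`, `Edge`, and the `heavy` predicate are hypothetical stand-ins that only count how many rows leave the storage layer with and without predicate pushdown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    weight: int


class ToyStorage:
    """Stand-in for an external storage layer; counts rows sent to the caller."""

    def __init__(self, edges):
        self._edges = list(edges)
        self.rows_transferred = 0

    def scan(self, src):
        """No pushdown: return every edge of `src` and let the caller filter."""
        rows = [e for e in self._edges if e.src == src]
        self.rows_transferred += len(rows)
        return rows

    def scan_filtered(self, src, predicate):
        """Pushdown: apply the predicate inside storage, return only matches."""
        rows = [e for e in self._edges if e.src == src and predicate(e)]
        self.rows_transferred += len(rows)
        return rows


def heavy(edge):
    """Hypothetical query predicate: keep only high-weight edges."""
    return edge.weight >= 990


edges = [Edge("a", f"n{i}", i) for i in range(1000)]

# Filter in the query server after pulling every row out of storage.
no_pushdown = ToyStorage(edges)
result_a = [e for e in no_pushdown.scan("a") if heavy(e)]

# Filter inside the storage layer and transfer only the matches.
pushdown = ToyStorage(edges)
result_b = pushdown.scan_filtered("a", heavy)

print(no_pushdown.rows_transferred, pushdown.rows_transferred)  # 1000 10
```

Both paths return the same ten edges, but the first moves all 1000 rows across the storage boundary before discarding most of them.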
#database #tutorial #graph database #database performance #nebula graph #graph database adoption