The deep learning and knowledge graph technologies have been developing rapidly in recent years. Find out more about graph databases.
The deep learning and knowledge graph technologies have been developing rapidly in recent years. Compared with the "black box" of deep learning, knowledge graphs are highly interpretable, thus are widely adopted in such scenarios as search recommendations, intelligent customer support, and financial risk management.
Meituan has been digging deep in the connections buried in the huge amount of business data over the past few years and has gradually developed the knowledge graphs in nearly ten areas, including cuisine graphs, tourism graphs, and commodity graphs. The ultimate goal is to enhance the smart local life.
Compared with the traditional RDBMS, graph databases can store and query knowledge graphs more efficiently. It gains obvious performance advantage in multi-hop queries to select graph databases as the storage engine. Currently, there are dozens of graph database solutions out there on the market.
It is imperative for the Meituan team to select a graph database solution that can meet the business requirements and to use the solution as the basis of Meituan's graph storage and graph learning platform. The team has outlined the basic requirements as below per our business status quo:
By having control over the source code, the Meituan team can ensure data security and service availability.
The knowledge graph data size in Meituan can reach hundreds of billions of vertices and edges in total and the throughput can reach tens of thousands of QPS. With that being said, the single-node deployment cannot meet Meituan's storage requirements.
To ensure the best search experience for Meituan users, the team has strictly restricted the timeout value within all chains of paths. Therefore, it is unacceptable to respond to a query at the second level.
The knowledge graph data is usually stored in data warehouses like Hive. The graph database should be equipped with the capability to quickly import data from such warehouses to the graph storage to ensure service effectiveness.
The Meituan team has tried the top 30 graph databases on DB-Engines and found that most well-known graph databases only support single-node deployment with their open-source edition, for example, Neo4j, ArangoDB, Virtuoso, TigerGraph, RedisGraph. This means that the storage service cannot scale horizontally and the requirement to store large-scale knowledge graph data cannot be met.
After thorough research and comparison, the team has selected the following graph databases for the final round: Nebula Graph (developed by a startup team who originally came from Alibaba), Dgraph (developed by a startup team who originally came from Google), and HugeGraph (developed by Baidu).
In this article, take a look at data migration from JanusGraph to Nebula Graph. Speaking of graph data processing, we have had experience in using various graph databases. In the beginning, we used the stand-alone edition of AgensGraph. Later, due to its performance limitations, we switched to JanusGraph, a distributed graph database.
In this article, take a look at how indexes work in Nebula Graph and see core concepts to understand them.
In this article, take a look at the Nebula Graph source code and see a sample graph query. When I saw the Nebula Graph code repository for the first time, I was so shocked by its huge size that I didn’t know how to dig into the source code.
In this article, take a look at key configurations in Nebula graph, an intro into the dataset, and more.
In this article, see part one of how to analyze relationships in Game of Thrones with NetworksX, Gephi, and Nebula Graph.