Luna Mosciski

1595939340

How Nebula Graph Stores a One Trillion Connections Social Network

WeChat is one of the world's social network apps that deal with large-scale heterogeneous graphs. The dataset to be processed has:

  • One trillion edges/connections
  • A total dataset of 150TB
  • An hourly update of 100 billion connections

Handling data at this scale is a huge challenge, and the team at WeChat encountered problems when using Nebula Graph, an open source distributed graph database.

However, through deep customization of the database, the team has implemented the features it needed. They include storage for big data, fast import of large data sets, version control, rollback within seconds, and database access at millisecond latency.
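To put the dataset figures quoted above in perspective, here is a back-of-envelope calculation. It uses only the three numbers from the list and assumes decimal terabytes; whether the 150TB includes replicas or vertex data is not stated, so the per-edge figure is only an average over the whole dataset.

```python
# Back-of-envelope figures derived only from the numbers quoted in the article.
TOTAL_EDGES = 1_000_000_000_000      # one trillion edges
TOTAL_BYTES = 150 * 10**12           # 150 TB, assuming decimal terabytes
HOURLY_UPDATES = 100_000_000_000     # 100 billion connections per hour

bytes_per_edge = TOTAL_BYTES / TOTAL_EDGES
updates_per_second = HOURLY_UPDATES / 3600
hourly_ratio = HOURLY_UPDATES / TOTAL_EDGES

print(f"average storage per edge: {bytes_per_edge:.0f} bytes")
print(f"sustained update rate: {updates_per_second:,.0f} edges/second")
print(f"hourly updates relative to total edges: {hourly_ratio:.0%}")
```

In other words, the hourly update volume is equivalent to a tenth of the whole graph, which is why import speed and rollback matter as much as raw storage.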

The Challenges Facing Large Internet Companies

Most well-known graph databases are not capable of dealing with truly big data. For example, the community edition of Neo4j provides a single-host service and is widely adopted in the knowledge graph area. However, it falls short when the data set is very large, and large data sets are increasingly common in today’s business world.

There are also issues such as data consistency and disaster recovery to consider if you choose a multi-replica implementation. JanusGraph has solved the big data storage problem by using external metadata management, key-value (kv) storage and indexes, yet its performance has been widely criticized. In the WeChat team's evaluation, most graph database solutions performed many times better than JanusGraph.

Some Internet companies build their own graph databases. These self-developed solutions cater to their own business requirements rather than to general graph scenarios, so they support only a limited subset of query syntax.

GeaBase From Ant Financial

GeaBase is another option, mainly used in the finance industry. It features a self-developed query language, pushdown computation and millisecond latency. Its main usage scenarios include risk management in financial organizations. To this end, it supports a transaction network with trillions of edges/relationships, storing real-time transaction data and enabling real-time fraud detection.

It is also used for recommendation engines, such as stock and securities recommendations. For Ant Forest, it provides the capability to store trillions of nodes with strong data consistency and low-latency querying. It also offers a GNN feature, Dynamic Graph CNN, for online inference on dynamic graphs.

iGraph From Alibaba

There is also iGraph, a graph indexing and query system. It stores user behavior information and serves as one of the four backbone middle platforms in Alibaba. iGraph has adopted Gremlin as its graph query language for real-time queries of e-commerce relationships.

ByteGraph From ByteDance (Parent Company of TikTok)

ByteGraph adds a cache layer on top of the kv layer and splits relationships into B+ trees for efficient edge access and data sampling. The structure is similar to Facebook's TAO.

Architecture of the WeChat Big Data Solution

The WeChat team has come up with the following architecture to solve the big data storage and processing problem.

Why Nebula Graph?

As seen in the architecture above, a graph database is the main component of the solution. WeChat ended up selecting Nebula Graph as the starting point of its journey in exploring graph databases.

WeChat found that Nebula Graph had the most potential for handling huge dataset storage needs, thanks to its dataset partitioning and independent relationship storage. It also offered pushdown computation and MPP optimization on top of a strongly consistent storage engine. Finally, the Nebula Graph team had extensive experience in the graph database field and a proven abstraction model for big data.

Problems in Practice With Nebula Graph

Insufficient Memory

The WeChat team encountered memory issues. In essence, it was a trade-off between performance and resources. Memory usage cannot be neglected in an application dealing with large-scale datasets. Several components in RocksDB contribute to memory usage: the block cache, indexes and bloom filters, memtables, and blocks pinned by iterators.

So the WeChat team moved to optimize memory utilization, beginning with the block cache. It adopted a global LRU cache to control the cache usage of all RocksDB instances on a machine.
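The idea of one global LRU cache can be sketched as follows. This is a minimal illustration in Python, not RocksDB's API and not the WeChat team's code; the class name `SharedLRUCache` and the instance names `db0`, `db1` and `db2` are hypothetical. The point it shows is that a single byte budget covers every instance, so eviction picks the least recently used block on the machine regardless of which instance owns it.

```python
from collections import OrderedDict


class SharedLRUCache:
    """One byte-budgeted LRU cache shared by every store on a machine."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._blocks = OrderedDict()  # (instance_id, block_id) -> bytes

    def get(self, instance_id, block_id):
        key = (instance_id, block_id)
        block = self._blocks.get(key)
        if block is not None:
            self._blocks.move_to_end(key)  # mark as most recently used
        return block

    def put(self, instance_id, block_id, block):
        key = (instance_id, block_id)
        if key in self._blocks:
            self.used_bytes -= len(self._blocks.pop(key))
        self._blocks[key] = block
        self.used_bytes += len(block)
        # Evict least recently used blocks, whichever instance owns them.
        while self.used_bytes > self.capacity_bytes and self._blocks:
            _, evicted = self._blocks.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = SharedLRUCache(capacity_bytes=8)
cache.put("db0", 1, b"aaaa")
cache.put("db1", 1, b"bbbb")
cache.get("db0", 1)            # db0's block is now the most recently used
cache.put("db2", 1, b"cccc")   # over budget: db1's block is evicted
```

With per-instance caches, the total memory grows with the number of instances on a host; with a shared budget it stays fixed, and busy instances naturally take a larger share of it.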

Then the team optimized the bloom filter. An edge is designed as a key-value pair and stored in RocksDB. If all keys are stored in a bloom filter and each key occupies 10 bits, then the memory required by the entire filter will exceed the machine memory by a large margin.
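A rough calculation shows the scale of the problem. It assumes one key per edge and uses the one-trillion-edge figure quoted at the start of the article; the number of machines and their memory sizes are not given, so the result is a cluster-wide total before any replication.

```python
TOTAL_EDGE_KEYS = 1_000_000_000_000   # assumption: one key per edge
BITS_PER_KEY = 10

filter_bytes = TOTAL_EDGE_KEYS * BITS_PER_KEY // 8
print(f"whole-key bloom filter, cluster-wide: {filter_bytes / 10**12:.2f} TB")
```

That is 1.25 TB of filter data that would ideally stay resident in memory, on top of the block cache and memtables.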

The team observed that most requests fetch the list of edges for a specific node. Therefore, the team adopted a prefix bloom filter. Another optimization created indexes for properties on vertices, which accelerates most requests. As a result, the memory usage of the filter on a single host is at the gigabyte level, without sacrificing the speed of most requests.
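The following sketch illustrates why a prefix bloom filter is so much smaller. It is a toy filter written from scratch, not RocksDB's implementation, and the key layout (an 8-byte source vertex ID followed by an 8-byte destination ID) is an assumption made for the example; the article does not describe the actual edge key format. Only the source-vertex prefix is inserted, so the number of filter entries tracks the number of vertices rather than the number of edges.

```python
import hashlib
import struct


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive the bit positions from two 64-bit words of one digest.
        digest = hashlib.sha256(item).digest()
        h1, h2 = struct.unpack_from("<QQ", digest)
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


PREFIX_LEN = 8  # hypothetical layout: 8-byte source vertex ID, then 8-byte destination ID


def edge_key(src, dst):
    return struct.pack(">QQ", src, dst)


edges = [(1, 10), (1, 11), (1, 12), (2, 10)]
prefix_filter = BloomFilter(num_bits=1024, num_hashes=7)
distinct_prefixes = set()
for src, dst in edges:
    prefix = edge_key(src, dst)[:PREFIX_LEN]
    prefix_filter.add(prefix)       # one entry per source vertex, whatever its degree
    distinct_prefixes.add(prefix)


def may_have_edges(src):
    """False means the vertex certainly has no edges; True means it may have some."""
    return prefix_filter.may_contain(struct.pack(">Q", src))


print(len(edges), "edges,", len(distinct_prefixes), "filter entries")
print(may_have_edges(1), may_have_edges(2))
```

The trade-off is that such a filter can only answer "does this vertex have any edges here?", which matches the dominant request pattern described above; a lookup for one specific edge gets less help from it. In RocksDB, prefix filtering is configured through a prefix extractor on the column family options.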

#database #graph database #case study #use cases #wechat #nebula graph
