Network data is everywhere, and a working knowledge of graph analytics is becoming increasingly important for data scientists. One challenge they often face is that graph analytics solutions do not scale well. In this post, I discuss how you can combine py2neo with neo4j to build a scalable graph analytics solution from scratch.

Many Python libraries have been created to perform graph analytics. The most popular ones are networkx, scikit-network, and graph-tool. All of these packages are great; however, if you are working with large amounts of data, you might want to consider using the power of neo4j.

Neo4j is a graph database, which means that it is designed specifically for the storage and analysis of large graph datasets. Think of the transactional databases of supermarkets or network data from social media platforms. The neo4j community edition is a free version of neo4j that anyone can download.

**Py2neo** is a Python package that lets you use the power of neo4j from Python. It works by establishing a connection to neo4j, which allows you to execute queries on the neo4j database and write the results to a pandas dataframe (or other data types). Unfortunately, the documentation of py2neo is not perfect (see this thread). Below, I have outlined all the steps you need to take to start using py2neo for your next graph analytics project.
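To make the connect, query, and dataframe flow described above concrete, here is a minimal sketch. It assumes a recent py2neo release, a neo4j server listening on the default Bolt address, and nodes that carry a `name` property. The helper names (`build_top_degree_query`, `query_to_dataframe`), the credentials, and the example label are all hypothetical placeholders, not part of py2neo.

```python
import re


def build_top_degree_query(label: str, limit: int = 10) -> str:
    """Build a Cypher query for the `limit` most-connected nodes with a given label.

    Hypothetical helper: assumes the nodes have a `name` property.
    """
    # Node labels cannot be passed as Cypher parameters, so validate the
    # label before interpolating it into the query text.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", label):
        raise ValueError(f"invalid label: {label!r}")
    return (
        f"MATCH (n:{label})-[r]-() "
        "RETURN n.name AS name, count(r) AS degree "
        "ORDER BY degree DESC "
        f"LIMIT {int(limit)}"
    )


def query_to_dataframe(cypher, uri="bolt://localhost:7687",
                       user="neo4j", password="password"):
    """Run a Cypher query against neo4j and return the result as a pandas DataFrame.

    The URI and credentials are placeholder assumptions; replace them with your own.
    """
    # Imported lazily so the query builder above works without py2neo installed.
    from py2neo import Graph

    graph = Graph(uri, auth=(user, password))
    # Graph.run returns a cursor; to_data_frame() needs pandas to be installed.
    return graph.run(cypher).to_data_frame()
```

With a running server and your own credentials, a call such as `query_to_dataframe(build_top_degree_query("Person", 5))` should return one row per node, with `name` and `degree` columns. Keeping the query construction separate from the connection makes it possible to inspect the Cypher text before anything touches the database.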


Making the py2neo connection to neo4j work will probably be the hardest part of your graph analysis project. With the steps below, however, you can get started in less than 10 minutes!

**Step 1:** Download the neo4j community edition

