How to Get Started with Neo4j

A graph, in its simplest form, is a collection of nodes and the relationships that connect them. A graph database is a database management system that uses the graph data model (nodes and relationships) to perform create, read, update, and delete (CRUD) operations. Graph databases are designed to treat relationships between nodes as first-class citizens. This means connections between data are stored directly rather than inferred at query time through foreign keys, as in a relational database.

Figure: Sample graph

Graph database concepts

Nodes

A node is an entity (such as a person, place, object, or relevant piece of data) in a graph. The simplest possible graph is a single node.

Figure: Graph node

Labels

Labels group nodes into sets: all nodes tagged with a given label belong to the same set. Labels can also be used to attach metadata (such as indexes or constraints) to certain nodes. In our example graph above, all nodes representing persons are labeled :Person.
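
As a rough illustration of attaching metadata to a label, the sketch below creates an index and a uniqueness constraint scoped to labels. It assumes the syntax of Neo4j 4.4 or later (older versions use different keywords), and the names person_name_index and movie_title_unique are hypothetical, chosen only for this example.

```cypher
// Index on the name property of every node labeled Person.
// The index name is hypothetical.
CREATE INDEX person_name_index IF NOT EXISTS
FOR (p:Person) ON (p.name);

// Uniqueness constraint: no two Movie nodes may share a title.
// The constraint name is hypothetical.
CREATE CONSTRAINT movie_title_unique IF NOT EXISTS
FOR (m:Movie) REQUIRE m.title IS UNIQUE;
```

Both statements apply only to nodes carrying the named label, which is what makes labels a convenient place to hang this kind of metadata.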

Relationships

A relationship is a connection between two nodes. A relationship always has a direction, a type, a start node, and an end node.

Our example graph has ACTED_IN, HAS_CONTACT, and DIRECTED as relationship types. The Chris Evans node has an outgoing relationship, while the “Knives Out” node has an incoming relationship.
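
In Cypher, a relationship is written as an arrow between two nodes, with its type in square brackets. The query below is a sketch that assumes a graph like the example one is loaded and that the relationship between these two nodes is ACTED_IN, which the text does not state explicitly.

```cypher
// Nodes go in parentheses; the arrow points from the start node to the end node.
MATCH (actor:Person {name: "Chris Evans"})-[r:ACTED_IN]->(movie:Movie {title: "Knives Out"})
RETURN type(r)           AS relationshipType,
       startNode(r).name AS startNode,
       endNode(r).title  AS endNode
```

If no matching pattern exists in the database, the query simply returns no rows.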

Properties

Properties are key-value pairs that are used to add attributes to nodes and relationships.

In our example graph, we used the properties name and born on Person nodes, title and released on Movie nodes, and the property roles on the :ACTED_IN relationship.

Property values can be any of the following data types:

Integer
Float
String
Boolean
Point
Date
Time
LocalTime
DateTime
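LocalDateTime
Duration

A property value can also be a list whose elements all share one of these types (for example, a list of strings).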
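
To get a feel for these types, the following standalone query builds one literal value of each. It touches no stored data, so it should run on an empty database; the values and column names are arbitrary examples.

```cypher
RETURN 42                                   AS anInteger,
       3.14                                 AS aFloat,
       "Neo4j"                              AS aString,
       true                                 AS aBoolean,
       point({x: 1.0, y: 2.0})              AS aPoint,
       date("2020-01-15")                   AS aDate,
       time("12:30:00+01:00")               AS aTime,
       localtime("12:30:00")                AS aLocalTime,
       datetime("2020-01-15T12:30:00Z")     AS aDateTime,
       localdatetime("2020-01-15T12:30:00") AS aLocalDateTime,
       duration("P1DT2H")                   AS aDuration
```

The Local variants carry no time zone, while Time and DateTime include an offset or zone.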

Why use a graph database?

We live in a world that is highly connected. Today, companies manage large, interconnected data sets. The best way to leverage data relationships is to use a technology that places great importance on relationships. This is exactly what a graph database does. A graph database stores relationship information as a first-class entity.

Because graph databases do not enforce a rigid schema, they suit agile teams whose business requirements change rapidly. You can add new node labels, relationship types, and properties as needs evolve, without restructuring the data you already have.

Graph databases have been designed to support efficient data retrieval, allowing you to traverse millions of connections in real time.

Graph database systems

There are many graph database systems. The table below shows the top graph databases (source: DB-Engines).

Table: Graph databases

According to that ranking, Neo4j is the most popular graph database system. In this tutorial, we’ll walk you through how to use the Neo4j database.

What is Neo4j?

Neo4j is an open-source, NoSQL, native graph database that provides an ACID-compliant transactional backend for your applications.

Neo4j is said to be a native graph database because it efficiently implements the property graph model down to the storage level. It also provides full database characteristics, such as ACID transaction compliance, cluster support, and runtime failover. Neo4j supports its own query language called Cypher.

Installing Neo4j

There is a variety of ways to interact with and use graph data in Neo4j. For the purpose of this tutorial, we’ll use Neo4j Desktop.

Neo4j Desktop has support for Cypher by default and does not require a separate driver installation. Download Neo4j Desktop for your operating system and then follow the installation instructions.

Cypher query language

Cypher is Neo4j’s graph query language. It allows users to store and retrieve data from the graph database.

Cypher is designed to be easy to learn and read. It is a declarative language with SQL-inspired keywords, and it describes graph patterns visually: nodes are written in parentheses and relationships as arrows between them.

Querying nodes and relationships with Cypher

Before we explore how to query a Neo4j graph database, let’s create a new database and populate it with data.

Open the Neo4j Desktop app and create a new database called learn-neo4j. Open the new database in Neo4j Browser and run the query below to populate it with initial data.

// Sample data: three movies, eight people, and the relationships between them.
// Run the whole block as one query so the variables stay in scope.
// Movies
CREATE (johnnyMnemonic:Movie {title: "Johnny Mnemonic", tagline: "The hottest data on earth. In the coolest head in town", released: 1995})
CREATE (sleepless:Movie {title: "Sleepless in Seattle", tagline: "What if someone you never met, someone you never saw, someone you never knew was the only someone for you?", released: 1993})
CREATE (dreams:Movie {title: "What Dreams May Come", tagline: "After life there is more. The end is just the beginning.", released: 1998})
// People
CREATE (dina:Person {name: "Dina Meyer", born: 1968})
CREATE (ice:Person {name: "Ice-T", born: 1958})
CREATE (keanu:Person {name: "Keanu Reeves", born: 1964})
CREATE (takeshi:Person {name: "Takeshi Kitano", born: 1947})
CREATE (robert:Person {name: "Robert Longo", born: 1953})
CREATE (meg:Person {name: "Meg Ryan", born: 1961})
CREATE (cuba:Person {name: "Cuba Gooding Jr.", born: 1968})
CREATE (vin:Person {name: "Vincent Ward", born: 1956})
// Who acted in which movie (roles is a list-valued relationship property)
CREATE (dina)-[:ACTED_IN {roles: ["Jane"]}]->(johnnyMnemonic)
CREATE (ice)-[:ACTED_IN {roles: ["J-Bone"]}]->(johnnyMnemonic)
CREATE (keanu)-[:ACTED_IN {roles: ["Johnny Mnemonic"]}]->(johnnyMnemonic)
CREATE (takeshi)-[:ACTED_IN {roles: ["Takahashi"]}]->(johnnyMnemonic)
CREATE (meg)-[:ACTED_IN {roles: ["Annie Reed"]}]->(sleepless)
CREATE (cuba)-[:ACTED_IN]->(dreams)
// Who directed which movie
CREATE (robert)-[:DIRECTED]->(johnnyMnemonic)
CREATE (vin)-[:DIRECTED]->(dreams)
// Contacts between people
CREATE (cuba)-[:HAS_CONTACT]->(vin)
CREATE (cuba)-[:HAS_CONTACT]->(meg)
CREATE (meg)-[:HAS_CONTACT]->(dina)
CREATE (robert)-[:HAS_CONTACT]->(meg)
CREATE (robert)-[:HAS_CONTACT]->(vin)
CREATE (robert)-[:HAS_CONTACT]->(cuba)
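
As a quick sanity check after loading, the two queries below count nodes per label and relationships per type. They are a sketch that assumes the database was empty before the load; in that case the statements above should yield 3 Movie nodes, 8 Person nodes, 6 ACTED_IN relationships, 2 DIRECTED relationships, and 6 HAS_CONTACT relationships.

```cypher
// Count nodes grouped by their label set.
MATCH (n)
RETURN labels(n) AS labels, count(*) AS nodeCount;

// Count relationships grouped by type.
MATCH ()-[r]->()
RETURN type(r) AS relationshipType, count(*) AS relationshipCount;
```

If your Neo4j Browser does not accept several statements at once, run each query on its own.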

Matching Nodes

To retrieve nodes from a Neo4j graph, we use the MATCH clause. MATCH searches for the pattern we specify and returns one row for each successful match.

For example, the following query finds every node in the graph:

MATCH (n)
RETURN n

n is a variable that is bound to each matched node in turn. Because the pattern has no label or other restriction, it matches every node in our graph. Here’s the result:

Figure: All nodes in our graph

We can limit our query to search for specific nodes by adding the label of the node.

MATCH (n:Person)
RETURN n

Here’s the result:

╒═══════════════════════════════════════╕
│"n"                                    │
╞═══════════════════════════════════════╡
│{"name":"Dina Meyer","born":1968}      │
├───────────────────────────────────────┤
│{"name":"Ice-T","born":1958}           │
├───────────────────────────────────────┤
│{"name":"Keanu Reeves","born":1964}    │
├───────────────────────────────────────┤
│{"name":"Takeshi Kitano","born":1947}  │
├───────────────────────────────────────┤
│{"name":"Robert Longo","born":1953}    │
├───────────────────────────────────────┤
│{"name":"Meg Ryan","born":1961}        │
├───────────────────────────────────────┤
│{"name":"Cuba Gooding Jr.","born":1968}│
├───────────────────────────────────────┤
│{"name":"Vincent Ward","born":1956}    │
└───────────────────────────────────────┘

The query returned only nodes labeled Person.
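
Beyond labels, a match can be narrowed further with a WHERE clause on property values. The sketch below, which assumes the sample data loaded earlier, lists people born in 1960 or later and returns individual properties rather than whole nodes.

```cypher
// Keep only Person nodes whose born property is 1960 or later.
MATCH (p:Person)
WHERE p.born >= 1960
RETURN p.name AS name, p.born AS born
ORDER BY born, name
```

With the sample data, this should return four rows: Meg Ryan (1961), Keanu Reeves (1964), Cuba Gooding Jr. (1968), and Dina Meyer (1968).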
