TigerGraph can be combined with Docker to run on nearly any OS. In this article, we tackle the issue of the large TigerGraph image.

TigerGraph is my graph database and graph analytics platform of choice because it is fast, scalable, and has an active open-source community. I regularly run TigerGraph locally because there are no TigerGraph Cloud servers near my location.

At the time of writing, the TigerGraph software requirements specify support for the following operating systems:

  • Red Hat and CentOS versions 6.5–6.9, 7.0–7.4, and 8.0–8.2
  • Ubuntu 14.04, 16.04, and 18.04
  • Debian 8

For anyone using an operating system outside this list, a logical solution is containerization: Docker, in the case of this article.

In this article we will cover:

  • How to make use of the official TigerGraph image and what’s inside
  • Stripping the official Docker image of unnecessary bloat
  • Using Docker Compose to run TigerGraph images

The Official TigerGraph Image

The official TigerGraph image, which runs the Developer Edition, can be obtained with the following command:

docker pull docker.tigergraph.com/tigergraph-dev:latest

Run with:

docker run -d -p 14022:22 -p 9000:9000 -p 14240:14240 --name tigergraph_dev --ulimit nofile=1000000:1000000 -v ~/data:/home/tigergraph/mydata -t docker.tigergraph.com/tigergraph-dev:latest
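The one-line command above is easier to read and reuse when split across lines. The following is a minimal sketch with the same flags wrapped in a shell function; the names `tg_run`, `DATA_DIR`, and `DOCKER` are hypothetical helpers introduced here for illustration, not part of TigerGraph or Docker.

```shell
# Sketch: the same flags as the one-line command, wrapped in a function.
IMAGE="docker.tigergraph.com/tigergraph-dev:latest"
DATA_DIR="${DATA_DIR:-$HOME/data}"   # hypothetical override for the host data folder

# DOCKER is a hypothetical switch: leave it unset to call the real CLI,
# or set DOCKER=echo to print the arguments instead of running them.
tg_run() {
  "${DOCKER:-docker}" run -d \
    -p 14022:22 \
    -p 9000:9000 \
    -p 14240:14240 \
    --name tigergraph_dev \
    --ulimit nofile=1000000:1000000 \
    -v "$DATA_DIR:/home/tigergraph/mydata" \
    -t "$IMAGE"
}
```

Calling `tg_run` starts the container; running it with `DOCKER=echo` set first gives a dry run that only prints the arguments.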

This source gives more in-depth instructions on how the image is constructed, but in summary:

  • A base image of Ubuntu 16.04 is used
  • All required software, such as tar and curl, is installed
  • Optional software, such as emacs, vim, and wget, is installed
  • The GSQL 101 and 102 tutorials and the GSQL Algorithms library are downloaded
  • Three notable ports can be exposed to communicate with the server: the SSH server (22), the REST++ API (9000), and GraphStudio (14240)
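To confirm that the three services are reachable from the host, a quick port probe can help. This is a rough sketch that assumes the host-side port mappings from the `docker run` command above and a netcat build that supports `-z` and `-w` (not all do); `tg_check_ports` is a hypothetical helper name.

```shell
# Host-side ports taken from the `docker run` command above.
TG_PORTS="14022:SSH 9000:REST++ 14240:GraphStudio"

# Print "<service> <port> open|closed" for each mapping.
# A missing `nc` binary is reported as "closed" as well.
tg_check_ports() {
  host="${1:-localhost}"
  for entry in $TG_PORTS; do
    port="${entry%%:*}"   # text before the colon
    name="${entry#*:}"    # text after the colon
    if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
      echo "$name $port open"
    else
      echo "$name $port closed"
    fi
  done
}
```

Run `tg_check_ports` for localhost, or pass another hostname as the first argument. An open port only shows that something is listening; the TigerGraph services inside the container may still need to be started before they respond.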

The total image is a download of roughly 1.8–2.0 GB (depending on the version), which puts considerable strain on bandwidth, especially in resource-sensitive use cases like CI/CD. Another notable point is that all one needs to use TigerGraph is a GSQL socket connection, which tools such as Giraffle and pyTigerGraph can interface with.
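To see how large the image is on your own machine, the size can be read from Docker directly. This is a small sketch: `bytes_to_mb` and `image_size_mb` are hypothetical helper names, and the figure reported is the size on disk after extraction, which is typically larger than the compressed download.

```shell
IMAGE="docker.tigergraph.com/tigergraph-dev:latest"

# Convert a byte count to whole megabytes (decimal, 1 MB = 1,000,000 bytes).
bytes_to_mb() {
  echo $(( $1 / 1000000 ))
}

# Print the local size of an image in MB.
# `docker image inspect --format '{{.Size}}'` reports the size in bytes.
image_size_mb() {
  bytes=$(docker image inspect --format '{{.Size}}' "${1:-$IMAGE}") || return 1
  bytes_to_mb "$bytes"
}
```

After pulling the image, `image_size_mb` prints its size; recording that number before and after trimming gives a simple way to measure the saving.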

I’ve identified two large sources of bloat:

  • The optional and unnecessary software, e.g. vim and the GSQL 101 tutorial
  • GraphStudio and binaries not necessary for the minimal operation of TigerGraph Developer Edition


Efficient Use of TigerGraph and Docker