Jaeger Persistent Storage with Elasticsearch, Cassandra and Kafka

Jaeger Persistent Storage with Elasticsearch, Cassandra and Kafka

Running systems in production involve requirements for high availability, resilience and recovery from failure. When running cloud-native applications this becomes even more critical, as the base assumption in such environments is that compute nodes will suffer outages, Kubernetes nodes will go down and microservices instances are likely to fail, yet the service is expected to remain up and running.

Running systems in production involve requirements for high availability, resilience and recovery from failure. When running cloud-native applications this becomes even more critical, as the base assumption in such environments is that compute nodes will suffer outages, Kubernetes nodes will go down and microservices instances are likely to fail, yet the service is expected to remain up and running.

In a recent post, I presented the different Jaeger components and best practices for deploying Jaeger in production. In that post, I mentioned that Jaeger uses external services for ingesting and persisting the span data, such as Elasticsearch, Cassandra and Kafka. This is due to the fact that the Jaeger Collector is a stateless service and you need to point it to some sort of storage to which it will forward the span data.

In this post, I’d like to discuss how to ingest and persist Jaeger trace data in production to ensure resilience and high availability, and the external services you need to set up for that. I’ll cover:

  • Standard persistent storage for Jaeger with Elasticsearch and Cassandra
  • Alternative persistent storage with gRPC plugin
  • Handling high load tracing data streams with Kafka
  • Jaeger persistence during development with jaegertracing all-in-one

Deploying Jaeger with Elasticsearch, Kafka or other External Services

Jaeger deployments may involve additional services such as Elasticsearch, Cassandra and Kafka. But do these services come as part of Jaeger’s installation and how are these services deployed?

The Jaeger Operator and Jaeger’s Helm chart (see Jaeger’s deployment tools on this post) offer the option of a self-provisioned Elasticsearch/Cassandra/Kafka cluster (in which Jaeger deployment also deploys these clusters), as well as the option of connecting to an existing cluster.

The self-provisioned option offers a good starting point, but you may prefer to deploy these services independently for better flexibility and control over the way these clusters are deployed, managed, monitored, upgraded and secured, in accordance with your team’s DevOps practices.

jaeger elasticsearch kafka cassandra observability database database-storage distributed-tracing-system

Bootstrap 5 Complete Course with Examples

Bootstrap 5 Tutorial - Bootstrap 5 Crash Course for Beginners

Nest.JS Tutorial for Beginners

Hello Vue 3: A First Look at Vue 3 and the Composition API

Building a simple Applications with Vue 3

Deno Crash Course: Explore Deno and Create a full REST API with Deno

How to Build a Real-time Chat App with Deno and WebSockets

Convert HTML to Markdown Online

HTML entity encoder decoder Online

How to Choose the Right Database for your Requirements

Imagine — You’re in a system design interview and need to pick a database to store, let’s say, order-related data in an e-commerce system. Your data is structured and needs to be consistent, but your query pattern doesn’t match with a standard relational DB’s. You need your transactions to be isolated, and atomic and all things ACID… But OMG it needs to scale infinitely like Cassandra!! So how would you decide what storage solution to choose? Well, let’s see!

Building a distributed multi-video processing pipeline with Python and Confluent Kafka

Large Scale Data Processing and Machine Learning in Real Time. In this article we will explore kafka by doing just that.

Distributed SQL: An Evolution of the Database

The next step in the evolution of database architecture is distributed SQL. Take a look at some of the characteristics here.As organizations transition to the cloud, they eventually find that the legacy relational databases that are behind some of their most critical applications simply do not take advantage of the promise of the cloud and are difficult to scale.

Contact Tracing App: The Technology, Approach to fight COVID-19/Corona

A detailed article on how Contact Tracing Apps can stop the spread of COVID-19/Corona virus. It talks about how it works, benefits, scope, examples and more

Apache Cassandra - Apache Cassandra is a high-performance

Apache Cassandra is a high-performance, extremely scalable, fault tolerant (i.e., no single point of failure), distributed non-relational database solution. Cassandra combines all the benefits of Google Bigtable and Amazon Dynamo to handle the types of database management needs that traditional RDBMS vendors cannot support.