A step-by-step guide to using a Java-based Kafka client in a Node.js application with GraalVM.

The first time I heard about GraalVM, it totally blew my mind. Being able to combine multiple languages in a single application or business logic is an incredibly useful and powerful tool.

A real-life need for a polyglot application emerged once we decided to switch from RabbitMQ to Kafka as our messaging system. Most of our RMQ consumers were written in Node.js, and moving to a different messaging system would force us either to use a Node.js-based library or to rewrite our entire business logic.

While there are several Node.js-based Kafka clients, they come with limitations: the Kafka API version they implement, the interfaces they expose, and the customization options they offer. Using the native Java Kafka client while keeping the Node.js business logic would be a real win for us.

This tutorial builds on this awesome Medium post on developing with Java and JavaScript together using GraalVM.

We will be using Docker Compose to build and create our images.

A working example can be found here.

Setting up Docker

At a minimum, our environment needs GraalVM, Zookeeper and Kafka. The quickest way to get all three is to use Docker and Docker Compose to create a complete running environment:

version: '3.3'
services:
  zookeeper:
    image: 'confluentinc/cp-zookeeper:5.0.0'
    hostname: zookeeper
    ports:
      - '2181:2181'
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zk-data:/var/lib/zookeeper/data
      - zk-log:/var/lib/zookeeper/log
  kafka-broker:
    image: 'confluentinc/cp-kafka:5.0.0'
    ports:
      - '9092:9092'
      - '9093:9093'
    depends_on:
      - 'zookeeper'
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT2://kafka-broker:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT, PLAINTEXT2:PLAINTEXT
      KAFKA_TOPICS: "test_topic"
  graalvm:
    image: 'oracle/graalvm-ce:1.0.0-rc12'
    depends_on:
      - 'kafka-broker'
    volumes:
      - ./:/code
    environment:
      VM: 'graalvm'
      

volumes:
  zk-data:
  zk-log:


A Docker Compose file containing definitions for Zookeeper, Kafka and GraalVM.

Running docker-compose up -d from the containing folder will perform the following:

  1. Download a Zookeeper image and run it on port 2181 along with persistent data and log volumes.
  2. Download a Kafka image containing a Kafka broker, and run it. The broker will connect to Zookeeper at zookeeper:2181, and will allow client connections on ports 9092 and 9093.
  3. Download a GraalVM image. This image has GraalVM and Node.js installed, and mounts the current host folder into the container at /code.

All defined ports are exposed on the local machine (localhost:port). Services also reach each other by service name: the broker connects to Zookeeper at zookeeper:2181, and other containers connect to the Kafka broker at kafka-broker:9093, while clients running on the host machine use localhost:9092.
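Because the broker advertises two listeners, a client has to pick the address that matches where it runs. The following is a minimal sketch of that choice, not part of the original project: the class and method names are hypothetical, and it assumes the VM environment variable set in the Compose file is what marks the GraalVM container.

```java
import java.util.Map;

// Hypothetical helper, for illustration only.
class BootstrapServers {

    // Returns the advertised listener reachable from the caller's location:
    // the Docker network listener inside the graalvm container, localhost otherwise.
    static String resolve(Map<String, String> env) {
        return "graalvm".equals(env.get("VM")) ? "kafka-broker:9093" : "localhost:9092";
    }

    public static void main(String[] args) {
        // Prints the address a client started in this environment should use.
        System.out.println(resolve(System.getenv()));
    }
}
```

Run on the host machine (where VM is not set to graalvm), this resolves to localhost:9092; inside the graalvm service it resolves to kafka-broker:9093.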

Setting up the Java client

We are going to be using Java 1.8 and Maven to compile and run our Java client.

Even though the entire Kafka client will reside in a container, it will be helpful to run and debug our code directly from the host machine, using our favorite IDE. To do that, Maven and Java need to be installed on the host machine. Connection to other containers will be done using localhost as the host name.

Setting up Maven

You can use this tutorial to start a new Maven based Java project, or just use the following pom file:

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>your.group.id</groupId>
  <artifactId>kafka-client</artifactId>
  <version>1.0</version>

  <name>kafka-client</name>
  <!-- FIXME change it to the project's website -->
  <url>http://www.example.com</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients -->
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>2.1.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.slf4j/slf4j-simple -->
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-simple</artifactId>
      <version>1.7.25</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.json/json -->
    <dependency>
      <groupId>org.json</groupId>
      <artifactId>json</artifactId>
      <version>20180813</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <finalName>uber-${project.artifactId}-${project.version}</finalName>
        </configuration>
      </plugin>
    </plugins>
            
  </build>
</project>


The above pom file defines the Java project along with all the required dependencies.

Notice the ‘maven-shade-plugin’ we are using to compile a single ‘uber-jar’ for the client and all of its dependencies. This will make it easier for us to add the client to the Node.js application later.

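Make sure to change your.group.id to your desired package name, and use that same name in place of the my.package.id placeholder in the code below. The placeholder itself will not compile, because package is a reserved word in Java.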

Creating a Kafka client

The next step is creating our Kafka client: first a basic producer, then a consumer.

Add a Producer.java file under src/main/java/my/package/id (the folder path must match the package declared in the file):

package my.package.id;

import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.Iterator;
import org.json.*;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class Producer {

    public static void main(String[] args) {
        Producer p = new Producer("{\"bootstrap.servers\": \"localhost:9092\"}");
        try {
            p.put("test_topic", "msgKey", "msgData");
        }
        catch (Exception e) {
            System.out.println("Error putting: " + e);
        }
        finally {
            // release the producer's network resources once the test message is sent
            p.close();
        }
    }

    private Properties produceProperties;
    private final KafkaProducer<String, String> mProducer;
    private final Logger mLogger = LoggerFactory.getLogger(Producer.class);

    public Producer(String config) {
        extractPropertiesFromJson(config);
        mProducer = new KafkaProducer<>(produceProperties);

        mLogger.info("Producer initialized");
    }

    public void put(String topic, String key, String value) throws ExecutionException, InterruptedException {
        mLogger.info("Put value: " + value + ", for key: " + key);

        ProducerRecord<String, String> record = new ProducerRecord<>(topic, key, value);
        // send asynchronously, then block on get() until the broker acknowledges the record
        mProducer.send(record, (recordMetadata, e) -> {
            if (e != null) {
                mLogger.error("Error while producing", e);
                return;
            }

            mLogger.info("Received new meta. Topic: " + recordMetadata.topic()
                + "; Partition: " + recordMetadata.partition()
                + "; Offset: " + recordMetadata.offset()
                + "; Timestamp: " + recordMetadata.timestamp());
        }).get();
    }

    void close() {
        mLogger.info("Closing producer's connection");
        mProducer.close();
    }

    private void extractPropertiesFromJson(String jsonString) {
        produceProperties = new Properties();
        JSONObject jsonObject = new JSONObject(jsonString.trim());
        Iterator<String> keys = jsonObject.keys();
        while(keys.hasNext()) {
            String key = keys.next();
            // toString() also accepts non-string JSON values such as numbers
            produceProperties.setProperty(key, jsonObject.get(key).toString());
        }
        String serializer = StringSerializer.class.getName();
        produceProperties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, serializer);
        produceProperties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, serializer);
    }
}


The producer in the example above receives its configuration as a JSON string, and sends messages with string keys and values.
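To make the configuration step more concrete, here is a minimal sketch of the same idea that runs without org.json or Kafka on the classpath. It is an illustration under assumptions, not the article's code: the class and method names are hypothetical, the configuration arrives as a plain Map instead of a JSON string, and the serializer is referenced by its class name as a string literal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for extractPropertiesFromJson, using a Map instead of JSON.
class ConfigToProperties {

    static Properties toProducerProperties(Map<String, Object> config) {
        Properties props = new Properties();
        for (Map.Entry<String, Object> entry : config.entrySet()) {
            // String.valueOf lets numeric settings through as well as strings
            props.setProperty(entry.getKey(), String.valueOf(entry.getValue()));
        }
        // Keys and values are sent as strings, so both use Kafka's StringSerializer.
        String serializer = "org.apache.kafka.common.serialization.StringSerializer";
        props.setProperty("key.serializer", serializer);
        props.setProperty("value.serializer", serializer);
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("bootstrap.servers", "localhost:9092");
        config.put("retries", 3);

        Properties props = toProducerProperties(config);
        System.out.println(props.getProperty("bootstrap.servers")); // prints "localhost:9092"
        System.out.println(props.getProperty("retries"));           // prints "3"
    }
}
```

The resulting Properties object has the same shape as the one handed to the KafkaProducer constructor: the caller's settings plus the two serializer entries.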

The main function in the Producer is an easy way of running the code and sending a test message.

Add a Consumer.java file in the same folder:

package my.package.id;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.Properties;
import java.util.Collections;
import java.util.Iterator;
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import org.json.*;
import java.util.Queue; 

public class Consumer {

    // a concurrent queue shared with Node
    private final Queue<Object> mQueue;     
    private Properties consumProperties;
    private final Logger mLogger = LoggerFactory.getLogger(Consumer.class.getName());
    
    public Consumer(Queue<Object> queue, String config){
      mQueue = queue;
      extractPropertiesFromJson(config);
    }

    public void start() {
        CountDownLatch latch = new CountDownLatch(1);

        ConsumerRunnable consumerRunnable = new ConsumerRunnable(consumProperties, latch, mQueue);
        Thread thread = new Thread(consumerRunnable);
        thread.start();
    
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            mLogger.info("Caught shutdown hook");
            consumerRunnable.shutdown();
            await(latch);

            mLogger.info("Application has exited");
        }));
    }

    private void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            mLogger.error("Application got interrupted", e);
        } finally {
            mLogger.info("Application is closing");
        }
    }
    
    private void extractPropertiesFromJson(String jsonString) {
        consumProperties = new Properties();
        JSONObject jsonObject = new JSONObject(jsonString.trim());
        Iterator<String> keys = jsonObject.keys();
        while(keys.hasNext()) {
            String key = keys.next();
            consumProperties.setProperty(key, (String)jsonObject.get(key));
        }
        String deserializer = StringDeserializer.class.getName();
        consumProperties.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, deserializer);
        consumProperties.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, deserializer);
    }

    private class ConsumerRunnable implements Runnable {

        private final KafkaConsumer<String, String> mConsumer;
        private final CountDownLatch mLatch;
        private final Queue<Object> mQueue;

        ConsumerRunnable(Properties config, CountDownLatch latch, Queue<Object> queue) {
            mLatch = latch;
            mQueue = queue;
            String topic = (String)config.get("topic");
            config.remove("topic");
            mConsumer = new KafkaConsumer<>(config);
            mConsumer.subscribe(Collections.singletonList(topic));
        }

        @Override
        public void run() {
            try {
                while (true) {
                    ConsumerRecords<String, String> records = mConsumer.poll(Duration.ofMillis(100));

                    for (ConsumerRecord<String, String> record : records) {
                        mLogger.info("Key: " + record.key() + ", Value: " + record.value());
                        mLogger.info("Partition: " + record.partition() + ", Offset: " + record.offset());
                        mQueue.offer(record);
                    }
                }
            } catch (WakeupException e) {
                mLogger.info("Received shutdown signal!");
            } finally {
                mConsumer.close();
                mLatch.countDown();
            }
        }

        public void shutdown() {
            mConsumer.wakeup();
        }
    }
}


Like the producer, this consumer receives its configuration as a JSON string.
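One detail worth noting is that the consumer's configuration also carries a topic entry, which ConsumerRunnable reads and removes before the remaining properties are passed to KafkaConsumer. The sketch below isolates that step using only java.util.Properties; the class and method names are hypothetical and exist only for this illustration.

```java
import java.util.Properties;

// Hypothetical illustration of separating the topic from the Kafka settings.
class TopicSplit {

    // Removes the "topic" entry from the configuration and returns its value,
    // leaving only the settings intended for the Kafka consumer itself.
    static String takeTopic(Properties config) {
        return (String) config.remove("topic");
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("bootstrap.servers", "localhost:9092");
        config.setProperty("group.id", "Test_Group");
        config.setProperty("topic", "test_topic");

        String topic = takeTopic(config);
        System.out.println(topic);                       // prints "test_topic"
        System.out.println(config.containsKey("topic")); // prints "false"
    }
}
```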

After configuring our consumer, we start a new thread that connects to our Kafka broker and polls for messages. Each new message is pushed into a queue which will later be used in our Node.js application.
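The hand-off through the queue can be tried out in plain Java, without Kafka or Node.js. The following is a simplified sketch under stated assumptions: a list of strings stands in for the polled Kafka records, the method that takes from the queue stands in for the Node.js side, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical illustration of the thread-to-queue hand-off.
class QueueHandOff {

    // Stand-in for the polling thread: pushes each "record" into the shared queue,
    // then counts the latch down so callers know the thread has finished.
    static Thread startPoller(List<String> records, BlockingQueue<Object> queue, CountDownLatch done) {
        Thread thread = new Thread(() -> {
            try {
                for (String record : records) {
                    queue.offer(record);
                }
            } finally {
                done.countDown();
            }
        });
        thread.start();
        return thread;
    }

    // Stand-in for the receiving side: take() blocks until an element is available.
    static List<Object> takeN(BlockingQueue<Object> queue, int n) throws InterruptedException {
        List<Object> received = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            received.add(queue.take());
        }
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Object> queue = new LinkedBlockingDeque<>();
        CountDownLatch done = new CountDownLatch(1);

        startPoller(Arrays.asList("a", "b", "c"), queue, done);
        System.out.println(takeN(queue, 3)); // prints "[a, b, c]"

        done.await(); // wait for the poller thread to finish, as the shutdown hook does
    }
}
```

A blocking queue fits here because the receiving side can simply wait in take() until the polling thread offers a new element, so no busy-waiting is needed.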

Compiling the code

Running mvn package from within the root folder will compile the code into a single jar file named ‘uber-kafka-client-1.0.jar’ under the target folder. This file contains all the required Java code and dependencies, and will be used as a Java library.

Setting up a Node.js Application

Last but not least is our Node.js application.

Add an index.js file under node/services/kafka-user:

const {Worker} = require('worker_threads');

function JavaToJSNotifier() {
    this.queue = new java.util.concurrent.LinkedBlockingDeque();
    this.worker = new Worker(`
        const { workerData, parentPort } = require('worker_threads');
        while (true) {
          // block the worker waiting for the next notification from Java
          var data = workerData.queue.take();
          // notify the main event loop that we got new data 
          parentPort.postMessage(data);
        }`,
        { eval: true, workerData: { queue: this.queue }, stdout: true, stderr: true });
}

const config = {
    "bootstrap.servers": (process.env.VM === 'graalvm') ? 'kafka-broker:9093' : 'localhost:9092'
};

const Consumer = Java.type('my.package.id.Consumer');
config.topic = "test_topic";
config['group.id'] = 'Test_Group';

const asyncJavaEvents = new JavaToJSNotifier();
asyncJavaEvents.worker.on('message', (n) => {
    console.log(`Got new data from Java! ${n}`);
});

const mConsumer = new Consumer(asyncJavaEvents.queue, JSON.stringify(config));
mConsumer.start();


The code above creates and configures a new Kafka consumer, and then uses node’s experimental workers to create a new thread that listens to messages from that consumer. The consumer thread notifies the main thread when a new message arrives.

Notice the this.queue = new java.util.concurrent.LinkedBlockingDeque() on line 4. This is possible because we are running on GraalVM. This queue will be a shared instance with the Java consumer we previously defined.

Also, notice the const Consumer = Java.type('my.package.id.Consumer') on line 20. Again, this is possible thanks to GraalVM, and will hold a reference to our Java-based Kafka consumer.

Running the code

The GraalVM image we pulled earlier already contains Node.js and GraalVM. If you wish to run the Node.js application on the host machine instead, you will need to install and configure GraalVM there (instructions).

To run our code inside the container, open a terminal from the root folder and type docker-compose run graalvm sh.

This will open a shell within the GraalVM image.

Due to our volume configuration, all of our compiled code and scripts are available inside the container under the /code folder.

Run the following command:

node --polyglot --jvm --jvm.cp=code/target/uber-kafka-client-1.0.jar --experimental-worker code/node/services/kafka-user/index.js

This command will run our Node.js application as a polyglot application on a JVM. Notice the --jvm.cp parameter, which tells the JVM where to find our Java-based Kafka client.

Trying it out

Keep the terminal open, go back to the Java IDE, and run the Producer.main procedure.

You should now see the consumed record logged in your terminal, including a line starting with Got new data from Java! printed by index.js.

Success!!

Summary

GraalVM makes writing polyglot applications easy. Adding a Docker infrastructure makes it even easier to develop and run cross-language applications just about anywhere.

The possibilities are virtually endless.

I hope this helps some of you and maybe inspires you to create some cross-language solutions to a real life problem you are facing.
