Kaustav Hazra


Integrate Kafka and MQTT - How to connect MQTT and Kafka

MQTT and Kafka can be integrated. We will show 5 different options on how to connect MQTT & Kafka. We will implement 2 of these options together.

The code can be found on GitHub:


Subscribe: https://www.youtube.com/channel/UCyXsLg7-8OKM7Cc0dsmntDA

#kafka #mqtt #python

What is GEEK

Buddha Community

Integrate Kafka and MQTT - How to connect MQTT and Kafka

Kafka for XML Message Integration and Processing

XML messages and XML Schema are not very common in the Apache Kafka and Event Streaming world! Why? Many people call XML legacy. It is complex, verbose, and often associated with the ugly WS-* Hell (SOAP, WSDL, etc). On the other side, every company older than five years uses XML. It is well understood, provides a good structure, and is human- and machine-readable.

This post does not want to start another flame war between XML and other technologies such as JSON (which also provides JSON Schema now), Avro, or Protobuf. Instead, I will walk you through the three main approaches to integrate between Kafka and XML messages as there is still a vast demand for implementing this integration today (often for integrating legacy applications and middleware).

XML and XML Schema

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium’s XML 1.0 Specification of 1998 and several other related specifications — all of them free open standards — define XML.

The design goals of XML emphasize simplicity, generality, and usability across the Internet. It is a textual data format with strong support via Unicode for different human languages. Although the design of XML focuses on documents, the language is widely used for the representation of arbitrary data structures such as those used in web services. Several schema systems exist to aid in defining XML-based languages, while programmers have developed many application programming interfaces (APIs) to assist the processing of XML data.

#open source #big data #integration #xml #json #kafka #middleware #event streaming #kafka connect platform #kafka connectors

Kaustav Hazra


Integrate Kafka and MQTT - How to connect MQTT and Kafka

MQTT and Kafka can be integrated. We will show 5 different options on how to connect MQTT & Kafka. We will implement 2 of these options together.

The code can be found on GitHub:


Subscribe: https://www.youtube.com/channel/UCyXsLg7-8OKM7Cc0dsmntDA

#kafka #mqtt #python

Obie  Rowe

Obie Rowe


Apache Kafka and SAP ERP Integration Options

A question I get every week from customers across the globe is, “How can I integrate my SAP system with Apache Kafka?” This post explores various alternatives, including connectors, third party tools, custom glue code, and trade-offs between the different options.

After exploring what SAP is, I will discuss several integration options between Apache Kafka and SAP systems:

  • Traditional middleware (ETL/ESB)
  • Web services (SOAP/REST)
  • 3rd party turnkey solutions
  • Kafka-native connectivity with Kafka Connect
  • Custom glue code using SAP SDKs
Disclaimer before you read on:

I am not an SAP expert. It is tough to stay up-to-date with the vast and complex ecosystem of SAP products, (re-)brands, versions, services, SDKs, and APIs. I am sorry if some of the below information is not 100% accurate or outdated. Always double-check on the SAP website (if the links from Google still work, I had some issues with some pages “no longer available” while researching for this blog post). If you see any inaccurate or missing information, please let me know and I will update the blog post.

What is SAP?

SAP is a German multinational software corporation that makes enterprise software to manage business operations and customer relations.

It is quite interesting: Nobody asks how to integrate with IBM or Oracle. Instead, people more specifically ask how to integrate with IBM MQ, IBM DB2, IBM Mainframe (still very ambiguous), or any other of the hundreds of IBM products.

For SAP, people ask: “How can I integrate with SAP?” Let’s clarify what SAP is before exploring integration options.

The company is primarily known for its ERP software. But if you check out the official “What is SAP?” page, you find out that SAP offers solutions across a wide range of areas:

  • ERP and Finance
  • CRM and Customer Experience
  • Network and Spend Management
  • Digital Supply Chain
  • HR and People Engagement
  • Experience Management
  • Business Technology Platform
  • Digital Transformation
  • Small and Midsize Enterprises
  • Industry Solutions

SAP’s Software Portfolio

SAP’s stack includes homegrown products like SAP ERP and acquisitions with their own codebase, including Ariba for supplier network, hybris for e-commerce solutions, Concur for travel & expense management, and Qualtrics for experience management.

Even if you talk about SAP ERP, the situation is still not that easy. Most companies still run SAP ERP Central Component (ECC, formerly called SAP R/3), SAP’s sophisticated (and aged) ERP product. ECC runs on a third-party relational database from Oracle, IBM, or Microsoft, while HANA is SAP’s in-memory database. The new ERP product is SAP S4/Hana (no, this is not just the famous in-memory database). Oh, and there is SAP S4/Hana Cloud. And before you wonder: No, this is not the same feature set as the on-premise version!

Various interfaces exist depending on your product. An interface can be an (awful) proprietary technologies like BAPI or iDoc, (okayish) standards-based web service APIs using SOAP or REST / HTTP, a (non-scalable) JDBC database connectivity, or if you are lucky even a (scalable and real-time) Event/Messaging API.

#integration #kafka #sap #middleware #sap erp #sap hana #kafka connect platform #kafka connectors #bapi #idoc

PostgreSQL Connection Pooling: Part 4 – PgBouncer vs. Pgpool-II

In our previous posts in this series, we spoke at length about using PgBouncer  and Pgpool-II , the connection pool architecture and pros and cons of leveraging one for your PostgreSQL deployment. In our final post, we will put them head-to-head in a detailed feature comparison and compare the results of PgBouncer vs. Pgpool-II performance for your PostgreSQL hosting !

The bottom line – Pgpool-II is a great tool if you need load-balancing and high availability. Connection pooling is almost a bonus you get alongside. PgBouncer does only one thing, but does it really well. If the objective is to limit the number of connections and reduce resource consumption, PgBouncer wins hands down.

It is also perfectly fine to use both PgBouncer and Pgpool-II in a chain – you can have a PgBouncer to provide connection pooling, which talks to a Pgpool-II instance that provides high availability and load balancing. This gives you the best of both worlds!

Using PgBouncer with Pgpool-II - Connection Pooling Diagram

PostgreSQL Connection Pooling: Part 4 – PgBouncer vs. Pgpool-II


Performance Testing

While PgBouncer may seem to be the better option in theory, theory can often be misleading. So, we pitted the two connection poolers head-to-head, using the standard pgbench tool, to see which one provides better transactions per second throughput through a benchmark test. For good measure, we ran the same tests without a connection pooler too.

Testing Conditions

All of the PostgreSQL benchmark tests were run under the following conditions:

  1. Initialized pgbench using a scale factor of 100.
  2. Disabled auto-vacuuming on the PostgreSQL instance to prevent interference.
  3. No other workload was working at the time.
  4. Used the default pgbench script to run the tests.
  5. Used default settings for both PgBouncer and Pgpool-II, except max_children*. All PostgreSQL limits were also set to their defaults.
  6. All tests ran as a single thread, on a single-CPU, 2-core machine, for a duration of 5 minutes.
  7. Forced pgbench to create a new connection for each transaction using the -C option. This emulates modern web application workloads and is the whole reason to use a pooler!

We ran each iteration for 5 minutes to ensure any noise averaged out. Here is how the middleware was installed:

  • For PgBouncer, we installed it on the same box as the PostgreSQL server(s). This is the configuration we use in our managed PostgreSQL clusters. Since PgBouncer is a very light-weight process, installing it on the box has no impact on overall performance.
  • For Pgpool-II, we tested both when the Pgpool-II instance was installed on the same machine as PostgreSQL (on box column), and when it was installed on a different machine (off box column). As expected, the performance is much better when Pgpool-II is off the box as it doesn’t have to compete with the PostgreSQL server for resources.

Throughput Benchmark

Here are the transactions per second (TPS) results for each scenario across a range of number of clients:

#database #developer #performance #postgresql #connection control #connection pooler #connection pooler performance #connection queue #high availability #load balancing #number of connections #performance testing #pgbench #pgbouncer #pgbouncer and pgpool-ii #pgbouncer vs pgpool #pgpool-ii #pooling modes #postgresql connection pooling #postgresql limits #resource consumption #throughput benchmark #transactions per second #without pooling

Ruth  Nabimanya

Ruth Nabimanya


Tips About Kafka Connect On Heroku You Can't Afford To Miss


With ever-increasing demands from other business units, IT departments have to be constantly looking for service improvements and cost-saving opportunities. This article showcases several concrete use-cases for companies that are investigating or already using Kafka, in particular, Kafka Connect.

Kafka Connect is an enterprise-grade solution for integrating a plethora of applications, ranging from traditional databases to business applications like Salesforce and SAP. Possible integration scenarios range from continuously streaming events and data between applications to large-scale, configurable batch jobs that can be used to replace manual data transfers.

#kafka-connect #kafka #heroku #database #database-architecture #apache-kafka #tutorial #cluster