We, the Expedia Group™ Data Platform Team, are building the next-gen petabyte-scale data lake. This next stage in the evolution of our data lake is based on our Apiary data lake pattern and uses a number of our open-source components, such as Waggle Dance and Circus Train. A Hive metastore (HMS) proxied by the Waggle Dance service is usually the first point of contact for a user query to discover and analyze data. That makes the Hive metastore a critical piece of infrastructure.

We install a number of listeners in the Hive metastore to enable a variety of event-based use cases, such as Shunting Yard, Cloverleaf, Beekeeper, and Ranger policies. Some of the open-source listeners that we use are:


Drone Fly — Decoupling Event Listeners from the Hive Metastore