We, the Expedia Group™ Data Platform Team, are building the next-gen petabyte-scale data lake. This next stage in the evolution of our data lake is based on our Apiary data lake pattern and uses a number of our open-source components, such as Waggle Dance and Circus Train. A Hive metastore (HMS) proxied by the Waggle Dance service is usually the first point of contact when a user query needs to discover and analyze data. That makes the Hive metastore a critical piece of infrastructure.
We install a number of listeners in the Hive metastore to enable a variety of event-based use cases, such as Shunting Yard, Cloverleaf, Beekeeper, and Ranger policies. Some of the open-source listeners that we use are: