Disclaimer: This is ONLY my weekend research-and-development topic; the idea behind it is to learn more about Hadoop application/job logs and the Rust language. The solution is NOT for production use. Use it at your own risk.

Motivation

Everything starts with “Why.” I have been in the Big Data area for years, and one of the critical pain points over those years has been Hadoop job monitoring. You might say: “Wait, there are a lot of tools out there.” Yes, I agree with you. However, from the standpoint of a company’s day-to-day operations, you face a lot of challenges:

  1. How to provide one single view for the platform, application development, and operations teams, without the overhead of running separate monitoring tools for each group.
  2. How to store those metrics consistently, with full support for historical analysis, and even apply ML on top of them to drive platform- or job-level operational improvements.
  3. How to avoid getting lost in all the fancy new monitoring tools.
  4. How to increase tooling adoption and the return on investment (ROI).

Based on those whys and hows, I’m leveraging this long weekend to start some exciting R&D (research and development).
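To make the direction concrete, here is a minimal sketch of the kind of collector I have in mind, assuming metrics are pulled from the YARN ResourceManager REST API (/ws/v1/cluster/apps) and stored in SQLite for historical analysis. The ResourceManager address, the table schema, and the crate choices (reqwest, rusqlite, serde_json) are all illustrative assumptions, not a finished design.

```rust
// Minimal sketch (illustrative assumptions, not the final design):
// pull application metrics from the YARN ResourceManager REST API
// and persist them in a local SQLite file for historical analysis.
//
// Assumed Cargo.toml dependencies:
//   reqwest    = { version = "0.11", features = ["blocking", "json"] }
//   rusqlite   = { version = "0.29", features = ["bundled"] }
//   serde_json = "1"

use rusqlite::{params, Connection};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical ResourceManager address; point this at your own cluster.
    let rm = "http://localhost:8088";

    // The ResourceManager lists applications under /ws/v1/cluster/apps.
    let body: serde_json::Value =
        reqwest::blocking::get(format!("{rm}/ws/v1/cluster/apps"))?.json()?;

    // Keep a few per-application metrics in a local SQLite database.
    let conn = Connection::open("hadoop_metrics.db")?;
    conn.execute(
        "CREATE TABLE IF NOT EXISTS apps (
            id TEXT PRIMARY KEY, user TEXT, name TEXT, state TEXT,
            final_status TEXT, elapsed_ms INTEGER,
            memory_seconds INTEGER, vcore_seconds INTEGER
        )",
        [],
    )?;

    // Per the YARN REST docs, the response shape is {"apps": {"app": [...]}}.
    if let Some(apps) = body["apps"]["app"].as_array() {
        for app in apps {
            conn.execute(
                "INSERT OR REPLACE INTO apps VALUES (?1,?2,?3,?4,?5,?6,?7,?8)",
                params![
                    app["id"].as_str(),
                    app["user"].as_str(),
                    app["name"].as_str(),
                    app["state"].as_str(),
                    app["finalStatus"].as_str(),
                    app["elapsedTime"].as_i64(),
                    app["memorySeconds"].as_i64(),
                    app["vcoreSeconds"].as_i64(),
                ],
            )?;
        }
    }
    Ok(())
}
```

Running a collector like this on a schedule would accumulate the consistent, queryable history that the second challenge above calls for.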

Keep learning and fail fast

This weekend’s build-and-learn is, of course, in my current favorite language: Rust.

#hadoop #sqlite #rust #research-and-development #hdfs
