This is part one of three.

When we migrated our Big Data solutions to the Google Cloud Platform (GCP) around the beginning of 2019, we were somewhat worried about how widely the costs could vary depending on how we organized everything and on how our internal users would make use of the data provided, mostly in BigQuery (BQ). We did not want our internal users to worry about costs when using the data, but at the same time we did not want silly mistakes to cost us more than necessary, given that we had opted for the on-demand pricing model.

So this post is the story about how we approached the cost issue.

Image credit: https://pixabay.com/photos/savings-budget-investment-money-2789137/

I first want to tell you what we did, in what order, and why. Where we made mistakes, I will share my thoughts on them, along with what we were thinking at the time and what we now consider improvements.

It is important to say, though, that no one on the team was experienced with GCP at the time, and even after more than a year of using it, I’m sure we still have much more to learn.

Training internal users

Most of the cost sources would be managed directly by the team, except for what could become a big chunk of our costs: BigQuery. We have product analysts, data scientists, business analysts, and many others who make use of the data generated by our processes. At the time of the migration, they were all used to Hive, and with the configuration we had then, nothing they could do would generate additional costs. Now the task was both to ease their transition to BQ and to teach them how to be cost-aware.
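One simple habit that helps with cost awareness under on-demand pricing is checking how many bytes a query would scan before running it. The sketch below is a minimal example using the google-cloud-bigquery Python client in dry-run mode; the table name is a placeholder and the $5-per-TB figure is only an assumed on-demand rate, since pricing varies by region and over time.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder query for illustration.
sql = "SELECT field_a, field_b FROM `my_project.my_dataset.my_table`"

# dry_run=True validates the query and reports the bytes it would scan,
# without executing it (and therefore without incurring any cost).
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(sql, job_config=job_config)

bytes_scanned = job.total_bytes_processed
cost_usd = bytes_scanned / 1e12 * 5.0  # assumes ~$5 per TB on-demand pricing
print(f"Estimated scan: {bytes_scanned / 1e9:.2f} GB (~${cost_usd:.4f})")
```

The BigQuery console’s query validator shows the same estimate interactively, so users can get this feedback even without writing any code.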
