Bryan Jacky


Manage Files in Google Cloud Storage With Python

I recently worked on a project which combined two of my life's greatest passions: coding and memes. The project was, of course, a chatbot: a fun imaginary friend who sits in your chatroom of choice and loyally waits at your beck and call, delivering memes whenever you request them. In some cases, the bot would scrape the internet for freshly baked memes, but there were also plenty of instances where the desired memes should be more predictable, namely drawn from a predetermined subset of memes hosted on the cloud which could be updated dynamically. This is where Google Cloud Storage comes in.

Google Cloud Storage is an excellent alternative to S3 for any GCP fanboys out there. Google Cloud provides a dead-simple way of interacting with Cloud Storage via the google-cloud-storage Python SDK: a Python library I've found myself preferring over the clunkier Boto3 library.

We've actually touched on google-cloud-storage briefly when we walked through interacting with BigQuery programmatically, but there's enough functionality available in this library to justify a post in itself.

Getting Set Up

Setting up a Google Cloud bucket is simple enough to skip the details, but there are a couple of things worth mentioning. First on our list: we need to set our bucket's permissions.

Setting Bucket-level Permissions

Making buckets publicly accessible is a big no-no in the vast majority of cases; we should never make a bucket containing sensitive information public (unless you're a contractor for the US government and you decide to store the personal information of all US voters in a public S3 bucket - that's apparently okay). Since I'm working with memes which I've stolen from other sources, I don't mind this bucket being publicly accessible.

Bucket-level permissions aren't enabled on new buckets by default (new buckets abide by object-level permissions). Changing this can be a bit tricky to find at first: we need to click into our bucket of choice and note the prompt at the top of the screen:

New buckets should prompt you for bucket-level permissions.

Clicking "enable" will open a side panel on the right-hand side of your screen. To enable publicly viewable files, we need to attach the Storage Object Viewer role to a keyword called allUsers (allUsers is a reserved type of "member" meaning "everybody in the entire world").
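If you'd rather not click through the console, the same binding can be applied from Python. This is a minimal sketch rather than the method used in this tutorial: grant_public_read is a made-up helper name, and it assumes you pass it a Bucket object from google-cloud-storage.

```python
PUBLIC_MEMBER = "allUsers"                   # reserved member: anyone on the internet
VIEWER_ROLE = "roles/storage.objectViewer"   # the "Storage Object Viewer" role


def grant_public_read(bucket):
    """Attach the Storage Object Viewer role to allUsers on a bucket.

    `bucket` is assumed to be a google.cloud.storage Bucket, for example
    storage.Client().bucket("my-bucket-name").
    """
    # Version 3 policies expose their bindings as a list of dicts.
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append({"role": VIEWER_ROLE, "members": {PUBLIC_MEMBER}})
    bucket.set_iam_policy(policy)
    return policy
```

Only run something like this against buckets whose contents you are happy for the whole world to read.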

Finding Our Bucket Info

When we access our bucket programmatically, we'll need some information about our bucket, like our bucket's URL (we need this to know where items in our bucket will be stored). General information about our bucket can be found under the "overview" tab; take note of it:

To access the files we modify in our bucket, you'll need to know the URL.
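For a publicly readable bucket, object URLs follow a predictable pattern, so they can be built from the bucket name and the object's path. Here is a small sketch assuming the standard storage.googleapis.com endpoint; public_url is a hypothetical helper, not something defined elsewhere in this post.

```python
from urllib.parse import quote


def public_url(bucket_name, object_name):
    """Build the public HTTPS URL for an object in a publicly readable bucket."""
    # Percent-encode the object name, but keep "/" so folder paths stay readable.
    encoded_name = quote(object_name, safe="/~")
    return f"https://storage.googleapis.com/{bucket_name}/{encoded_name}"


print(public_url("hackers-data", "storage-tutorial/peas.jpg"))
```

The URL only resolves for anonymous visitors if the bucket (or the object) has been made public.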

Generating a Service Key

Finally, we need to generate a JSON service key to grant permissions to our script. Check out the credentials page in your GCP console and download a JSON file containing your creds. Please remember not to commit this anywhere.
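One common way to keep the key out of your repository is to store it outside the project and point the GOOGLE_APPLICATION_CREDENTIALS environment variable at it. The sketch below assumes that setup; find_service_key and make_client are illustrative names, and make_client needs google-cloud-storage installed.

```python
import os
from pathlib import Path


def find_service_key(env=None):
    """Return the path of the JSON key named by GOOGLE_APPLICATION_CREDENTIALS."""
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(key_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No service key found at {path}")
    return path


def make_client(env=None):
    """Create a storage client from the service key file."""
    # Imported lazily so find_service_key() works without the library installed.
    from google.cloud import storage

    return storage.Client.from_service_account_json(str(find_service_key(env)))
```

When the variable is set, a plain storage.Client() should pick up the same key on its own; the explicit version just fails earlier and with a clearer message.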

Configuring our Script

Let's start coding, shall we? Make sure the google-cloud-storage library is installed on your machine with pip3 install google-cloud-storage.

I'm going to set up our project with a file containing relevant information we'll need to work with:

"""Google Cloud Storage Configuration."""
from os import environ

# Google Cloud Storage
bucketName = environ.get('GCP_BUCKET_NAME')
bucketFolder = environ.get('GCP_BUCKET_FOLDER_NAME')

localFolder = environ.get('LOCAL_FOLDER')

  • bucketName is our bucket’s given name. The google-cloud-storage library finds a bucket by looking for one with a matching name in your GCP account.
  • bucketFolder is a folder within our bucket that we’ll be working with.
  • localFolder is where I’m keeping a bunch of local files to test uploading and downloading to GCP.
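Because environ.get() quietly returns None for a missing variable, a typo in a variable name only surfaces later as a confusing error. One possible guard, sketched here with a hypothetical require_env helper, is to fail as soon as the config is loaded:

```python
from os import environ


def require_env(name, env=environ):
    """Return the value of an environment variable, or fail loudly if it is unset."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# In config.py this could replace the plain environ.get() calls, e.g.:
# bucketName = require_env('GCP_BUCKET_NAME')
```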

With that done, we can start our script by importing these values:

"""Programmatically interact with a Google Cloud Storage bucket."""
from google.cloud import storage
from config import bucketName, localFolder, bucketFolder

Managing Files in a GCP Bucket

Before we do anything, we need to create an object representing our bucket. I’m creating a global variable named bucket. This is created by calling the get_bucket() method on our storage client and passing the name of our bucket:

"""Programmatically interact with a Google Cloud Storage bucket."""
from google.cloud import storage
from config import bucketName, localFolder, bucketFolder

# Create a client and fetch the bucket once, so every function can reuse it.
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucketName)

To demonstrate how to interact with Google Cloud Storage, we’re going to create 5 different functions to handle common tasks: uploading, downloading, listing, deleting, and renaming files.

Upload Files

Our first function will look at a local folder on our machine and upload the contents of that folder:

from os import listdir
from os.path import isfile, join

def upload_files(bucketName):
    """Upload files to GCP bucket."""
    files = [f for f in listdir(localFolder) if isfile(join(localFolder, f))]
    for file in files:
        # Both folder names are assumed to end with a trailing slash.
        localFile = localFolder + file
        blob = bucket.blob(bucketFolder + file)
        blob.upload_from_filename(localFile)
    return f'Uploaded {files} to "{bucketName}" bucket.'

The first thing we do is fetch all the files we have living in our local folder using listdir(). We verify that each item we fetch is a file (not a folder) by using isfile().

We then loop through each file in our array of files. We set the desired destination of each file using bucket.blob(), which accepts the desired file path where our file will live once uploaded to GCP. We then upload the file with blob.upload_from_filename(localFile):

Uploaded ['sample_csv.csv', 'sample_text.txt', 'peas.jpg', 'sample_image.jpg'] to "hackers-data" bucket.
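The upload_files() function above leans on module-level globals, which makes it awkward to reuse or test. As an alternative sketch (not the version used in the rest of this post), the bucket and both folders can be passed in explicitly; upload_folder is a hypothetical name, and bucket_folder is assumed to be non-empty.

```python
from pathlib import Path


def upload_folder(bucket, local_folder, bucket_folder):
    """Upload every regular file in local_folder and return the object names used."""
    uploaded = []
    for path in sorted(Path(local_folder).iterdir()):
        if not path.is_file():
            continue  # skip sub-folders, as isfile() does in upload_files()
        # Works whether or not bucket_folder was given with a trailing slash.
        object_name = f"{bucket_folder.rstrip('/')}/{path.name}"
        bucket.blob(object_name).upload_from_filename(str(path))
        uploaded.append(object_name)
    return uploaded
```

Calling upload_folder(bucket, localFolder, bucketFolder) with the objects defined earlier should behave much like upload_files(), but returns the full object paths instead of a message.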

Listing Files

Knowing which files exist in our bucket is obviously important:

def list_files(bucketName):
    """List all files in GCP bucket."""
    files = bucket.list_blobs(prefix=bucketFolder)
    # Keep only names containing a dot, which filters out folder placeholders.
    fileList = [file.name for file in files if '.' in file.name]
    return fileList

list_blobs() gets us the files in our bucket. By default this returns every file; we can restrict the listing to the files in a single folder by passing the prefix argument.

['storage-tutorial/sample_csv.csv', 'storage-tutorial/sample_image.jpg', 'storage-tutorial/sample_text.txt', 'storage-tutorial/test.csv']

Looks like test.csv lives in our bucket, but not in our local folder!
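Spotting differences like this by eye gets old quickly. A rough sketch of doing the comparison in code, using the two sample outputs from this post as input (compare_listings is a made-up helper):

```python
from posixpath import basename


def compare_listings(local_files, bucket_objects):
    """Report file names that exist on only one side."""
    local_names = set(local_files)
    # Bucket object names include the folder prefix, so strip it before comparing.
    remote_names = {basename(name) for name in bucket_objects}
    return {
        "only_local": sorted(local_names - remote_names),
        "only_in_bucket": sorted(remote_names - local_names),
    }


local = ['sample_csv.csv', 'sample_text.txt', 'peas.jpg', 'sample_image.jpg']
remote = [
    'storage-tutorial/sample_csv.csv',
    'storage-tutorial/sample_image.jpg',
    'storage-tutorial/sample_text.txt',
    'storage-tutorial/test.csv',
]
print(compare_listings(local, remote))
# {'only_local': ['peas.jpg'], 'only_in_bucket': ['test.csv']}
```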

Downloading Files

A feature of the chatbot I built was to fetch a randomized meme per meme keyword. Let’s see how we’d accomplish this:

from random import randint

def download_random_file(bucketName, bucketFolder, localFolder):
    """Download random file from GCP bucket."""
    fileList = list_files(bucketName)
    rand = randint(0, len(fileList) - 1)
    blob = bucket.blob(fileList[rand])
    # blob.name is the full path in the bucket; keep only the file name.
    fileName = blob.name.split('/')[-1]
    blob.download_to_filename(localFolder + fileName)
    return f'{fileName} downloaded from bucket.'

We leverage the list_files() function we already created to get a list of items in our bucket. We then select a random item by generating a random index using randint.

It’s important to note here that .blob() returns a “blob” object as opposed to a string (inspecting our blob with type() results in <class 'google.cloud.storage.blob.Blob'>). This is why we see blob.name come into play when setting our blob’s filename.

Finally, we download our target file with download_to_filename().
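The function above raises a fairly cryptic error if the folder turns out to be empty, since randint(0, -1) is not a valid range. Here is a hedged variation that uses random.choice and checks for that case; download_random_blob is a hypothetical name, and the trailing-slash filter assumes folder placeholders are stored as objects whose names end in "/".

```python
import random
from pathlib import Path


def download_random_blob(bucket, prefix, local_folder, rng=random):
    """Download one randomly chosen object under prefix and return its local path."""
    names = [
        blob.name
        for blob in bucket.list_blobs(prefix=prefix)
        if not blob.name.endswith("/")  # skip folder placeholder objects
    ]
    if not names:
        raise LookupError(f"No files found under prefix {prefix!r}")
    name = rng.choice(names)
    destination = Path(local_folder) / name.split("/")[-1]
    bucket.blob(name).download_to_filename(str(destination))
    return destination
```

Passing a seeded random.Random instance as rng makes the choice repeatable, which is handy when debugging a chatbot that insists on serving the same meme.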

Deleting Files

Deleting a file is as simple as .delete_blob:

def delete_file(bucketName, bucketFolder, fileName):
    """Delete file from GCP bucket."""
    bucket.delete_blob(bucketFolder + fileName)
    return f'{fileName} deleted from bucket.'
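As far as I can tell, delete_blob() raises a "not found" error when the object doesn't exist, which may or may not be what you want. A small sketch of a more forgiving variant, using the blob's own exists() and delete() methods (delete_if_present is an illustrative name):

```python
def delete_if_present(bucket, object_name):
    """Delete an object if it exists; return True if something was deleted."""
    blob = bucket.blob(object_name)
    if not blob.exists():
        return False  # nothing to do
    blob.delete()
    return True
```

Note that the check and the delete are two separate requests, so this is not safe against another process deleting the same file in between.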

Renaming Files

To rename a file, we pass a blob object to rename_blob() and set the new name via the new_name argument:

def rename_file(bucketName, bucketFolder, fileName, newFileName):
    """Rename file in GCP bucket."""
    blob = bucket.blob(bucketFolder + fileName)
    bucket.rename_blob(blob, new_name=newFileName)
    return f'{fileName} renamed to {newFileName}.'
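One thing to watch: new_name is the object's full path in the bucket, so a bare file name would place the renamed file at the top level of the bucket instead of in its folder. If you want the file to stay where it is, something like this sketch may help (rename_in_folder is a hypothetical helper):

```python
def rename_in_folder(bucket, folder, old_name, new_name):
    """Rename an object while keeping it inside the same bucket folder."""
    prefix = folder.rstrip("/") + "/"
    blob = bucket.blob(prefix + old_name)
    # rename_blob() returns the blob under its new name.
    return bucket.rename_blob(blob, new_name=prefix + new_name)
```

For example, rename_in_folder(bucket, 'storage-tutorial', 'peas.jpg', 'carrots.jpg') would aim to produce storage-tutorial/carrots.jpg.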

Managing Buckets

We can also use google-cloud-storage to interact with entire buckets:

  • create_bucket('my_bucket_name') creates a new bucket with the given name.
  • bucket.delete() deletes an existing bucket.
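These two calls pair up nicely for throwaway buckets, for example in tests. A minimal sketch, assuming client is a storage.Client and using a made-up temporary_bucket helper:

```python
from contextlib import contextmanager


@contextmanager
def temporary_bucket(client, name):
    """Create a bucket, hand it to the caller, and delete it afterwards."""
    bucket = client.create_bucket(name)
    try:
        yield bucket
    finally:
        # Runs even if the body raises; expect an error here if the
        # bucket still contains objects when it is deleted.
        bucket.delete()
```

Usage would look like: with temporary_bucket(storage_client, 'my_bucket_name') as scratch: followed by whatever work needs the bucket. Remember that bucket names are global, so the name has to be one nobody else is using.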

There are also ways to programmatically do things like access details about a bucket, or delete all the objects inside a bucket. Unfortunately, these actions are only supported by the REST API. I don’t find these actions particularly useful anyway, so whatever.

The source code for this tutorial can be found here. That’s all, folks!

Originally published by Todd Birchard at

