python with docker async def SyntaxError: invalid syntax

I have a Python app in a virtualenv that leverages async function calls.

Recently I tried moving my app to a Docker container, with the following Dockerfile:

FROM python:3.7
ENV PYTHONUNBUFFERED 1
ADD . /code
WORKDIR /code

RUN pip install -r requirements.txt

CMD ["python3", "app.py"]

docker-compose.yml:

version: '3'
services:
  app:
    build: .
    volumes:
      - .:/code
    ports:
      - "8080:80"

Unfortunately, when trying to run my container with docker-compose up I get the following error:

Starting app_1 ... done
Attaching to app_1
app_1 | File "app.py", line 31
app_1 | async def handle_DATA(self, server, session, envelope):
app_1 | ^
app_1 | SyntaxError: invalid syntax
app_1 exited with code 1

What might be the cause of this? As far as I know, async has been built into Python since 3.5.

Kubernetes Fundamentals - Learn Kubernetes from the Beginning

Kubernetes is about orchestrating containerized apps. Docker is great for your first few containers. As soon as you need to run on multiple machines and need to scale up/down and distribute the load and so on, you need an orchestrator - you need Kubernetes.

This is the first part of a series of articles on Kubernetes, cause this topic is BIG!

  • Part I - From the beginning, Basics, Deployment and Minikube: In this part, we cover why Kubernetes, some history and some basic concepts like deploying, Nodes, Pods.
  • Part II - Introducing Services and Labeling: In this part, we deepen our knowledge of Pods and Nodes. We also introduce Services and labeling using labels to query our artifacts.
  • Part III - Scaling: Here we cover how to scale our app
  • Part IV - Auto scaling: In this part we look at how to set up auto-scaling so we can handle sudden large increases of incoming requests
Kubernetes

So what do we know about Kubernetes?

It's an open-source system for automating deployment, scaling, and management of containerized applications

Let's start with the name. It's Greek for Helmsman, the person who steers the ship, which is why the logo looks like this, a steering wheel on a boat:

It's also called K8s: take Kubernetes and remove the 8 characters between the K and the s. Now you can impress your friends that you know why it's referred to as K8s.

Here is some more Jeopardy knowledge on its origin. Kubernetes was born out of systems called Borg and Omega. It was donated to the CNCF, the Cloud Native Computing Foundation, in 2014. It's written in Go/Golang.

If we see past all this trivia knowledge, it was built by Google as a response to their own experience handling a ton of containers. It's also Open Source and battle-tested to handle really large systems, like planet-scale large systems.

So the sales pitch is:

Run billions of containers a week, Kubernetes can scale without increasing your ops team

Sounds amazing right, billions of containers cause we are all Google size. No? :) Well even if you have something like 10-100 containers, it's for you.

In this part I hope to cover the following:

  • Why Kubernetes and Orchestration in General
  • Hello world: Minikube basics, talking through Minikube, simple deploy example
  • Cluster, Nodes and basic commands
  • Deployments, what it is and deploying an app
  • Pods and Nodes, explain concepts and troubleshooting
Part I - From the beginning, Basics, Deployment and Minikube

Why Orchestration

Well, it all started with containers. Containers gave us the ability to create repeatable environments so dev, staging, and prod all looked and functioned the same way. We got predictability and they were also light-weight as they drew resources from the host operating system. Such a great breakthrough for Developers and Ops, but the Container API is really only good for managing a few containers at a time. Larger systems might consist of 100s or 1000+ containers that need to be managed as well so we can do things like scheduling, load balancing, distribution and more.

At this point, we need orchestration: the ability for a system to handle all these container instances. This is where Kubernetes comes in.

Getting started

Ok ok, let's say I buy into all of this, how do I get started?

Impatient, ey? Sure, let's start doing something practical with Minikube.

Ok, sounds good I'm a coder, I like practical stuff. What is Minikube?

Minikube is a tool that lets us run Kubernetes locally

Oh, sweet, millions of containers on my little machine?

Well, no, let's start with a few and learn Kubernetes basics while at it.

Installation

To install Minikube, let's go to this installation page.

It's just a few short steps in which we install:

  • a Hypervisor
  • Kubectl (Kube control tool)
  • Minikube
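
On macOS with Homebrew, for example, the install can look like this (an illustration only; follow the installation page for your platform and hypervisor):

brew install kubectl
brew install minikube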

Run

Get that thing up and running by typing:

minikube start

It should look something like this:

You can also ensure that kubectl has been correctly installed and is running:

kubectl version

Should give you something like this in response:

Ok, now we are ready to learn Kubernetes.

Learning kubectl and basic concepts

Let's learn Kubernetes by getting to know kubectl, a command-line program that lets us interact with our Cluster and lets us deploy and manage applications on said Cluster.

The word Cluster just means a group of similar things but in the context of Kubernetes, it means a Master and multiple worker machines called Nodes. Nodes were historically called Minions, but not so anymore.

The master decides what will run on the Nodes, which includes things like scheduled workloads or containerized apps. Which brings us to our next command:

kubectl get nodes

This should give us a result like this:

This tells us what Nodes we have available to do work.

Next up let's try to run our first app on Kubernetes with the run command like so:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

This should give us a response like so:

Next up let's check that everything is up and running with the command:

kubectl get deployments

This shows the following in the terminal:

In putting our app on the Cluster, by invoking the run command, Kubernetes performed a few things behind the scenes, it:

  • searched for a suitable node where an instance of the application could be run, there was only one node so it got chosen
  • scheduled the application to run on that Node
  • configured the cluster to reschedule the instance on a new Node when needed

Next up we are going to introduce the concept Pod, so what is a Pod?

A Pod is the smallest deployable unit and consists of one or many containers, for example, Docker containers. That's all we are going to say about Pods at the moment but if you really really want to know more have a read here

The reason for mentioning Pods at this point is that our container and app are placed inside of a Pod. Furthermore, Pods run in a private, isolated network that, although visible from other Pods and services, cannot be accessed from outside the network. This means we can't reach our app with, say, a curl command.

We can change that though. There is more than one way to expose our application to the outside world; for now, however, we will use a proxy.

Now open up a 2nd terminal window and type:

kubectl proxy

This exposes the Kubernetes API over HTTP so that we can query it with HTTP requests. The result should look like:

Instead of typing kubectl version we can now type curl http://localhost:8001/version and get the same results:

The API Server inside of Kubernetes has created an endpoint for each pod by its pod name. So the next step is to find out the pod name:

kubectl get pods

This will list all the pods you have, it should just be one pod at this point and look something like this:

Then you can just save that down to a variable like so:
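
In a bash shell that can look like this (using a go-template, the same trick we use for the service port in Part II):

export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}')
echo Name of the Pod: $POD_NAME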

Lastly, we can now do an HTTP call to learn more about our pod:

curl http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME

This will give us a long JSON response back (I trimmed it a bit but it goes on and on...)

Maybe that's not super interesting for us as app developers. We want to know how our app is doing. Best way to know that is looking at the logs. Let's do that with this command:

kubectl logs $POD_NAME

As you can see below we now get logs from our app:

Now that we know the Pod's name we can do all sorts of things like checking its environment variables or even stepping inside the container and looking at the content.

kubectl exec $POD_NAME env

This yields the following result:

Now let's step inside the container:

kubectl exec -ti $POD_NAME bash

We are inside! This means we can see what the source code looks like even:

cat server.js

Inside of our container, we can now reach the running app by typing:

curl http://localhost:8080

Summary Part I

This is where we will stop for now.
What did we actually learn?

  • Kubernetes, its origin and what it is
  • Orchestration and why you will soon need it
  • Concepts like Master, Nodes and Pods
  • Minikube, kubectl and how to deploy an image onto our Cluster
Part II - Introducing Services and Labeling

In this part we will cover the following:

  • Deepen our knowledge on Pods and Nodes
  • Introduce Services and labeling
  • Perform an exercise, involving setting labels on Pods and use labels to query our artifacts
Concepts revisited

When we create a Deployment on Kubernetes, that Deployment creates Pods with containers inside them. So Pods are tied to Nodes and will continue to exist until terminated or deleted. Let's try to educate ourselves a bit more on Pods, Nodes and let's also introduce a new topic Services.

Pods

Pods are the atomic unit on the Kubernetes platform, i.e. the smallest possible deployable unit

We've stated the above before but it's worth mentioning again.

What else is there to know?

A Pod is an abstraction that represents a group of one or more containers, for example, Docker or rkt, and some shared resources for those containers. Those resources include:

  • Shared storage, as Volumes
  • Networking, as a unique cluster IP address
  • Information about how to run each container, such as the container image version or specific ports to use

A Pod can have more than one container. If it does contain more than one container, it is so that the other containers can support the primary application.
Typical examples of helper applications are data pullers, data pushers, and proxies. You can read more on that use case here

The containers in a Pod share an IP address and port space and are:

  • Always co-located
  • Co-scheduled

Let me show you an image to make it easier to visualize:

As we can see above a Pod can have a lot of different artifacts in them that are able to communicate and support the app in some way.

Nodes

A Pod always runs on a Node

So a Node is the Pod's parent?

Yes.

A Node is a worker machine and may be either a virtual or a physical machine, depending on the cluster

Each Node is managed by the Master. A Node can have multiple pods.

So it's a one-to-many relationship

The Kubernetes master automatically handles scheduling the pods across the Nodes in the cluster

Every Kubernetes Node runs at least a:

  • Kubelet, which is responsible for the Pod spec and talks to the CRI interface

  • Kube-proxy, which is the main interface for communication between Nodes

  • A container runtime (like Docker or rkt), responsible for pulling the container image from a registry, unpacking the container, and running the application.

Ok so a Node contains a Kubelet and container runtime and one to many Pods. I think I got it.

Let's show an image to make this info stick, cause it's quite important that we know what goes on, at least at a high level:

Services

Pods are mortal, they can die. Pods, in fact, have a lifecycle.

When a worker node dies, the Pods running on the Node are also lost.

What happens to our apps? :(

You might think they and their data are lost, but not so. The whole point with Kubernetes is to not let that happen. We normally deploy something like a ReplicaSet.

A ReplicaSet, what do you mean?

A ReplicaSet is a high-level artifact that can drive the cluster back to desired state via the creation of new Pods to keep your application running.

Ok, so if a Pod goes down the ReplicaSet just creates a new Pod (or Pods) in its place?

Yes, exactly that. If you focus on defining a desired state the rest is up to Kubernetes.

Phew sounds really great then.

This concept of desired state is a very important one. You need to specify how many containers you want of each kind, at all times.

Oh so 4 database containers, 3 services etc?

Yes exactly.

So you don't have to care about the details, just tell Kubernetes what state you want and it does the rest. If something goes down, Kubernetes ensures it comes back up again to the desired state.

Each Pod in a Kubernetes cluster has a unique IP address, even Pods on the same Node, so there needs to be a way of automatically reconciling changes among Pods so that your applications continue to function.

Ok?

Yea, think of it like this: if a Pod containing your app goes down and another Pod is created in its place running your app, users should still be able to use your app after that.

Ok I got it. Makes me think...

The motivation for a Service

You should never refer to a Pod by its IP address, just think what happens when a Pod goes down and comes back up again but this time with a different IP. It is for that reason a Service exists.

A Service in Kubernetes is an abstraction which defines a logical set of Pods and a policy by which to access them.

Makes me think of routers and subnets

Yea I guess you can say there is a resemblance in there somewhere.

Services enable a loose coupling between dependent Pods and are defined using YAML or JSON file, just like all Kubernetes objects.

That's handy, just JSON and YAML :)
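
To make that concrete, here is a minimal sketch of what a Service definition can look like in YAML (the names and ports here are illustrative, not from this series):

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080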

Services and Labels

The set of Pods targeted by a Service is usually determined by a LabelSelector.

Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. We can expose them through a proxy though as we showed in part I.

Wait, go back a second here, you said LabelSelector. I wasn't quite following?

Remember how we couldn't refer to Pods by IP, cause Pods might go down and a new Pod could come back in its place?

Yes

Well, labels are the answer to how Services and Pods are able to communicate. This is what we mean by loose coupling. By applying labels, for example frontend, backend, release and so on, to Pods, we are able to refer to Pods by their logical name rather than their specifics, i.e. IP address.

Oh I get it, so it's a high-level domain language

Mm, kind of.

Services and Traffic

Services allow your applications to receive traffic.

Services can be exposed in different ways by specifying a type in ServiceSpec, service specification.

  • ClusterIP (default) - Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
  • NodePort - Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using <NodeIP>:<NodePort>. Superset of ClusterIP.
  • LoadBalancer - Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
  • ExternalName - Exposes the Service using an arbitrary name (specified by externalName in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of kube-dns.

Ok I think I get it. Ensure I'm speaking to a Service externally instead of to specific Pods. Depending on what I expose the Service as, that leads to different behavior?

Yea that's correct.

You said something about labels though, how do we create and apply them to Pods?

Yea lets talk about that next.

Labels

As we just mentioned, Services are the abstraction that allows pods to die and replicate in Kubernetes without impacting your application.

Now, Services match a set of Pods using labels and selectors, which allows us to operate on Pods as a group.

Labels are key/value pairs attached to objects and can be used in any number of ways:

  • Designate objects for development, test, and production
  • Embed version tags
  • Classify an object using tags

Labels can be attached to objects at creation time or later on. They can be modified at any time.

Lab - Fun with Labels and kubectl

It's a good idea to have read the first part of this series where we create a deployment. If you haven't, you need to first create a deployment like so:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

Now we should be good to go.

Ok. I know you are probably all tired of the theory by now.

I bet you are just itching to learn more hands on Kubernetes with kubectl.

Well, the time for that has come :). We will do two things:

  1. Create a Service and learn how we can expose our app using said Service
  2. Learn about Labeling and how we can improve our querying game by having appropriate labels on our artifacts.

Let's create a new service.

We will get acquainted with the expose command.

Let's check for existing pods:

kubectl get pods

Next let's see what services we have:

kubectl get services

Next lets create a Service like so:

kubectl expose deployment/kubernetes-first-app --type="NodePort" --port 8080

As you can see above we are just targeting one of our deployments, kubernetes-first-app, referring to it as [type]/[deployment name], with the type being deployment.

We expose it as a service of type NodePort and finally, we choose to expose it at port 8080.

Now run kubectl get services again and see the results:

As you can see we now have two services in use, our basic kubernetes service and our newly created kubernetes-first-app.

Next up we need to grab the port of our service and assign that to a variable:

export NODE_PORT=$(kubectl get services/kubernetes-first-app -o go-template='{{(index .spec.ports 0).nodePort}}')
echo NODE_PORT=$NODE_PORT

We now have our port stored in the environment variable NODE_PORT and we are ready to start communicating with our service like so:

curl $(minikube ip):$NODE_PORT

Which leads to the following output:

Creating and applying Labels

When we created our deployment and our Pod, it was automatically assigned a label.

By typing

kubectl describe deployment

we can see the name of said label.
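
Another quick way to see the labels on your Pods is the --show-labels flag:

kubectl get pods --show-labels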

Next up we can query the pods by that same label

kubectl get pods -l run=kubernetes-first-app

Above we are using -l to query for a specific label, with run=kubernetes-first-app as the label to match. This gives us the following result:

You can do a similar query to your services:

kubectl get services -l run=kubernetes-first-app

That just shows that you can query on different levels, for specific Pods or Services that have Pods with that label.

Next up we will look at how to change the label

First let's get the name of the pod, like so:

POD_NAME=kubernetes-first-app-669789f4f8-6glpx

Above I'm just assigning what my Pod is called to a variable POD_NAME. Check what your Pod is called with kubectl get pods.

Then we can add/apply the new label like so:

kubectl label pod $POD_NAME app=v1

Verify that the new label has been set, like so:

kubectl describe pod

or

kubectl describe pods $POD_NAME

As you can see from the result our new label app=v1 has been appended to existing labels.

Now we can query like so:

kubectl get pods -l app=v1

That's pretty much how labeling works, how to get available labels, apply them and use them in a query. Ensure to give them a descriptive name like an app version, a certain environment or a name like frontend or backend, something that makes sense to your situation.

Clean up

Ok, so we created a service. We should learn how to clean up after ourselves. Run the following command to remove our service:

kubectl delete service -l run=kubernetes-first-app

Verify the service is no longer there with:

kubectl get services

also, ensure our exposed IP and port can no longer be reached:

curl $(minikube ip):$NODE_PORT

Just because the service is gone doesn't mean the app is gone. The app should still be reachable on:

kubectl exec -ti $POD_NAME curl localhost:8080

Summary Part II

So what did we learn? We learned a bit more on Pods and Nodes. Furthermore, we learned that we shouldn't speak directly to Pods but rather use a high-level abstraction such as Services. Services use labels as a way to define a domain language and apply those to different Pods.

Ok, so we understand a bit more on Kubernetes and how different concepts relate. We mentioned something called desired state a number of times but we didn't go into detail on how to set such a state. That's our next part in this series where we will cover how to set the desired state and how Kubernetes maintains it, so stay tuned.

Part III - Scaling

This third part aims to show how you scale your application. We can easily set the number of Replicas we want of a certain application and let Kubernetes figure out how to do that. This is us defining a so-called desired state.

When traffic increases, we will need to scale the application to keep up with user demand. We've talked about deployments and services, now let's talk scaling.

What does scaling mean in the context of Kubernetes?

We get more Pods. More Pods that are scheduled to nodes.

Now it's time to talk about desired state again, that we mentioned in previous parts.

This is where we relinquish control to Kubernetes. All we need to do is tell Kubernetes how many Pods we want and Kubernetes does the rest.

So we tell Kubernetes about the number of Pods we want, what does that mean? What does Kubernetes do for us?

It means we get multiple instances of our application. It also means traffic is being distributed to all of our Pods, i.e. load balancing.

Furthermore, Kubernetes, or more specifically, services within Kubernetes will monitor which Pods are available and send traffic to those Pods.

Scaling demo Lab

If you haven't followed the first two parts I do recommend you go back and have a read. What you need for the following to work is at least a deployment. So if you haven't created one, here is how:

kubectl run kubernetes-first-app --image=gcr.io/google-samples/kubernetes-bootcamp:v1 --port=8080

Let's have a look at our deployments:

kubectl get deployments

Let's look closer at the response we get:

We have three pieces of information that are important to us. First, we have the READY column, which we should read as CURRENT STATE/DESIRED STATE. Next up is the UP-TO-DATE column, which shows the number of replicas that were updated to match the desired state.
Lastly, we have the AVAILABLE column that shows how many replicas we have available to do work.

Let's scale

Now, let's do some scaling. For that we will use the scale command like so:

kubectl scale deployments/kubernetes-first-app --replicas=4 

As we can see above, the number of replicas was increased to 4 and Kubernetes is thereby ready to load balance any incoming requests.

Let's have a look at our Pods next:
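
kubectl get pods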

When we asked for 4 replicas we got 4 Pods.

We can see that this scaling operation took place by using the describe command, like so:

kubectl describe deployments/kubernetes-first-app

In the above image, we are given quite a lot of information on our Replicas for example, but there is some other information in there that we will explain later on.

Does it load balance?

The whole point of the scaling was so that we could balance the load of incoming requests. That means that the same Pod would not handle all the requests, but that different Pods would be hit.
We can easily try this out, now that we have scaled our app to contain 4 replicas of itself.

So far we used the describe command to describe the deployment, but we can also use it to describe our service and find its IP and port. Once we have the IP and port we can then hit it with different HTTP requests.

kubectl describe services/kubernetes-first-app

Especially look at the NodePort and the Endpoints. NodePort is the port value that we want to hit with an HTTP request.

Now we will actually invoke the cURL command and ensure that it hits a different Pod each time and thereby prove our load balancing is working. Let's do the following:

NODE_PORT=30450
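
The port value above is just an example from this walkthrough; yours will differ. You can also grab it with the same go-template trick from Part II:

NODE_PORT=$(kubectl get services/kubernetes-first-app -o go-template='{{(index .spec.ports 0).nodePort}}')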

Next up the cURL call:

curl $(minikube ip):$NODE_PORT
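
To do the call 4 times in one go you can use a simple shell loop (assuming bash):

for i in {1..4}; do curl $(minikube ip):$NODE_PORT; done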

As you can see above we are doing the call 4 times. Judging by the output and the name of the instance we see that we are hitting a different Pod for each request. Thereby we see that the load balancing is working.

Scaling down

So far we have scaled up. We managed to go from one Pod to 4 Pods thanks to the scale command. We can use the same command to scale down, like so:

kubectl scale deployments/kubernetes-first-app --replicas=2 

Now if we are really quick with the next command we can see how the Pods are being removed as Kubernetes adjusts to the desired state.

2 out of 4 Pods are saying Terminating as only 2 Pods are needed to maintain the new desired state.

Running our command again we see that only 2 Pods remain and thereby our new desired state has been reached:

We can also look at our deployment to see that our scale instruction has been parsed correctly:

Self-healing

Self-healing is Kubernetes' way of ensuring that the desired state is maintained. Pods don't self-heal, cause Pods can die. What happens is that a new Pod appears in its place, thanks to Kubernetes.

So how do we test this?

Glad you asked, we can delete a Pod and see what happens. So how do we do that? We use the delete command. We need to know the name of our Pod though so we need to call get pods for that. So let's start with that:

kubectl get pods

Then let's pick one of our two Pods, kubernetes-first-app-669789f4f8-6glpx, and assign it to a variable:

POD_NAME=kubernetes-first-app-669789f4f8-6glpx

Now remove it:

kubectl delete pods $POD_NAME

Let's be quick about it and check our Pod status with get pods. It should say Terminating like so:

Wait some time and then echo out our variable $POD_NAME followed by get pods. That should give you a result similar to the below.

So what does the above image tell us? It tells us that the Pod we deleted is truly deleted but it also tells us that the desired state of two replicas has been achieved by spinning up a new Pod. What we are seeing is *self-healing* at work.

Different ways to scale

Ok, we looked at a way to scale by explicitly saying how many replicas we want of a certain deployment. Sometimes, however, we might want a different way to scale, namely auto-scaling. Auto-scaling is about you not having to set the exact number of replicas you want but rather rely on Kubernetes to create the number of replicas it thinks it needs. So how would Kubernetes know that? Well, it can look at more than one thing but a common metric is CPU utilization. So let's say you have a booking site and suddenly someone releases Bruce Springsteen tickets; you are likely to want to rely on auto-scaling, cause the next day, when the tickets are all sold out, you want the number of Pods to go back to normal, and you wouldn't want to do this manually.

Auto-scaling is a topic I plan to cover more in detail in a future article so if you are really curious how that is done I recommend you have a look here

Summary Part III

Ok. So we did it. We managed to scale an app by creating replicas of it. It wasn't so hard to accomplish. We showed how we only needed to provide Kubernetes with a desired state and it would do its utmost to preserve said state, also called *self-healing*. Furthermore, we mentioned that there was another way to scale, namely auto-scaling, but decided to leave that topic for another article. Hopefully, you are now more in awe of how amazing Kubernetes is and how easy it is to scale your app.

Part IV - Auto scaling

In this article, we will cover the following:

  • Why auto scaling, we will discuss different scenarios in which it makes sense to rely on auto scaling over defining it statically like we do with desired state
  • How, let's talk about Horizontal Auto Scaling, the concept/feature that allows us to scale in an elastic way.
  • Lab - let's scale, we will look at how to actually set this up in kubectl and simulate a ton of incoming requests. We will then inspect the results and see that Kubernetes acts the way we think
Why

So in our last part, we talked about desired state. That's an OK strategy until something unforeseen happens and suddenly you get a great influx of traffic. This is likely to happen to businesses such as e-commerce around a big sale or a ticket vendor when you release tickets to a popular event.

Events like these are an anomaly which forces you to quickly scale up. The other side of the coin though is that at some point you need to scale down or you suddenly have overcapacity you might need to pay for. What you really want is for the scaling to act in an elastic way so it scales up when you need it to and scales down when there is less traffic.

How

Horizontal auto-scaling, what does it mean?

It's a concept in Kubernetes that can scale the number of Pods we need. It can do so on a replication controller, deployment or replica set. It usually looks at CPU utilization but can be made to look at other things by using something called custom metrics support, so it's customizable.

It consists of two parts: a resource and a controller. The controller checks utilization, or whatever metric you decided on, to ensure that the number of replicas matches your specification. If need be it spins up more Pods or removes them. The default is checking every 15 seconds but you can change that by looking at a flag called --horizontal-pod-autoscaler-sync-period.

The underlying algorithm that decides the number of replicas looks like this:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
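
To make the formula concrete: in the lab below we will see a CPU reading of 339% against a 50% target while running 1 replica, and ceil[1 * (339 / 50)] = ceil[6.78] = 7, which is exactly the number of replicas the autoscaler scales out to.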

Lab - let's scale

Ok, the first thing we need to do is to scale our deployment to use something other than desired state.

We have two things we need to specify when we do autoscaling:

  • min/max, we define a minimum and maximum in terms of how many Pods we want
  • CPU, in this version we set a certain CPU utilization percentage. When it goes above that it scales out as needed. Think of this one as an IF clause, if CPU value greater than the threshold, try to scale

Set up

Before we can attempt our scaling experiment we need to make sure we have the correct add-ons enabled. You can easily see what add-ons you have enabled by typing:

minikube addons list

If it looks like the above we are all good. Why am I saying that? Well, what we need, to be able to auto-scale, is that the heapster and metrics-server add-ons are enabled.

Heapster enables Container Cluster Monitoring and Performance Analysis.

Metrics server provide metrics via the resource metrics API. Horizontal Pod Autoscaler uses this API to collect metrics

We can easily enable them both with the following commands (we will need to for auto-scaling to show correct data):

minikube addons enable heapster

and

minikube addons enable metrics-server

We need to do one more thing, namely to enable Custom metrics, which we do by starting minikube with such a flag like so:

minikube start --extra-config kubelet.EnableCustomMetrics=true

Ok, now we are good to go.

Running the experiment

We need to do the following to run our experiment

  • Create a deployment
  • Apply autoscaling
  • Bombard the deployment with incoming requests
  • Watch the auto scaling and how it changes things

Create a deployment

kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80

Above we are creating a deployment php-apache and exposing it as a service on port 80. We can see that we are using the image k8s.gcr.io/hpa-example

It should tell us the following:

service/php-apache created
deployment.apps/php-apache created

Autoscaling

Next up we will use the command autoscale. We will use it like so:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

It should say something like:

horizontalpodautoscaler.autoscaling/php-apache autoscaled

Above we are applying auto-scaling on the deployment php-apache and as you can see we are applying both min/max and CPU-based auto scaling, which means we give a rule for how the auto scaling should happen:

If CPU load is >= 50% create a new Pod, but only maximum 10 Pods. If the load is low go back gradually to one Pod

Bombard with requests

Next step is to send a ton of requests against our deployment and see our auto-scaling doing its work. So how do we do that?

First off let's check the current status of our horizontal pod auto-scaler or hpa for short by typing:

kubectl get hpa

This should give us something like this:

The above shows us two pieces of information. The first is the TARGETS column which shows our CPU utilization, actual usage/trigger value. The next bit of interest is the column REPLICAS that shows us the number of copies, which is 1 at the moment.

For our next trick, open up a separate terminal tab. What we need to do is set things up so we can send a ton of requests.

Next up we create a container using this command:

kubectl run -i --tty load-generator --image=busybox /bin/sh

This should take us to a prompt within the container. This is followed by:

while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done

The command above should result in something looking like this.

This will just go on and on until you hit CTRL+C, but leave it be for now.

This throws a ton of requests at our deployment in a while true loop.

I thought while true loops were bad?

They are, but we are only going to run it for a minute so that the auto scaling can happen. Yes, the CPU fan will make some noise, but don't worry :)

Let this go on for a minute or so, then enter the following command into the first terminal tab (not the one running the requests), like so:

kubectl get hpa

It should now show something like this:

As you can see from the above, the TARGETS column looks different and now says 339%/50%, the current CPU load versus the trigger value, and REPLICAS is 7, which means it has gone from 1 to 7 replicas. So as you can see we have been bombarding it pretty hard.

Now go to the second terminal and hit CTRL+C or you will have a situation like this:

It will actually take a few minutes for Kubernetes to cool off and get the values back to normal. A first look at the Pods situation shows us the following:

kubectl get pods

As you can see we have 7 Pods up and running still but let's wait a minute or two and it should look like this:

Ok, now we are back to normal.

Summary Part IV

We did some great stuff in this article. We managed to set up auto-scaling, bombard it with requests, and all without frying our CPU, hopefully ;)

We also managed to learn some new Kubernetes commands while at it and got to see auto-scaling at work giving us new Pods based on our specification.

How to Set up an SMS Notification With Python

Hi everyone :) Today I am beginning a new series of posts specifically aimed at Python beginners. The concept is rather simple: I'll do a fun project, in as few lines of code as possible, and will try out as many new tools as possible.

For example, today we will learn to use the Twilio API, the Twitch API, and we'll see how to deploy the project on Heroku. I'll show you how you can have your own "Twitch Live" SMS notifier, in 30 lines of code, and for 12 cents a month.

Prerequisite: You only need to know how to run Python on your machine and some basic commands in git (commit & push). If you need help with these, I can recommend these 2 articles to you:

Python 3 Installation & Setup Guide

The Ultimate Git Command Tutorial for Beginners from Adrian Hajdin.

What you'll learn:

  • Twitch API
  • Twilio API
  • Deploying on Heroku
  • Setting up a scheduler on Heroku

What you will build:

The specifications are simple: we want to receive an SMS as soon as a specific Twitcher is live streaming. We want to know when this person is going live and when they leave streaming. We want this whole thing to run by itself, all day long.

We will split the project into 3 parts. First, we will see how to programmatically know if a particular Twitcher is online. Then we will see how to receive an SMS when this happens. We will finish by seeing how to make this piece of code run every X minutes, so we never miss another moment of our favorite streamer's life.

Is this Twitcher live?

To know if a Twitcher is live, we can do one of two things. The first is to go to the Twitcher's URL and try to see if the badge "Live" is there.

Screenshot of a Twitcher live streaming.

This process involves scraping and is not easily doable in Python in less than 20 or so lines of code. Twitch runs a lot of JS code and a simple requests.get() won't be enough.

For scraping to work, in this case, we would need to scrape this page inside Chrome to get the same content as what you see in the screenshot. This is doable, but it will take much more than 30 lines of code. If you'd like to learn more, don't hesitate to check my recent web scraping guide.

So instead of trying to scrape Twitch, we will use their API. For those unfamiliar with the term, an API is a programmatic interface that allows websites to expose their features and data to anyone, mainly developers. In Twitch's case, their API is exposed through HTTP, which means that we can get lots of information and do lots of things by just making a simple HTTP request.

Get your API key

To do this, you have to first create a Twitch API key. Many services enforce authentication for their APIs to ensure that no one abuses them or to restrict access to certain features by certain people.

Please follow these steps to get your API key:

  • Create a Twitch account
  • Now create a Twitch dev account -> "Signing up with Twitch" top right
  • Go to your "dashboard" once logged in
  • "Register your application"
  • Name -> Whatever, Oauth redirection URL -> http://localhost, Category -> Whatever

You should now see, at the bottom of your screen, your client-id. Keep this for later.

Is that Twitcher streaming now?

With your API key in hand, we can now query the Twitch API to have the information we want, so let's begin to code. The following snippet just consumes the Twitch API with the correct parameters and prints the response.

# requests is the go-to package in Python to make HTTP requests
# https://2.python-requests.org/en/master/
import requests

# This is one of the routes where Twitch exposes data,
# they have many more: https://dev.twitch.tv/docs
endpoint = "https://api.twitch.tv/helix/streams?"

# In order to authenticate we need to pass our API key through a header
headers = {"Client-ID": "<YOUR-CLIENT-ID>"}

# The previously set endpoint needs some parameters, here, the Twitcher we want to follow
# Disclaimer, I don't even know who this is, but he was the first one on Twitch to have a live stream so I could have nice examples
params = {"user_login": "Solary"}

# It is now time to make the actual request
response = requests.get(endpoint, params=params, headers=headers)
print(response.json())

The output should look like this:

{
   'data':[
      {
         'id':'35289543872',
         'user_id':'174955366',
         'user_name':'Solary',
         'game_id':'21779',
         'type':'live',
         'title':"Wakz duoQ w/ Tioo - GM 400LP - On récupère le chall après les -250LP d'inactivité !",
         'viewer_count':4073,
         'started_at':'2019-08-14T07:01:59Z',
         'language':'fr',
         'thumbnail_url':'https://static-cdn.jtvnw.net/previews-ttv/live_user_solary-{width}x{height}.jpg',
         'tag_ids':[
            '6f655045-9989-4ef7-8f85-1edcec42d648'
         ]
      }
   ],
   'pagination':{
      'cursor':'eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6MX19'
   }
}

This data format is called JSON and is easily readable. The data object is an array that contains all the currently active streams. The key type ensures that the stream is currently live. This key will be empty otherwise (in case of an error, for example).

So if we want to create a boolean variable in Python that stores whether the current user is streaming, all we have to append to our code is:

json_response = response.json()

# We get only streams
streams = json_response.get('data', [])

# We create a small function, (a lambda), that tests if a stream is live or not
is_active = lambda stream: stream.get('type') == 'live'
# We filter our array of streams with this function so we only keep streams that are active
streams_active = filter(is_active, streams)

# any returns True if streams_active has at least one element, else False
at_least_one_stream_active = any(streams_active)

print(at_least_one_stream_active)

At this point, at_least_one_stream_active is True when your favourite Twitcher is live.

Let's now see how to get notified by SMS.

Send me a text, NOW!

So to send a text to ourselves, we will use the Twilio API. Just go over there and create an account. When asked to confirm your phone number, please use the phone number you want to use in this project. This way you'll be able to use the $15 of free credit Twilio offers to new users. At around 1 cent a text, it should be enough for your bot to run for one year.

If you go on the console, you'll see your Account SID and your Auth Token, save them for later. Also click on the big red button "Get My Trial Number", follow the steps, and save this one for later too.

Sending a text with the Twilio Python API is very easy, as they provide a package that does the annoying stuff for you. Install the package with pip install twilio and just do:

from twilio.rest import Client

client = Client(<Your Account SID>, <Your Auth Token>)
client.messages.create(
    body='Test MSG', from_=<Your Trial Number>, to=<Your Real Number>)

And that is all you need to send yourself a text, amazing right?

Putting everything together

We will now put everything together, and shorten the code a bit so we manage to stay under 30 lines of Python.

import requests
from twilio.rest import Client

endpoint = "https://api.twitch.tv/helix/streams?"
headers = {"Client-ID": "<YOUR-CLIENT-ID>"}
params = {"user_login": "Solary"}
response = requests.get(endpoint, params=params, headers=headers)
json_response = response.json()
streams = json_response.get('data', [])
is_active = lambda stream: stream.get('type') == 'live'
streams_active = filter(is_active, streams)
at_least_one_stream_active = any(streams_active)
if at_least_one_stream_active:
    client = Client(<Your Account SID>, <Your Auth Token>)
    client.messages.create(body='LIVE !!!', from_=<Your Trial Number>, to=<Your Real Number>)

Avoiding double notifications

This snippet works great, but should that snippet run every minute on a server, then as soon as our favorite Twitcher goes live we would receive an SMS every minute.

We need a way to store the fact that we were already notified that our Twitcher is live and that we don't need to be notified anymore.

The good thing with the Twilio API is that it offers a way to retrieve our message history, so we just have to retrieve the last SMS we sent to see if we already sent a text notifying us that the twitcher is live.

Here is what we are going to do, in pseudocode:

if favorite_twitcher_live and last_sent_sms is not live_notification:
	send_live_notification()
if not favorite_twitcher_live and last_sent_sms is live_notification:
	send_live_is_over_notification()

This way we will receive a text as soon as the stream starts, as well as when it is over. And we won't get spammed - perfect, right? Let's code it:

# reusing our Twilio client
last_messages_sent = client.messages.list(limit=1)
last_message_id = last_messages_sent[0].sid
last_message_data = client.messages(last_message_id).fetch()
last_message_content = last_message_data.body

Let's now put everything together again:

import requests
from twilio.rest import Client

client = Client(<Your Account SID>, <Your Auth Token>)

endpoint = "https://api.twitch.tv/helix/streams?"
headers = {"Client-ID": "<YOUR-CLIENT-ID>"}
params = {"user_login": "Solary"}
response = requests.get(endpoint, params=params, headers=headers)
json_response = response.json()
streams = json_response.get('data', [])
is_active = lambda stream: stream.get('type') == 'live'
streams_active = filter(is_active, streams)
at_least_one_stream_active = any(streams_active)

last_messages_sent = client.messages.list(limit=1)
if last_messages_sent:
    last_message_id = last_messages_sent[0].sid
    last_message_data = client.messages(last_message_id).fetch()
    last_message_content = last_message_data.body
    online_notified = "LIVE" in last_message_content
    offline_notified = not online_notified
else:
    online_notified, offline_notified = False, False

if at_least_one_stream_active and not online_notified:
    client.messages.create(body='LIVE !!!', from_=<Your Trial Number>, to=<Your Real Number>)
if not at_least_one_stream_active and not offline_notified:
    client.messages.create(body='OFFLINE !!!', from_=<Your Trial Number>, to=<Your Real Number>)

And voilà!

You now have a snippet of code, in less than 30 lines of Python, that will send you a text as soon as your favourite Twitcher goes Online / Offline, and without spamming you.

We just now need a way to host and run this snippet every X minutes.

The quest for a host

To host and run this snippet we will use Heroku. Heroku is honestly one of the easiest ways to host an app on the web. The downside is that it is really expensive compared to other solutions out there. Fortunately for us, they have a generous free plan that will allow us to do what we want for almost nothing.

If you don't have one already, you need to create a Heroku account. You also need to download and install the Heroku client.

You now have to move your Python script to its own folder. Don't forget to add a requirements.txt file in it, with the following content:

requests
twilio

This is to ensure that Heroku downloads the correct dependencies.

cd into this folder and just do a heroku create --app <app name>.

If you go on your app dashboard you'll see your new app.

We now need to initialize a git repo and push the code on Heroku:

git init
heroku git:remote -a <app name>
git add .
git commit -am 'Deploy breakthrough script'
git push heroku master

Your app is now on Heroku, but it is not doing anything. Since this little script can't accept HTTP requests, going to <app name>.herokuapp.com won't do anything. But that should not be a problem.

To have this script running 24/7 we need to use a simple Heroku add-on called "Heroku Scheduler". To install this add-on, click on the "Configure Add-ons" button on your app dashboard.

Then, on the search bar, look for Heroku Scheduler:

Click on the result, and click on "Provision"

If you go back to your App dashboard, you'll see the add-on:

Click on the "Heroku Scheduler" link to configure a job. Then click on "Create Job". Here select "10 minutes", and for run command select python <name_of_your_script>.py. Click on "Save job".

While everything we used so far on Heroku is free, the Heroku Scheduler will run the job on the $25/month instance, but prorated to the second. Since this script approximately takes 3 seconds to run, for this script to run every 10 minutes you should just have to spend 12 cents a month.
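
That estimate roughly checks out: 6 runs an hour, 24 hours a day, 30 days a month is about 4,320 runs, or around 13,000 seconds of compute. That is 0.5% of the ~2,592,000 seconds in a month, and 0.5% of $25 is about 12 cents.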

Ideas for improvements

I hope you liked this project and that you had fun putting it into place. In less than 30 lines of code, we did a lot, but this whole thing is far from perfect. Here are a few ideas to improve it:

  • Send yourself more information about the current streaming (game played, number of viewers ...)
  • Send yourself the duration of the last stream once the twitcher goes offline
  • Don't send yourself a text, but rather an email
  • Monitor multiple twitchers at the same time

Do not hesitate to tell me in the comments if you have more ideas.

Conclusion

I hope that you liked this post and that you learned things reading it. I truly believe that this kind of project is one of the best ways to learn new tools and concepts, I recently launched a web scraping API where I learned a lot while making it.

Please tell me in the comments if you liked this format and if you want to do more.

I have many other ideas, and I hope you will like them. Do not hesitate to share what other things you build with this snippet, possibilities are endless.

Happy Coding.

Pierre

Don't want to miss my next post:

You can subscribe here to my newsletter.

Python Tutorial: Image processing with Python (Using OpenCV)

In this tutorial, you will learn how you can process images in Python using the OpenCV library.

OpenCV is a free open source library used in real-time image processing. It’s used to process images, videos, and even live streams, but in this tutorial, we will process images only as a first step. Before getting started, let’s install OpenCV.

Install OpenCV

To install OpenCV on your system, run the following pip command:

 pip install opencv-python

Now OpenCV is installed successfully and we are ready. Let’s have some fun with some images!

Rotate an Image

First of all, import the cv2 module.

 import cv2

Now to read the image, use the imread() method of the cv2 module, specify the path to the image in the arguments and store the image in a variable as below:

 img = cv2.imread("pyimg.jpg")

The image is now treated as a matrix of rows and columns of values, stored in img.

Actually, if you check the type of img, it will give you the following result:

>>> print(type(img))
 
<class 'numpy.ndarray'>

It's a NumPy array! That's why image processing using OpenCV is so easy. All the time you are working with a NumPy array.

To display the image, you can use the imshow() method of cv2.

cv2.imshow('Original Image', img) 
 
cv2.waitKey(0)

The waitKey() function takes time in milliseconds as an argument, which is the delay before the window closes. Here we set the time to zero to show the window forever until we close it manually.

To rotate this image, you need the width and the height of the image because you will use them in the rotation process as you will see later.

 height, width = img.shape[0:2]

The shape attribute returns the height and width of the image matrix. If you print img.shape[0:2], you will have the following output:

Okay, now we have our image matrix and we want to get the rotation matrix. To get the rotation matrix, we use the getRotationMatrix2D() method of cv2. The syntax of getRotationMatrix2D() is:

 cv2.getRotationMatrix2D(center, angle, scale)

Here center is the center point of rotation, angle is the rotation angle in degrees, and scale is the scaling factor, which can make the image fit on the screen.

To get the rotation matrix of our image, the code will be:

 rotationMatrix = cv2.getRotationMatrix2D((width/2, height/2), 90, .5)

The next step is to rotate our image with the help of the rotation matrix.

To rotate the image, we have a cv2 method named warpAffine which takes the original image, the rotation matrix of the image and the width and height of the image as arguments.

 rotatedImage = cv2.warpAffine(img, rotationMatrix, (width, height))

The rotated image is stored in the rotatedImage matrix. To show the image, use imshow() as below:

cv2.imshow('Rotated Image', rotatedImage)
 
cv2.waitKey(0)

After running the above lines of code, you will have the following output:

Crop an Image

First, we need to import the cv2 module and read the image and extract the width and height of the image:

import cv2
 
img = cv2.imread("pyimg.jpg")
 
height, width = img.shape[0:2]

Now get the starting and ending index of the row and column. This will define the size of the newly created image. For example, starting from row number 10 till row number 15 gives the height of the image.

Similarly, starting from column number 10 until column number 15 gives the width of the image.

You can get the starting point by specifying the percentage value of the total height and the total width. Similarly, to get the ending point of the cropped image, specify the percentage values as below:

startRow = int(height*.15)
 
startCol = int(width*.15)
 
endRow = int(height*.85)
 
endCol = int(width*.85)

Now map these values to the original image. Note that you have to cast the starting and ending values to integers because when mapping, the indexes are always integers.

 croppedImage = img[startRow:endRow, startCol:endCol]

Here we specified the range from starting to ending of rows and columns.

Now display the original and cropped image in the output:

cv2.imshow('Original Image', img)
 
cv2.imshow('Cropped Image', croppedImage)
 
cv2.waitKey(0)

The result will be as follows:

Resize an Image

To resize an image, you can use the resize() method of OpenCV. In the resize method, you can either specify the values of the x and y axis or the number of rows and columns which tells the size of the image.

Import and read the image:

import cv2
 
img = cv2.imread("pyimg.jpg")

Now using the resize method with axis values:

newImg = cv2.resize(img, (0,0), fx=0.75, fy=0.75)
 
cv2.imshow('Resized Image', newImg)
 
cv2.waitKey(0)

The result will be as follows:

Now using the row and column values to resize the image:

newImg = cv2.resize(img, (550, 350))
 
cv2.imshow('Resized Image', newImg)
 
cv2.waitKey(0)

We say we want 550 columns (the width) and 350 rows (the height).

The result will be:

Adjust Image Contrast

In the Python OpenCV module, there is no particular function to adjust image contrast, but the official documentation of OpenCV suggests an equation that can adjust both image brightness and image contrast at the same time.

 new_img = a * original_img + b

Here a is alpha, which defines the contrast of the image. If a is greater than 1, there will be higher contrast.

If the value of a is between 0 and 1 (smaller than 1 but greater than 0), there would be lower contrast. If a is 1, there will be no contrast effect on the image.

b stands for beta. The values of b vary from -127 to +127.

To implement this equation in Python OpenCV, you can use the addWeighted() method. We use the addWeighted() method as it generates the output in the range of 0 and 255 for a 24-bit color image.

The syntax of addWeighted() method is as follows:

 cv2.addWeighted(source_img1, alpha1, source_img2, alpha2, beta)

This syntax blends two images: the first source image (source_img1) with a weight of alpha1, and the second source image (source_img2) with a weight of alpha2, with beta added to the result.

If you only want to apply contrast in one image, you can add a second image source as zeros using NumPy.

Let’s work on a simple example. Import the following modules:

import cv2
 
import numpy as np

Read the original image:

 img = cv2.imread("pyimg.jpg")

Now apply the contrast. Since there is no second image, we will use np.zeros, which will create an array of the same shape and data type as the original image, filled with zeros.

contrast_img = cv2.addWeighted(img, 2.5, np.zeros(img.shape, img.dtype), 0, 0)
 
cv2.imshow('Original Image', img)
 
cv2.imshow('Contrast Image', contrast_img)
 
cv2.waitKey(0)

In the above code, the brightness is set to 0 as we only want to apply contrast.

The comparison of the original and contrast image is as follows:

Make an image blurry

Gaussian Blur

To make an image blurry, you can use the GaussianBlur() method of OpenCV.

The GaussianBlur() method uses a Gaussian kernel. The height and width of the kernel should be positive and odd.

Then you have to specify the standard deviation in the X and Y directions, that is, sigmaX and sigmaY respectively. If only one is specified, both are considered the same.

Consider the following example:

import cv2
 
img = cv2.imread("pyimg.jpg")
 
blur_image = cv2.GaussianBlur(img, (7,7), 0)
 
cv2.imshow('Original Image', img)
 
cv2.imshow('Blur Image', blur_image)
 
cv2.waitKey(0)

In the above snippet, the actual image is passed to GaussianBlur() along with height and width of the kernel and the X and Y directions.

The comparison of the original and blurry image is as follows:

Median Blur

In median blurring, the median of all the pixels inside the kernel area is calculated, and the central pixel is replaced with this median value. Median blurring is used when there is salt-and-pepper noise in the image.

To apply median blurring, you can use the medianBlur() method of OpenCV.

Consider the following example where we have a salt and pepper noise in the image:

import cv2
 
img = cv2.imread("pynoise.png")
 
blur_image = cv2.medianBlur(img,5)

This applies a median blur with a kernel size of 5 to remove the noise. Now show the images:

cv2.imshow('Original Image', img)
 
cv2.imshow('Blur Image', blur_image)
 
cv2.waitKey(0)

The result will be like the following:

Another comparison of the original image and after blurring:

Detect Edges

To detect the edges in an image, you can use the Canny() method of cv2 which implements the Canny edge detector. The Canny edge detector is also known as the optimal detector.

The syntax of Canny() is as follows:

 cv2.Canny(image, minVal, maxVal)

Here minVal and maxVal are the minimum and maximum intensity gradient values respectively.

Consider the following code:

import cv2
 
img = cv2.imread("pyimg.jpg")
 
edge_img = cv2.Canny(img,100,200)
 
cv2.imshow("Detected Edges", edge_img)
 
cv2.waitKey(0)

The output will be the following:

Here is the result of the above code on another image:

Convert image to grayscale (Black & White)

The easy way to get a grayscale image is to load the image as one, by passing 0 as the second argument to imread():

 img = cv2.imread("pyimg.jpg", 0)

There is another method using BGR2GRAY.

To convert a color image into a grayscale image, use the BGR2GRAY attribute of the cv2 module. This is demonstrated in the example below:

Import the cv2 module:

 import cv2

Read the image:

 img = cv2.imread("pyimg.jpg")

Use the cvtColor() method of the cv2 module which takes the original image and the COLOR_BGR2GRAY attribute as an argument. Store the resultant image in a variable:

 gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Display the original and grayscale images:

cv2.imshow("Original Image", img)
 
cv2.imshow("Gray Scale Image", gray_img)
 
cv2.waitKey(0)

The output will be as follows:

Centroid (Center of blob) detection

To find the center of an image, the first step is to convert the original image into grayscale. We can use the cvtColor() method of cv2 as we did before.

This is demonstrated in the following code:

import cv2
 
img = cv2.imread("py.jpg")
 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

We read the image and convert it to a grayscale image. The new image is stored in gray_img.

Now we have to calculate the moments of the image. Use the moments() method of cv2. In the moments() method, the grayscale image will be passed as below:

 moment = cv2.moments(gray_img)

We can now derive the centroid from the moments: the X coordinate is m10/m00 and the Y coordinate is m01/m00. To highlight this center position, we can use the circle() method, which will create a circle at the given coordinates with the given radius.
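In code (a short sketch; it assumes m00 is non-zero, i.e., the image is not completely black):

X = int(moment["m10"] / moment["m00"])
 
Y = int(moment["m01"] / moment["m00"])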

The circle() method takes the img, the x and y coordinates where the circle will be created, the size, the color that we want the circle to be and the thickness.

 cv2.circle(img, (X, Y), 15, (205, 114, 101), 1)

The circle is created on the image.

cv2.imshow("Center of the Image", img)
 
cv2.waitKey(0)

The original image is:

After detecting the center, our image will be as follows:

Apply a mask for a colored image

Image masking means applying another image as a mask on the original image, or changing the pixel values in the image.

To apply a mask on the image, we will use the HoughCircles() method of the OpenCV module. The HoughCircles() method detects the circles in an image. After detecting the circles, we can simply apply a mask on these circles.

The HoughCircles() method takes the original image, the detection method (cv2.HOUGH_GRADIENT, which uses the gradient information at the edges of the circles), and parameters derived from the following circle equation:

 (x - x_center)^2 + (y - y_center)^2 = r^2

In this equation, (x_center, y_center) is the center of the circle and r is the radius of the circle.

Our original image is:

After detecting circles in the image, the result will be:

Okay, so we have the circles in the image and we can apply the mask. Consider the following code:

import cv2
 
import numpy as np
 
img1 = cv2.imread('pyimg.jpg')
 
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)

Detecting the circles in the image using the HoughCircles() code from OpenCV: Hough Circle Transform:

gray_img = cv2.medianBlur(cv2.cvtColor(img1, cv2.COLOR_RGB2GRAY), 3)
 
circles = cv2.HoughCircles(gray_img, cv2.HOUGH_GRADIENT, 1, 20, param1=50, param2=50, minRadius=0, maxRadius=0)
 
circles = np.uint16(np.around(circles))

To create the mask, use np.full which will return a NumPy array of given shape:

masking = np.full((img1.shape[0], img1.shape[1]), 0, dtype=np.uint8)
 
for j in circles[0, :]:
 
    cv2.circle(masking, (j[0], j[1]), j[2], (255, 255, 255), -1)

The next step is to combine the image and the masking array we created using the bitwise_or operator as follows:

 final_img = cv2.bitwise_or(img1, img1, mask=masking)

Display the resultant image:
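cv2.imshow("Masked Image", final_img)
 
cv2.waitKey(0)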

Extracting text from Image (OCR)

To extract text from an image, you can use Google Tesseract-OCR. You can download it from this link

Then you should install the pytesseract module which is a Python wrapper for Tesseract-OCR.
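For example, via pip:

 pip install pytesseract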

The image from which we will extract the text from is as follows:

Now let’s convert the text in this image to a string of characters and display the text as a string on output:

Import the pytesseract module:

 import pytesseract

Set the path of the Tesseract-OCR executable file:

 pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract'

Now use the image_to_string method to convert the image into a string:

 print(pytesseract.image_to_string('pytext.png'))

The output will be as follows:

Works like a charm!

Detect and correct text skew

In this section, we will correct the text skew.

The original image is as follows:

Import the modules cv2, NumPy and read the image:

import cv2
 
import numpy as np
 
img = cv2.imread("pytext1.png")

Convert the image into a grayscale image:

 gray_img=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Invert the grayscale image using bitwise_not:

 gray_img=cv2.bitwise_not(gray_img)

Select the x and y coordinates of the pixels greater than zero by using the column_stack method of NumPy:

 coordinates = np.column_stack(np.where(gray_img > 0))

Now we have to calculate the skew angle. We will use the minAreaRect() method of cv2, which returns an angle in the range -90 to 0 degrees (where 0 is not included). Note that newer OpenCV releases changed this angle convention, so the correction below assumes the classic range.

 ang=cv2.minAreaRect(coordinates)[-1]

The rotated angle of the text region will be stored in the ang variable. Now we add a condition for the angle: if the text region's angle is smaller than -45, we add 90 degrees to it; otherwise, we negate the angle to make it positive.

if ang<-45:
 
    ang=-(90+ang)
 
else:
 
    ang=-ang

Calculate the center of the text region:

height, width = img.shape[:2]
 
center_img = (width / 2, height / 2)

Now that we have the angle of the text skew, we will apply getRotationMatrix2D() to get the rotation matrix, and then we will use the warpAffine() method to rotate the image (explained earlier).

rotationMatrix = cv2.getRotationMatrix2D(center_img, ang, 1.0)
 
rotated_img = cv2.warpAffine(img, rotationMatrix, (width, height), borderMode = cv2.BORDER_REFLECT)

Display the rotated image:

cv2.imshow("Rotated Image", rotated_img)
 
cv2.waitKey(0)

Color Detection

Let’s detect the green color from an image:

Import the modules cv2 for images and NumPy for image arrays:

import cv2
 
import numpy as np

Read the image and convert it into HSV using cvtColor():

img = cv2.imread("pydetect.png")
 
hsv_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

Display the image:

 cv2.imshow("HSV Image", hsv_img)

Now create a NumPy array for the lower green values and the upper green values:

lower_green = np.array([34, 177, 76])
 
upper_green = np.array([255, 255, 255])

Use the inRange() method of cv2 to check whether each pixel of the given image lies between the lower and upper boundary arrays:

 masking = cv2.inRange(hsv_img, lower_green, upper_green)

This will detect the green color.
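The resulting mask is white where green was detected. To extract just the green regions from the original image, you can combine the mask with the image using bitwise_and (a small sketch; green_parts is just an illustrative name):

 green_parts = cv2.bitwise_and(img, img, mask=masking)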

Finally, display the original and resultant images:

 cv2.imshow("Original Image", img)

cv2.imshow("Green Color detection", masking)
 
cv2.waitKey(0)

Reduce Noise

To reduce noise from an image, OpenCV provides the following methods:

  1. fastNlMeansDenoising(): Removes noise from a grayscale image
  2. fastNlMeansDenoisingColored(): Removes noise from a colored image
  3. fastNlMeansDenoisingMulti(): Removes noise from grayscale image frames (a grayscale video)
  4. fastNlMeansDenoisingColoredMulti(): Same as 3 but works with colored frames

Let’s use fastNlMeansDenoisingColored() in our example:

Import the cv2 module and read the image:

import cv2
 
img = cv2.imread("pyn1.png")

Apply the denoising function, which takes the following arguments:

  1. the original image (src)
  2. the destination (None here, since we store the returned result instead)
  3. the filter strength for luminance
  4. the filter strength for the color components (usually equal to the filter strength or 10)
  5. the template patch size in pixels used to compute weights, which should always be odd (recommended size 7)
  6. the window size in pixels used to compute a weighted average for the given pixel, which should also be odd (recommended size 21)

 result = cv2.fastNlMeansDenoisingColored(img,None,20,10,7,21)

Display original and denoised image:

cv2.imshow("Original Image", img)
 
cv2.imshow("Denoised Image", result)
 
cv2.waitKey(0)

The output will be:

Get image contour

Contours are curves joining all the continuous points along a boundary that share the same color or intensity. Contours are useful for detecting objects in an image.

The original image of which we are getting the contours of is given below:

Consider the following code where we used the findContours() method to find the contours in the image:

Import cv2 module:

 import cv2

Read the image and convert it to a grayscale image:

img = cv2.imread('py1.jpg')
 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Find the threshold:

 retval, thresh = cv2.threshold(gray_img, 127, 255, 0)

Use the findContours() method, which takes the image (we passed the threshold result here), a contour retrieval mode and a contour approximation method. See the official findContours() documentation. The call below assumes OpenCV 4, where findContours() returns the contours and their hierarchy.

 img_contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Draw the contours on the image using drawContours() method:

  cv2.drawContours(img, img_contours, -1, (0, 255, 0))

Display the image:

cv2.imshow('Image Contours', img)
 
cv2.waitKey(0)

The result will be:

Remove Background from an image

To remove the background from an image, we will find the contours to detect the edges of the main object, create a mask for the background with np.zeros, and then combine the mask and the image using the bitwise_and operator.

Consider the example below:

Import the modules (NumPy and cv2):

import cv2
 
import numpy as np

Read the image and convert the image into a grayscale image:

img = cv2.imread("py.jpg")
 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

Find the threshold:

 _, thresh = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

In the threshold() method, the last argument defines the style of the threshold. See the official documentation of OpenCV threshold().

Find the image contours:

 img_contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]

Sort the contours:

img_contours = sorted(img_contours, key=cv2.contourArea)
 
for i in img_contours:
 
    if cv2.contourArea(i) > 100:
 
        break

Generate the mask using np.zeros:

 mask = np.zeros(img.shape[:2], np.uint8)

Draw contours:

 cv2.drawContours(mask, [i],-1, 255, -1)

Apply the bitwise_and operator:

 new_img = cv2.bitwise_and(img, img, mask=mask)

Display the original image:

 cv2.imshow("Original Image", img)

Display the resultant image:

cv2.imshow("Image with background removed", new_img)
 
cv2.waitKey(0)

Image processing is fun when using OpenCV, as you saw. I hope you found the tutorial useful. Keep coming back.

Thank you.

Introduction to Python Hex() Function for Beginners

Python hex() function is used to convert any integer number ( in base 10) to the corresponding hexadecimal number. Notably, the given input should be in base 10. Python hex function is one of the built-in functions in Python3, which is used to convert an integer number into its corresponding hexadecimal form.

Python Hex Example

The hex() function converts the integer to the corresponding hexadecimal number in string form and returns it.

The input integer argument can be written in any base literal, such as binary (0b) or octal (0o); Python stores it as a plain integer, and hex() takes care of converting it to hexadecimal format.
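For instance, binary and octal literals are just plain integers under the hood (a quick sketch):

# binary and octal literals are ordinary integers
print(hex(0b1010))  # prints 0xa
print(hex(0o17))    # prints 0xf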

Syntax

hex(number)

number: It is an integer that will be converted into hexadecimal value.
This function converts the number into the hexadecimal form, and then it returns that hexadecimal number in a string format.

Please note that the return value always starts with ‘0x’ (without quotes), which indicates that the number is in hexadecimal format.

# app.py

print("Enter the number: ")

# taking input from user
num = int(input())

# converting the number into hexadecimal form
h1 = hex(num)

# Printing hexadecimal form
print("The ", num, " in hexadecimal is: ", h1)

# Converting float number to hexadecimal form
print("\nEnter a float number")
num2 = float(input())

# converting into hexadecimal form
# for float we have to use float.hex() here
h2 = float.hex(num2)

# printing result
print("The ", num2, " in hexadecimal is: ", h2)

In the above example, we used the Python input() function to take the input from the user.

See the output.

Enter the number:
541
The  541  in hexadecimal is:  0x21d
    
Enter a float number
123.54
The  123.54  in hexadecimal is:  0x1.ee28f5c28f5c3p+6

Python hex() without 0x

See the following program.

# app.py

print("Enter the number: ")

# taking input from user
num = int(input())

# converting the number into hexadecimal form
h1 = hex(num)

# Printing hexadecimal form
# we have used string slicing here
print("The ", num, " in hexadecimal is: ", h1[2:])

# Converting float number to hexadecimal form
print("\nEnter a float number")
num2 = float(input())

# converting into hexadecimal form
h2 = float.hex(num2)

# printing result
print("The ", num2, " in hexadecimal is: ", h2[2:])

See the output.

Enter the number:
541
The  541  in hexadecimal is:  21d

Enter a float number
123.65
The  123.65  in hexadecimal is:  1.ee9999999999ap+6

In the above program, we used string slicing to print the result without ‘0x’.

We sliced from index 2 to the end of the string, i.e., h1[2:], which prints the characters from position 2 to the end and skips the leading ‘0x’.
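Note that the slicing approach breaks for negative numbers: hex(-26) returns '-0x1a', so slicing off the first two characters removes the sign instead of the prefix. A safer alternative (a small sketch) is the built-in format():

# format() renders an integer in hex without the 0x prefix
print(format(541, 'x'))   # 21d
print(format(-26, 'x'))   # -1a
print(f"{541:x}")         # f-string equivalent: 21d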

Hexadecimal representation of float in Python

See the following program.

# app.py

numberEL = 11.21
print(numberEL, 'in hex =', float.hex(numberEL))

numberK = 19.21
print(numberK, 'in hex =', float.hex(numberK))

See the output.

➜  pyt python3 app.py
11.21 in hex = 0x1.66b851eb851ecp+3
19.21 in hex = 0x1.335c28f5c28f6p+4
➜  pyt

Python hex() with object

See the following code.

# app.py

class AI:
    rank = 0  # default rank; hex() reads this via __index__()

    def __index__(self):
        # hex() calls __index__() to obtain an integer from the object
        print('__index__() function called')
        return self.rank


stockfish = AI()
stockfish.rank = 2900

print(hex(stockfish))

In the above example, we defined the __index__() dunder method, which hex() calls to obtain an integer from the object.

See the output.

➜  pyt python3 app.py
__index__() function called
0xb54
➜  pyt

How to convert hex string to int in Python

Without the 0x prefix, you need to specify the base explicitly. Otherwise, it won’t work.

See the following code.

# app.py

data = int("0xa", 16)
print(data)

With the 0x prefix and a base of 0, Python can infer the base from the prefix automatically.

You must specify 0 as the base to invoke this prefix-guessing behavior; omitting the second parameter means assuming base 10, in which case a prefixed string like "0xa" raises a ValueError.
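A quick sketch of this prefix-guessing behavior:

# base 0 tells int() to infer the base from the prefix
print(int("0xa", 0))    # 10
print(int("0b101", 0))  # 5
print(int("10", 0))     # 10 (no prefix, so decimal)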
If you want to convert the string to an int, pass the string to int() along with the base you are converting from. Both strings, with and without the prefix, will suffice for conversion in this way.

# app.py

hexStrA = "0xffff"
hexStrB = "ffff"

print(int(hexStrA, 16))
print(int(hexStrB, 16))

See the output.

➜  pyt python3 app.py
65535
65535
➜  pyt

In all the above examples, we have used the Python int() function.

Thanks for reading!

Top 10 Books To Learn Python

This video on 'Top 10 Books To Learn Python' will suggest to you what we think are the best books for Python, even if you are an experienced programmer or a complete beginner. Below are the topics covered in this video:

Why Python?

  • Beginner-Level Python Books
  • Domain-Specific Python Books
  • Bonus Python Book

Links for the Python Books:

  1. Learning Python by Mark Lutz: http://bit.ly/2BR38aY
  2. Python Crash Course by Eric Matthews: http://bit.ly/2BLlJ8i
  3. Think Python by Allen Downey: http://bit.ly/2pjoXNC
  4. Python Programming by John M Zelle: http://bit.ly/31SkYon
  5. Python in a Nutshell by Alex Martelli: http://bit.ly/32UOyeh
  6. Programming Python Mark Lutz: https://amzn.to/31Slhj1
  7. Effective Computation in Physics by Anthony Scopatz, Kathryn D. Huff: http://bit.ly/2BPD00c
  8. Python for Data Analysis by Wes McKinney: http://bit.ly/2pWCaMo
  9. Python Machine Learning by Sebastian Raschka and Vahid Mirjalili: https://amzn.to/36amOV3
  10. Django for Beginners by William S. Vincent: https://amzn.to/36lQtuG