Introducing Policy As Code: The Open Policy Agent (OPA)

What Is OPA?

OPA is an open-source project, started in 2016, that aims to unify policy enforcement across different technologies and systems. Today, it is used by major players in the tech industry. For example, Netflix uses OPA to control access to its internal API resources, and Chef uses it to provide IAM capabilities in its end-user products. Many other companies, such as Cloudflare and Pinterest, use OPA to enforce policies on their platforms (Kubernetes clusters, for instance). OPA is hosted by the CNCF, where it was an incubating project before graduating in 2021.

What Does OPA Bring To The Table?

You may be wondering: how did OPA come about, and what problems does it try to solve? Policy enforcement over APIs and microservices is as old as microservices themselves: every production-grade application enforces access control and authorization of some kind.

To understand the role of OPA, consider the following use case. Your company sells laptops through an online portal. Like similar applications, the portal has a front page where customers see the latest offerings and perhaps some limited-time promotions. To buy something, customers log in or create an account, then pay by credit card or another method. To keep customers coming back, you invite them to sign up for your newsletter, which may contain special discounts, and they can opt in to browser notifications when new products are announced. A very typical online shopping app, right? Now, let’s depict that workflow in a diagram:

[Diagram: the microservices behind the example online shopping application]

The diagram above shows how our system might look internally. We have a number of microservices that communicate with each other to serve our customers. Obviously, Bob, one of our customers, shouldn’t see any of the internal workings of the system. For example, he can’t view (or even know about) the S3 bucket where payments get archived, or which services the notification API can talk to.

But what about John? He’s one of our application developers, and he needs access to all the microservices to troubleshoot and debug when issues occur. Or does he? What if he accidentally (or intentionally) makes an API call to the database service that changes a customer’s delivery address? Even worse, what if he has read access to customers’ credit card numbers?

To address those risks, we place an authorization control in front of each microservice. The control checks whether the authenticated user has the privileges required to perform the requested operation. Such an authorization system may be an internal, home-grown process or an external one, such as AWS IAM. That’s how a typical microservices application is built and secured. But look at the drawbacks of using several assorted authorization systems, especially as the application grows:

Modifying existing policies, or introducing new ones, is a nightmare. Just think of how many places you’d need to touch to give Alice read access to all storage-related systems: S3, MySQL, MongoDB, and perhaps an external API, to name a few.
Developers have no uniform way to enforce policies on their own systems. They can hardcode authorization logic in the application, but that makes things worse: unifying policies across different microservices becomes highly complicated.
Adding to the previous point, introducing a new policy to local services may require changing the code and, thus, releasing new versions of all the affected microservices.
What if you want to integrate policies with an existing user database, for example the HR database?
You need to be able to visualize a policy to ensure that it’s doing what it’s supposed to do. This becomes increasingly important as your policies get more complex.
Modern systems comprise multiple technologies and services written in different languages. For example, the core of your system may run on Kubernetes, alongside a bunch of legacy APIs, written in Java, Ruby, and PHP, that are not part of the cluster. Each platform has its own authorization mechanism.
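
The first few drawbacks are easier to see in code. The sketch below is purely hypothetical (the service names, user sets, and function names are invented for illustration): each service carries its own hardcoded rule, so granting Alice read access means editing and redeploying every one of them.

```python
# Hypothetical hardcoded rules, each living in a different service's codebase.
S3_READERS = {"john"}
MYSQL_READERS = {"john"}
MONGODB_READERS = {"john"}


def can_read(service_readers: set, user: str) -> bool:
    """The check each service performs against its own private list."""
    return user in service_readers


def services_to_change(user: str) -> list:
    """List every service whose code must change to grant `user` read access."""
    rules = {"s3": S3_READERS, "mysql": MYSQL_READERS, "mongodb": MONGODB_READERS}
    return [name for name, readers in rules.items() if not can_read(readers, user)]
```

Calling `services_to_change("alice")` names all three services, which is exactly the maintenance burden described above.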

Let’s look at Kubernetes as an example. If every user were authorized to access the entire cluster, lots of nasty things could happen, such as:

Leaving resource requests and limits unrestricted for all pods, which may cause random pods to get evicted from the nodes.
Pulling and using untested, haphazard images that may contain security vulnerabilities or malicious content.
Using Ingress controllers without TLS, allowing unencrypted, unsecured traffic to the application.
Numerous other unforeseen risks due to the overall complexity.
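
To make these risks concrete, here is a rough sketch of the kind of rules a cluster policy might encode, written as plain Python over a pod manifest that has already been parsed into a dictionary. The trusted registry name and the `pod_violations` function are assumptions made for illustration; in practice such rules would normally be written in a policy language and enforced when the pod is admitted to the cluster, not in application code.

```python
from typing import List

# Assumption: images are only trusted when pulled from this hypothetical registry.
TRUSTED_REGISTRY = "registry.example.com/"


def pod_violations(pod: dict) -> List[str]:
    """Return human-readable policy violations for a pod manifest."""
    violations = []
    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        # Containers without limits can starve other pods on the same node.
        if not container.get("resources", {}).get("limits"):
            violations.append(f"{name}: no resource limits set")
        # Untested images from arbitrary registries may carry vulnerabilities.
        if not container.get("image", "").startswith(TRUSTED_REGISTRY):
            violations.append(f"{name}: image is not from the trusted registry")
    return violations
```

An empty list means the manifest passes both checks; anything else is a reason to reject it.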

You can definitely use RBAC and Pod Security Policies (removed in Kubernetes 1.25 in favor of Pod Security Admission) to impose fine-grained control over the cluster. But again, this only applies to the cluster: Kubernetes RBAC is of no use outside a Kubernetes cluster.

That’s where Open Policy Agent (OPA) comes into play. OPA was introduced to create a unified method of enforcing security policy across the stack.

How Does OPA Work?

Earlier, we explored policy-enforcement strategies and the problems OPA tries to solve; that was the “what.” Now, let’s take a look at the “how.”

Let’s say that you’re implementing the Payments service of our example application. This service handles customer payments: it exposes an API that accepts payments from customers and lets a caller query the payments made by a specific customer. So, to obtain an array of the purchases made by Jane, one of the company’s customers, you send a GET request to the API with the path /payment/jane, providing your credentials in the Authorization header. The response is a JSON array with the data you requested. However, since you don’t want just anyone with network access to the Payments API to see such sensitive data, you need to enforce an authorization policy. OPA addresses the issue in the following way:

The Payments API queries OPA for a decision. It accompanies this query with some attributes like the HTTP method used in the request, the path, the user, and so on.
OPA evaluates those attributes against the policies and data already provided to it.
After evaluation, OPA returns a decision, either allow or deny, to the requesting API.
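
The three steps above can be sketched from the Payments API’s side. This is a minimal sketch under stated assumptions, not the article’s implementation: it assumes an OPA server listening on localhost:8181 and a policy published under the hypothetical package path `payments/authz` with a rule named `allow`. The OPA-specific parts are the Data API convention of POSTing `{"input": ...}` to `/v1/data/<path>` and reading the decision from the `result` field of the response.

```python
import json
import urllib.request

# Assumption: OPA runs locally on port 8181, and the policy lives in a
# hypothetical package "payments.authz" that defines a rule named "allow".
OPA_URL = "http://localhost:8181/v1/data/payments/authz/allow"


def build_query(method: str, path: str, user: str) -> dict:
    """Step 1: package the request attributes as OPA's input document."""
    # Split "/payment/jane" into ["payment", "jane"] so a policy can match segments.
    segments = path.strip("/").split("/")
    return {"input": {"method": method, "path": segments, "user": user}}


def is_allowed(response_body: dict) -> bool:
    """Step 3: read the decision; a missing "result" (undefined rule) means deny."""
    return response_body.get("result", False) is True


def ask_opa(method: str, path: str, user: str) -> bool:
    """Send the query to OPA (step 2 happens inside OPA) and return the decision."""
    request = urllib.request.Request(
        OPA_URL,
        data=json.dumps(build_query(method, path, user)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=2) as response:
        return is_allowed(json.load(response))
```

The Payments API would call something like `ask_opa("GET", "/payment/jane", "jane")` before returning any data, and refuse the request whenever the result is false. Treating anything other than an explicit `true` as a denial keeps the service failing closed.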

The important thing to notice here is that OPA decouples our policy decision from policy enforcement: OPA makes the decision, while the Payments API remains responsible for enforcing it. The OPA workflow can be depicted in the following diagram:

[Diagram: the OPA query and decision workflow]

OPA is a general-purpose, domain-agnostic policy engine. It can be integrated with APIs, the Linux SSH daemon, an object store like Ceph, and so on. OPA’s designers purposefully avoided basing it on any other project. Accordingly, the policy query and decision do not follow a specific format. That is, you can use any valid JSON data as request attributes, as long as it provides the required data. Similarly, the policy decision coming from OPA can be any valid JSON data. You choose what goes in and what comes out. For example, you can have OPA return a JSON boolean (true or false), a number, a string, or even a complex data object.
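
Because the decision can be any JSON value, the enforcing service has to know which shape to expect. The sketch below assumes two hypothetical decision shapes, a bare boolean and an object with `allow` and `reasons` fields. Neither is mandated by OPA; they are simply conventions a policy author might choose, and the function name is invented for illustration.

```python
from typing import Any, List, Tuple


def interpret_decision(decision: Any) -> Tuple[bool, List[str]]:
    """Normalize a policy decision into (allowed, reasons).

    Assumed shapes (hypothetical conventions, not an OPA requirement):
      - a JSON boolean:  true
      - a JSON object:   {"allow": false, "reasons": ["..."]}
    Anything else is treated as a denial, so unexpected output fails closed.
    """
    if isinstance(decision, bool):
        return decision, []
    if isinstance(decision, dict):
        allowed = decision.get("allow") is True
        reasons = [str(reason) for reason in decision.get("reasons", [])]
        return allowed, reasons
    return False, ["unrecognized decision shape"]
```

A richer decision object lets the service explain a denial to the caller, at the cost of coupling the service to that particular shape.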

magalix.com
