Evil David

Introducing Policy As Code: The Open Policy Agent (OPA)

What Is OPA?

OPA is an open-source project, started in 2016, that aims to unify policy enforcement across different technologies and systems. Today, OPA is used by giant players in the tech industry. For example, Netflix uses OPA to control access to its internal API resources. Chef uses it to provide IAM capabilities in their end-user products. Many other companies, such as Cloudflare and Pinterest, use OPA to enforce policies on their platforms (like Kubernetes clusters). Currently, OPA is part of the CNCF as an incubating project.

What Does OPA Bring To The Table?

You may be wondering: how did OPA come about? What problems does it try to solve? Indeed, policy enforcement over APIs and microservices is as old as microservices themselves. There’s never been a production-grade application that didn’t enforce access control and authorization of some kind. To understand the role of OPA, consider the following use case: your company sells laptops through an online portal. Like other similar applications, the portal consists of a front page where clients see the latest offerings and perhaps some limited-time promotions. If customers want to buy something, they need to log in or create an account. Next, they issue payments through their credit cards or other methods. To keep your clients coming back, you invite them to sign up for your newsletter, which may contain special discounts. They may also opt in to browser notifications that fire as soon as new products are announced. A very typical online shopping app, right? Now, let’s depict that workflow in a diagram to visualize the process:

[Diagram: the online store’s microservices architecture]

The diagram above shows how our system might look internally. We have a number of microservices that communicate with each other to serve our customers. Now, obviously, a customer like Bob shouldn’t see any of the internal workings of the system. For example, he can’t view (or even know about) the S3 bucket where payments get archived, or which services the notification API can talk to. But what about John? He’s one of our application developers, and he needs access to all the microservices to be able to troubleshoot and debug when issues occur. Or does he? What if he accidentally (or intentionally) made an API call to the database service and changed a customer’s delivery address? Even worse, what if he had read permissions on customers’ credit card numbers? To address those risks, we place an authorization control on top of each of our microservices. The control checks whether the authenticated user has the required privileges to perform the requested operation. Such an authorization system may be internal and home-grown, or external, as provided by AWS IAM. That’s how a typical microservices application is built and secured. But look at the drawbacks of using several assorted authorization systems, especially as the application grows:

  • Modifying existing policies, or introducing new ones, is a nightmare. Just think of how many places you’ll need to visit to give Alice read access to all storage-related systems. This means S3, MySQL, MongoDB, and perhaps an external API to name a few.
  • There’s no way for developers to enforce policies on their own systems. They can obviously hardcode their authorization logic in the application, but that makes things even worse: trying to unify policies among different microservices becomes highly complicated.
  • Adding to the previous point, introducing a new policy to local services may require changing the code and, thus, releasing new versions of all the microservices.
  • What if you want to integrate policies with an existing user database? For example, integrating with the HR database.
  • We’ll need to visualize the policy to ensure that it’s doing what it’s supposed to do. This becomes of increasing importance as your policies get more complex.
  • Modern systems comprise multiple technologies and services which are written in different languages. For example, you may have the core of your system running on Kubernetes, and a bunch of legacy APIs that are not part of the cluster written in Java, Ruby, and PHP. Each platform has its own authorization mechanism.

Let’s look at Kubernetes as an example. If all users were authorized to access the entire cluster, lots of nasty things could happen, such as:

  • Letting pods run with unbounded resource requests and limits, which can exhaust node capacity and cause random pods to get evicted from the nodes.
  • Pulling and using untested, haphazard images that may contain security vulnerabilities or malicious content.
  • Using Ingress controllers without TLS, allowing unencrypted, unsecured traffic to the application.
  • Numerous other unforeseen risks due to the overall complexity.

You can definitely use RBAC and Pod Security Policies to impose fine-grained control over the cluster. But again, this applies only to the cluster: Kubernetes RBAC is of no use outside a Kubernetes cluster.

That’s where the Open Policy Agent (OPA) comes into play. OPA was introduced to create a unified method of enforcing security policy across the entire stack.

How Does OPA Work?

Earlier, we explored policy-enforcement strategies and the problems OPA tries to solve; that covered the “what.” Now, let’s take a look at the “how.”

Let’s say that you’re implementing the Payments service of our example application. This service is responsible for handling customer payments. It exposes an API that accepts payments from customers and also lets a user query which payments were made by a specific customer. So, to obtain an array containing the purchases made by Jane, one of the company’s customers, you send a GET request to the API with the path /payment/jane. You provide your credentials in the Authorization header and send the request. The response is a JSON array with the data you requested. However, since you don’t want just anyone with network access to the Payments API to see such sensitive data, you need to enforce an authorization policy.
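
To make the request concrete, here is a minimal sketch in Python using the requests library; the host name and the token placeholder are illustrative assumptions, not details from the example application:

Python

import requests

# Hypothetical call to the Payments API; the host name is made up
# for illustration, and <token> stands in for real credentials.
response = requests.get(
    "https://payments.example.com/payment/jane",
    headers={"Authorization": "Bearer <token>"},
)
print(response.json())  # a JSON array of Jane's payments

OPA addresses the issue in the following way (a sketch of step 1 follows the list):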

  1. The Payments API queries OPA for a decision. It accompanies this query with some attributes like the HTTP method used in the request, the path, the user, and so on.
  2. OPA validates those attributes against data already provided to it.
  3. After validation, OPA sends a decision back to the requesting API: either allow or deny.
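
Here is a minimal sketch of step 1, again in Python with the requests library, showing how the Payments API might query a locally running OPA over its REST Data API. The policy package path httpapi/authz and the attribute names are illustrative assumptions:

Python

import requests

# Attributes describing the incoming request, wrapped in OPA's "input" document.
input_doc = {
    "input": {
        "method": "GET",
        "path": ["payment", "jane"],
        "user": "jane",
    }
}

# OPA listens on port 8181 by default; the package path below is made up.
decision = requests.post(
    "http://localhost:8181/v1/data/httpapi/authz/allow",
    json=input_doc,
).json()

# We expect a JSON document like {"result": true} or {"result": false}.
allowed = decision.get("result") is True
print("allow" if allowed else "deny")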

The important thing to notice here is that OPA decouples our policy decision from policy enforcement. The OPA workflow can be depicted in the following diagram:

[Diagram: the OPA decision workflow]

OPA is a general-purpose, domain-agnostic policy enforcement tool. It can be integrated with APIs, the Linux SSH daemon, an object store like Ceph, and so on. OPA’s designers purposely avoided coupling it to any particular project or platform. Accordingly, the policy query and decision do not follow a specific format: you can use any valid JSON data as request attributes, as long as it provides the required data. Similarly, the policy decision coming from OPA can be any valid JSON data. You choose what goes in and what comes out. For example, you can opt to have OPA return a true or false JSON value, a number, a string, or even a complex data object.
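
For example, both of the following are decisions OPA could hand back, shown here as Python dictionaries; the field names in the richer one are hypothetical, chosen by whoever writes the policy:

Python

# A bare boolean decision:
simple_decision = {"result": True}

# A richer decision object is equally valid; the "allow" and "reason"
# fields are made-up names for illustration:
detailed_decision = {
    "result": {
        "allow": False,
        "reason": "user may not read other customers' payments",
    }
}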

#kubernetes #microservices #devops


Improving Kubernetes Security with Open Policy Agent (OPA)

Many multinational organizations now run their applications on microservice architectures inside their cloud environments, and administrators are responsible for defining multiple policies on those environments. These giant IT organizations have extensive infrastructure, and their systems often ship with their own policy modules or built-in authorization systems. That can solve the policy problem at enterprise scale (especially if you have the investment and resources to ensure best-practice implementation), but the overall ecosystem ends up fragmented: if you want to improve control and visibility over who can do what across the stack, you face a lot of complexity.

Why We Need OPA

Enforcing policy manually is a problem of the past. It does not work in today’s modern environments, where everything is dynamic and ephemeral, the technology stack is heterogeneous, and every development team may use a different language. So, the question is: how do you gain granular control over policies and automate and streamline their implementation? The answer is the Open Policy Agent (OPA).

OPA provides technology that helps unify policy enforcement across a wide range of software and empowers administrators with more control over their systems. Such policies are incredibly helpful for maintaining security, compliance, and standardization across environments, where we need to define and enforce policies in a declarative way.

#blog #kubernetes #security #kubernetes open policy agent #opa #open policy agent #policy enforcement #policy implementation

Myriam Rogahn

GitHub Arctic Code Vault: Overview

Are you an Arctic Code Vault Contributor, or have you seen someone posting about it and don’t know what it is? Let’s take a look at what an Arctic Code Vault Contributor is and who earns this badge.

GitHub, the world’s largest open-source platform for software and programs, has safely locked data of huge value and magnitude in a coal mine in the Norwegian town of Longyearbyen, in the Arctic region.

Back in November 2019, the GitHub Arctic Code Vault was first announced.

The GitHub Arctic Code Vault is a data repository preserved in the Arctic World Archive (AWA), a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. The archive is located in a decommissioned coal mine in the Svalbard archipelago, closer to the North Pole than the Arctic Circle.

Last year, GitHub said that it plans to capture a snapshot of every active public repository on 02/02/2020 and preserve that data in the Arctic Code Vault.

The project began on February 2, when the firm took a snapshot of all of GitHub’s active public repositories to store them in the vault. GitHub initially intended to travel to Norway and personally escort the world’s open-source technology to the Arctic, but those plans were derailed by the global pandemic. The team then had to wait until July 8 for the Arctic Code Vault data to be deposited.

GitHub announced that the code was successfully deposited in the Arctic Code Vault on July 8, 2020. Over the past several months, GitHub worked with its archive partner Piql to write the 21TB of GitHub repository data to 186 reels of piqlFilm (digital photosensitive archival film).

GitHub’s strategic software director, Julia Metcalf, has written a blog post on the company’s website announcing the completion of GitHub’s Archive Program on July 8th. Discussing the objective of the Archive Program, Metcalf wrote, “Our mission is to preserve open-source software for future generations by storing your code in an archive built to last a thousand years.”

The Arctic Code Vault is only a small part of the wider GitHub Archive Program, however, which sees the company partner with the Long Now Foundation, Internet Archive, Software Heritage Foundation, Microsoft Research, and others.

How will the cold storage last 1,000 years?

Svalbard has been regulated by the international Svalbard Treaty as a demilitarized zone. Home to the world’s northernmost town, it is one of the most remote and geopolitically stable human habitations on Earth.

The AWA is a joint initiative between Norwegian state-owned mining company Store Norske Spitsbergen Kulkompani (SNSK) and very-long-term digital preservation provider Piql AS. AWA is devoted to archival storage in perpetuity. The film reels will be stored in a steel-walled container inside a sealed chamber within a decommissioned coal mine on the remote archipelago of Svalbard. The AWA already preserves historical and cultural data from Italy, Brazil, Norway, the Vatican, and many others.

What’s in the 02/02/2020 snapshot?

The 02/02/2020 snapshot archived in the GitHub Arctic Code Vault will sweep up every active public GitHub repository, in addition to significant dormant repos.

The snapshot will include every repo with any commits between the announcement at GitHub Universe on November 13th and 02/02/2020, every repo with at least 1 star and any commits from the year before the snapshot (02/03/2019 – 02/02/2020), and every repo with at least 250 stars.

The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100KB in size—depending on available space, repos with more stars may retain binaries. Each repository will be packaged as a single TAR file. For greater data density and integrity, most of the data will be stored QR-encoded and compressed. A human-readable index and guide will itemize the location of each repository and explain how to recover the data.

The company further shared that every reel of the archive includes a copy of the “Guide to the GitHub Code Vault” in five languages, written with input from GitHub’s community and available at the Archive Program’s own GitHub repository.

#github #open-source #coding #open-source-contribution #contributing-to-open-source #github-arctic-code-vault #arctic-code-vault #arctic-code-vault-contributor

Tyrique Littel

Static Code Analysis: What It Is? How to Use It?

Static code analysis refers to the technique of approximating the runtime behavior of a program. In other words, it is the process of predicting the output of a program without actually executing it.

Lately, however, the term “Static Code Analysis” is more commonly used to refer to one of the applications of this technique rather than the technique itself: program comprehension, i.e., understanding the program and detecting issues in it (anything from syntax errors to type mismatches, performance hogs, likely bugs, security loopholes, etc.). This is the usage we’ll be referring to throughout this post.

“The refinement of techniques for the prompt discovery of error serves as well as any other as a hallmark of what we mean by science.”

  • J. Robert Oppenheimer

Outline

We cover a lot of ground in this post. The aim is to build an understanding of static code analysis and to equip you with the basic theory, and the right tools so that you can write analyzers on your own.

We start our journey by laying down the essential parts of the pipeline that a compiler follows to understand what a piece of code does. We learn where to tap points in this pipeline to plug in our analyzers and extract meaningful information. In the latter half, we get our feet wet and write four such static analyzers, completely from scratch, in Python.

Note that although the ideas here are discussed in light of Python, static code analyzers across all programming languages are carved out along similar lines. We chose Python because of the availability of the easy-to-use ast module, and the wide adoption of the language itself.
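
As a small taste of that module, here is a minimal sketch that parses a one-line program into an abstract syntax tree and prints its structure (the indent argument to ast.dump requires Python 3.9 or later):

Python

import ast

# Parse a tiny program into an AST and pretty-print the tree.
tree = ast.parse("answer = 40 + 2")
print(ast.dump(tree, indent=2))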

How does it all work?

Before a computer can finally “understand” and execute a piece of code, it goes through a series of complicated transformations:

[Diagram: the static analysis workflow]

As you can see in the diagram (go ahead, zoom it!), the static analyzers feed on the output of these stages. To be able to better understand the static analysis techniques, let’s look at each of these steps in some more detail:

Scanning

The first thing that a compiler does when trying to understand a piece of code is to break it down into smaller chunks, also known as tokens. Tokens are akin to what words are in a language.

A token might consist of a single character, like (, a literal (like integers and strings, e.g., 7, 'Bob', etc.), or a reserved keyword of that language (e.g., def in Python). Characters which do not contribute towards the semantics of a program, like trailing whitespace, comments, etc., are often discarded by the scanner.

Python provides the tokenize module in its standard library to let you play around with tokens:

Python

import io
import tokenize

code = b"color = input('Enter your favourite color: ')"

for token in tokenize.tokenize(io.BytesIO(code).readline):
    print(token)

Output:

TokenInfo(type=62 (ENCODING),  string='utf-8')
TokenInfo(type=1  (NAME),      string='color')
TokenInfo(type=54 (OP),        string='=')
TokenInfo(type=1  (NAME),      string='input')
TokenInfo(type=54 (OP),        string='(')
TokenInfo(type=3  (STRING),    string="'Enter your favourite color: '")
TokenInfo(type=54 (OP),        string=')')
TokenInfo(type=4  (NEWLINE),   string='')
TokenInfo(type=0  (ENDMARKER), string='')

(Note that for the sake of readability, I’ve omitted a few columns from the result above — metadata like starting index, ending index, a copy of the line on which a token occurs, etc.)

#code quality #code review #static analysis #static code analysis #code analysis #static analysis tools #code review tips #static code analyzer #static code analysis tool #static analyzer

Panmure Anho

How to Integrate Open Policy Agent (OPA) with Kubernetes

In this article, we’re going to explore how we can integrate OPA with Kubernetes and see some examples of the power that this integration can bring to policy enforcement in your environment.

OPA is deployed to Kubernetes as an admission controller. If you’re not familiar with admission controllers, let’s spend a few moments discussing their role.

#kubernetes #devops #opas #open-policy-agent

Justice Reilly

Enforce Pod Security Policies In Kubernetes Using OPA

We’re going to demonstrate how you can enforce the most fine-grained security policies using OPA.

#devops #kubernetes #k8s #opa #open policy agent #policies